Joseph Spisak - Meta | LinkedIn
Experience & Education
Meta
Licenses & Certifications
Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization.
deeplearning.ai
Publications
The 1st IEEE International Conference on Big Data Computing Service and Applications, March 15, 2015
With the pervasiveness of MapReduce, one of the most prominent programming models for data parallelism in Apache Hadoop, many researchers and developers have spent tremendous effort attempting to boost the computational speed and energy efficiency of MapReduce-based big data processing. However, the scalable and fault-tolerant nature of MapReduce introduces additional costs in disk I/O and data transfer, caused by streaming intermediate outputs to disk. In light of these issues, many interesting research projects have been initiated with the goal of improving the compute speed and power efficiency of compute-intensive cloud computing workloads, several with the addition of discrete GPUs. In this work, we present a modified MapReduce approach, focused on the iterative clustering algorithms in the Apache Mahout machine learning library, that leverages the acceleration potential of the Intel integrated GPU in a multi-node cluster environment. The accelerated framework shows varying levels of speed-up (≈45x for Map tasks only, ≈4.37x for the entire K-means clustering) as evaluated using the HiBench benchmark suite. Based on various experiments and in-depth analysis, we find that utilizing the integrated GPU via OpenCL offers significant performance and power efficiency gains over the original CPU-based approach. Further analysis is also done to understand the correlations between compute, I/O, and power efficiency. As such, our results show that embracing the integrated GPU in the Hadoop MapReduce framework represents a promising advance in adding cost- and energy-efficient compute parallelism to a data-parallel multi-node environment.
Honors & Awards
Intel’s Technology Innovation in Video Encoding and Transcoding for Media Servers and the Data Center
Frost and Sullivan
Cooperative Technical Leadership Program | GE Industrial Systems
General Electric Inc.
1999
Completed 3 rotations through engineering, marketing and manufacturing.
Languages
Japanese
Limited working proficiency
English
Native or bilingual proficiency