Joseph Spisak - Meta | LinkedIn

Licenses & Certifications

Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization.

deeplearning.ai

Publications

The 1st IEEE International Conference on Big Data Computing Service and Applications March 15, 2015

With the pervasiveness of MapReduce, one of the most prominent programming models for data parallelism in Apache Hadoop, many researchers and developers have spent tremendous effort attempting to boost the computational speed and energy efficiency of MapReduce-based big data processing. However, the scalable and fault-tolerant nature of MapReduce introduces additional costs in disk I/O and data transfer, caused by streaming intermediate outputs to disk. In light of these issues, many interesting research projects have been initiated with the goal of improving the compute speed and power efficiency of compute-intensive cloud computing workloads, several with the addition of discrete GPUs. In this work, we present a modified MapReduce approach that focuses on the iterative clustering algorithms in the Apache Mahout machine learning library and leverages the acceleration potential of the Intel integrated GPU in a multi-node cluster environment. The accelerated framework shows varying levels of speed-up (≈45x for Map tasks only, ≈4.37x for the entire K-means clustering) as evaluated using the HiBench benchmark suite. Based on various experiments and in-depth analysis, we find that utilizing the integrated GPU via OpenCL offers significant performance and power efficiency gains over the original CPU-based approach. Further analysis is also done to understand the correlations between compute, I/O, and power efficiency. As such, our results show that embracing the integrated GPU in the Hadoop MapReduce framework represents a promising advance in adding cost- and energy-efficient compute parallelism to a data-parallel multi-node environment.

Honors & Awards

Frost and Sullivan

Mar 2015
http://ww2.frost.com/news/press-releases/frost-sullivan-lauds-intels-technology-innovation-video-encoding-and-transcoding-media-servers-and-data-center/

General Electric Inc.

1999
Completed 3 rotations through engineering, marketing and manufacturing.

Languages

Limited working proficiency

Native or bilingual proficiency
