Hai Pham

Ph.D. Student, Language Technologies Institute

School of Computer Science, Carnegie Mellon University

I have graduated and joined the wonderful and super talented team at Reka AI. We are hiring the best to join us!

I was fortunate to be advised by Prof. Barnabás Póczos and Prof. David Woodruff. I have broad interests in Machine Learning and Deep Learning, with theory and applications in Natural Language Processing and Computer Vision. I am also interested in Optimization, Large-Scale Machine Learning systems, and Numerical Methods.

Prior to starting my Ph.D., I received my Master's in Language Technologies in the same department. Before that, I graduated in Computer Science from Auburn University with a focus on Big Data and Distributed Systems.

I worked as a Research Intern at Boeing in 2017 and at Microsoft in Summer 2021 and 2022, and as an Applied Scientist Intern at Amazon AWS in Fall 2022.

Email: hai [at] reka [dot] ai

Preprints

Publications

            @article{hpham23thesis,
                title={Towards Efficient and Scalable Representation Learning},
                author={Pham, Hai},
                journal={Ph.D. Thesis, School of Computer Science, Carnegie Mellon University},
                year={2023},
                url={Thesis_final.pdf},
            }

            @inproceedings{woodruff2021optimal,
                title={Optimal Sketching for Trace Estimation},
                author={Jiang, Shuli and Pham, Hai and Woodruff, David P and Zhang, Richard},
                booktitle={Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS) (SPOTLIGHT)},
                year={2021},
                url={https://arxiv.org/abs/2111.00664.pdf},
                code={https://github.com/11hifish/OptSketchTraceEst}
            }

            @inproceedings{lyu2021styleptb,
                title={StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer},
                author={Lyu*, Yiwei and Liang*, Paul Pu and Pham*, Hai and Hovy, Eduard and Póczos, Barnabás and Salakhutdinov, Ruslan and Morency, Louis-Philippe},
                booktitle={Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
                year={2021},
                url={https://arxiv.org/pdf/2104.05196.pdf},
                code={https://github.com/lvyiwei1/StylePTB/}
            }


            @inproceedings{pham2020robust,
              title={Robust Handwriting Recognition with Limited and Noisy Data},
              author={Pham, Hai and Setlur, Amrith and Dingliwal, Saket and Lin, Tzu-Hsiang and Póczos, Barnabás and Huang, Kang and Li, Zhuo and Lim, Jae and McCormack, Collin and Vu, Tam},
              booktitle={2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)},
              pages={301--306},
              year={2020},
              organization={IEEE},
              url={https://arxiv.org/pdf/2008.08148.pdf}
            }

            @inproceedings{hoang2020revisiting,
                title={Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes},
                author={Hoang, Quang Minh and Hoang, Trong Nghia and Pham, Hai and Woodruff, David P},
                booktitle={Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS)},
                year={2020},
                url={https://arxiv.org/pdf/2011.08432.pdf},
                code={https://github.com/hqminh/gp_sketch_nips}
            }

            @inproceedings{pham2019found,
                title={Found in translation: Learning robust joint representations by cyclic translations between modalities},
                author={Pham*, Hai and Liang*, Paul Pu and Manzini, Thomas and Morency, Louis-Philippe and Póczos, Barnabás},
                booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
                volume={33},
                number={01},
                pages={6892--6899},
                year={2019},
                url={https://arxiv.org/pdf/1812.07809.pdf},
                code={https://github.com/hainow/MCTN}
            }

            @inproceedings{pham2018seq2seq2sentiment,
                title={Seq2seq2sentiment: Multimodal sequence to sequence models for sentiment analysis},
                author={Pham, Hai and Manzini, Thomas and Liang, Paul Pu and Póczos, Barnabás},
                booktitle={Proceedings of Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML)},
                year={2018},
                url={https://arxiv.org/pdf/1807.03915.pdf}
            }

            @inproceedings{zhou2015sfmapreduce,
                title={Sfmapreduce: An optimized mapreduce framework for small files},
                author={Zhou, Fang and Pham, Hai and Yue, Jianhui and Zou, Hao and Yu, Weikuan},
                booktitle={2015 IEEE International Conference on Networking, Architecture and Storage (NAS)},
                pages={23--32},
                year={2015},
                organization={IEEE},
                url={https://www.cs.fsu.edu/~yuw/pubs/2015-NAS-Yu.pdf}
            }

            @article{pham2016assessment,
                title={Assessment of Multiple Ingest Strategies for Accumulo Key-Value Store},
                author={Pham, Hai},
                year={2016},
                journal={Master's Thesis, Computer Science, Auburn University},
                url={https://etd.auburn.edu/bitstream/handle/10415/5135/hpham%20-%20Grad%20Thesis%20-%20final.pdf?sequence=2&isAllowed=y}
            }

Teaching