Hao (Peter) Yu

Photo: New York City, December 2023

Hello, I’m Peter Yu, an MSc (Thesis) student at McGill University and Mila, supervised by Prof. David Ifeoluwa Adelani on multilingual language processing and low-resource languages. I am also collaborating with Shiwei Tong at Tencent, working on RAG and diffusion models for time series. Starting from my undergraduate studies, I was supervised by Prof. Reihaneh Rabbany on misinformation detection with RAG, and I continue to work with her group as a collaborator.

Currently, my focus is on advancing retrieval systems that adapt to human feedback. This research addresses critical challenges in current AI systems, specifically model staleness and knowledge conflicts, through unified knowledge embedding and preference-optimized knowledge distillation. Looking ahead, I aspire to develop AI systems that continuously learn and evolve by integrating human preferences and expertise, drawing inspiration from systems like Google Search, which leverages user engagement as a quality signal.
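
As a toy illustration of what "adapting retrieval to human feedback" can mean, the sketch below re-ranks retrieved passages by blending a retriever's similarity score with a logged engagement signal (clicks). It is a hypothetical example of the general idea under assumed names and weights, not a description of my actual method.

    # Toy re-ranking sketch: blend retrieval similarity with a logged engagement
    # signal (clicks), so documents users found helpful rank higher over time.
    # The blending weight and the smoothed click-through estimate are illustrative choices.
    from dataclasses import dataclass

    @dataclass
    class Doc:
        doc_id: str
        similarity: float   # score from the retriever, assumed in [0, 1]
        impressions: int    # times the document was shown
        clicks: int         # times users engaged with it

    def engagement_prior(doc: Doc, alpha: float = 1.0, beta: float = 4.0) -> float:
        """Smoothed click-through rate; the Beta prior keeps cold-start docs reasonable."""
        return (doc.clicks + alpha) / (doc.impressions + alpha + beta)

    def rerank(docs: list[Doc], feedback_weight: float = 0.3) -> list[Doc]:
        """Order documents by a convex blend of similarity and engagement."""
        def score(d: Doc) -> float:
            return (1 - feedback_weight) * d.similarity + feedback_weight * engagement_prior(d)
        return sorted(docs, key=score, reverse=True)

    docs = [
        Doc("fresh-but-unclicked", similarity=0.82, impressions=3, clicks=0),
        Doc("stale-but-trusted", similarity=0.78, impressions=500, clicks=210),
    ]
    print([d.doc_id for d in rerank(docs)])  # engagement pushes the trusted doc first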

Furthermore, I aim to expand beyond textual knowledge to encompass action spaces and emotional speech, transitioning from learning from humans to augmenting human capabilities. Ultimately, my goal is to develop meaningful, useful, and industry-ready products that create lasting impact.

Actively seeking Ph.D./industry opportunities in AI/NLP/ML.

Posters and Slides

🏸🏓⛰️📷

Resume: PDF

Motto: 脚踏实地 行稳致远 (Stay grounded and steady, and you will go far)

Publications

  1. Evaluation of Retrieval-Augmented Generation: A Survey
    Jun 2024
    Retrieval-Augmented Generation (RAG) has recently gained traction in natural language processing. Numerous studies and real-world applications are leveraging its ability to enhance generative models through external information retrieval. Evaluating these RAG systems, however, poses unique challenges due to their hybrid structure and reliance on dynamic knowledge sources. To better understand these challenges, we conduct A Unified Evaluation Process of RAG (Auepora) and aim to provide a comprehensive overview of the evaluation and benchmarks of RAG systems. Specifically, we examine and compare several quantifiable metrics of the Retrieval and Generation components, such as relevance, accuracy, and faithfulness, within the current RAG benchmarks, encompassing the possible output and ground truth pairs. We then analyze the various datasets and metrics, discuss the limitations of current benchmarks, and suggest potential directions to advance the field of RAG benchmarks.
  2. Web Retrieval Agents for Evidence-Based Misinformation Detection
    Jacob-Junqi Tian, Hao Yu, Yury Orlovskiy, Mauricio Rivera, Zachary Yang, Jean-François Godbout, Reihaneh Rabbany, and Kellin Pelrine
    Jun 2024
    This paper develops an agent-based automated fact-checking approach for detecting misinformation. We demonstrate that combining a powerful LLM agent, which does not have access to the internet for searches, with an online web search agent yields better results than when each tool is used independently. Our approach is robust across multiple models, outperforming alternatives and increasing the macro F1 of misinformation detection by as much as 20 percent compared to LLMs without search. We also conduct extensive analyses on the sources our system leverages and their biases, decisions in the construction of the system like the search tool and the knowledge base, the type of evidence needed and its impact on the results, and other parts of the overall process. By combining strong performance with in-depth understanding, we hope to provide building blocks for future search-enabled misinformation mitigation systems.
  3. SWEET - Weakly Supervised Person Name Extraction for Fighting Human Trafficking
    Javin Liu*, Hao Yu*, Vidya Sujaya*, Pratheeksha Nair, Kellin Pelrine, and Reihaneh Rabbany
    In Findings of the Association for Computational Linguistics: EMNLP 2023, Dec 2023
    In this work, we propose a weak supervision pipeline SWEET: Supervise Weakly for Entity Extraction to fight Trafficking for extracting person names from noisy escort advertisements. Our method combines the simplicity of rule-matching (through antirules, i.e., negated rules) and the generalizability of large language models fine-tuned on benchmark, domain-specific, and synthetic datasets, treating them as weak labels. One of the major challenges in this domain is limited labeled data. SWEET addresses this by obtaining multiple weak labels through labeling functions and effectively aggregating them. SWEET outperforms the previous supervised SOTA method for this task by 9% F1 score on domain data and better generalizes to common benchmark datasets. We also release HTGEN, a synthetically generated dataset of escort advertisements (built using ChatGPT) to facilitate further research within the community.
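
The weak-supervision recipe behind SWEET, several noisy labeling functions whose votes are combined into a single label, can be made concrete with a small sketch. The Python snippet below illustrates only the general pattern (hypothetical heuristic rules, a tiny gazetteer standing in for LLM-derived weak labels, and plain majority voting); it is not the SWEET pipeline or its actual labeling functions.

    # Toy weak-supervision sketch: noisy labeling functions vote on whether a token
    # looks like a person name, and a majority vote aggregates the non-abstentions.
    # All rules and word lists here are hypothetical illustrations, not SWEET's.
    from collections import Counter

    ABSTAIN, NOT_NAME, NAME = -1, 0, 1

    def lf_capitalized(token: str) -> int:
        """Weak heuristic: capitalized alphabetic tokens might be names."""
        return NAME if token[:1].isupper() and token.isalpha() else ABSTAIN

    def lf_antirule_common_words(token: str) -> int:
        """Antirule (negated rule): common words are labeled as not names."""
        common = {"new", "city", "call", "text", "available", "now", "the"}
        return NOT_NAME if token.lower() in common else ABSTAIN

    def lf_name_gazetteer(token: str) -> int:
        """Tiny gazetteer standing in for an LLM-based weak labeler."""
        first_names = {"anna", "maria", "sofia", "jess"}
        return NAME if token.lower() in first_names else ABSTAIN

    LABELING_FUNCTIONS = [lf_capitalized, lf_antirule_common_words, lf_name_gazetteer]

    def aggregate(token: str) -> int:
        """Majority vote over the labeling functions that did not abstain."""
        votes = [v for v in (lf(token) for lf in LABELING_FUNCTIONS) if v != ABSTAIN]
        return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN

    if __name__ == "__main__":
        ad = "New in the city Anna available now text Jess"
        for tok in ad.split():
            print(f"{tok:>10} -> {aggregate(tok)}")

In practice the aggregation step would be learned (for example, by modeling the accuracy of each labeling function) rather than a plain majority vote, which is where pipelines like SWEET go beyond individual rules.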