awinml - Overview (original) (raw)

Ashwin Mathur

AI Engineer · Agentic RAG & Reranking · LLM Fine-Tuning & RL · Domain-Specific AI

LinkedInEmail

I work on LLM systems for domain-specific applications in Finance, Bio-Medical, and Legal AI, spanning retrieval, agents and model training. I've contributed to Haystack, MTEB, HuggingFace, and scikit-learn, and co-authored MMTEB, published at ICLR 2025. Developing open-source AI at AVNLP.

Developing Open-Source AI @ AVNLP

LLM Training & RL Alignment

Repository Description
BioThink Self-Reflective Bio-Medical QA training with QLoRA + GRPO to generate structured self-reflection tokens using six reward functions; evaluated across seven metrics via LLM-as-a-Judge.
RAG Model Training Fine-tuning LLMs for Adaptive-RAG, Corrective RAG, RQ-RAG, Self-RAG, Agentic RAG, and ReZero via SFT and GRPO across finance, biomedical, and open-domain QA.
GRPO Four GRPO implementations comparing format/correctness rewards, DeepSpeed vs. PyTorch training, frozen/server/periodic reference models, and vLLM vs. Transformers rollout generation.
LLM Finetuning SFT, DPO, KTO, ORPO, PPO, and GRPO pipelines with QLoRA/LoRA/DoRA/P-Tuning/Prefix-Tuning adapter training across ARC, FactScore, TriviaQA, PopQA, Earnings Calls, and GSM8K.

Retrieval Augmented Generation and Agents

Repository Description
RAG Pipelines Domain-specific RAG pipelines combining LangGraph orchestration, BAML structured generation, Milvus Hybrid Search, 3-layer metadata enrichment, and instruction-following rerankers for Medical and Financial QA.
DSPy Optimizers DSPy RAG optimization with Weaviate Hybrid Search, Query Rewriting, Sub-Query Decomposition using MIPROv2/COPRO/BootstrapFewShot optimizers on FreshQA, HotpotQA, TriviaQA, and PubMedQA.
VectorDB Haystack and LangChain retrieval pipelines spanning Dense/Sparse/Hybrid search, Reranking, Parent-Child Retrieval, Query Enhancement, and Multi-Tenancy across Pinecone, Weaviate, Milvus, Qdrant, and Chroma.

Information Retrieval & Ranking

Repository Description
LLM Rankers LLM rankers using Pairwise, Setwise, and Listwise techniques with RankZephyr/RankLlama, Pydantic-validated structured generation, and efficient zero-shot sorting.
Pairwise Ranking Prompting Zero-shot pairwise reranking with All-Pairs, Heapsort, and Sliding-K strategies, using bidirectional comparison for position-bias mitigation and Pydantic-validated outputs.
Reciprocal Rank Fusion and LLM Rankers Hybrid retrieval combining Reciprocal Rank Fusion with Diversity, Lost-in-the-Middle, and Similarity rankers, evaluated on BEIR (NDCG, MAP, Recall, Precision).
LLM Blender LLM ensembling framework using PairRanker for cross-attention candidate ranking and GenFuser for top-K output fusion, packaged as a Haystack component.

Open-Source Contributions

Publications

MMTEB: Massive Multilingual Text Embedding Benchmark (ICLR 2025)

Largest multilingual text embedding benchmark: 500+ tasks across 250+ languages and 10 task categories. Contributed the complete LegalBench suite - 160+ legal domain classification and retrieval datasets.