Yash Bhalgat

यश भळगट | Yash Bhalgat

Final-year PhD candidate in the Visual Geometry Group (VGG), Oxford. Co-advised by Andrew Zisserman, Andrea Vedaldi, João Henriques and Iro Laina. Funded by the EPSRC+AWS fellowship with AIMS CDT.

Research Interests:
3D Computer Vision: Contrastive-Lift, Light-Touch, DIR-Net
Vision + Language: N2F2, 3D-LLMs
Efficient Machine Learning: LSQ+, QKD, StructConv, LTP

In parallel, I work as an AI consultant for an on-device AI startup and an LLM content moderation company. Previously, I was a Senior Researcher at Qualcomm AI Research. I have also been fortunate to spend time at Voxel51, IBM Research (Bangalore and Almaden Lab), IFPEN (Paris), and TCS Research.

Education:
Master's in Computer Science from the University of Michigan
Bachelor's in Electrical Engineering (CS minor) from IIT Bombay

Feel free to set up a call if you want to discuss ideas around startups, AI (CV / ML / LLMs), or want to collaborate.

Email / CV / Scholar / Github / LinkedIn / X

News

3D-Aware Instance Segmentation and Tracking in Egocentric Videos
Yash Bhalgat*, Vadim Tschernezki*, Iro Laina, João Henriques, Andrea Vedaldi, Andrew Zisserman
ACCV, 2024
We propose a 3D-aware method for object tracking in long egocentric videos, leveraging scene geometry to handle rapid motion and occlusions. Our approach improves tracking accuracy, reduces ID switches by up to 80%, and enables applications like 3D object reconstruction and amodal segmentation.

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
Yash Bhalgat, Iro Laina, João Henriques, Andrew Zisserman, Andrea Vedaldi
ECCV, 2024
We present Nested Neural Feature Fields (N2F2), a hierarchical approach to 3D scene understanding that encodes multi-scale properties in a unified feature field. Our method outperforms state-of-the-art approaches like LERF and LangSplat on open-vocabulary 3D tasks, especially for complex queries, while enabling faster inference.

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [Code]
Yash Bhalgat, Iro Laina, João Henriques, Andrew Zisserman, Andrea Vedaldi
NeurIPS, 2023 (Spotlight presentation)
We present a novel "slow-fast" contrastive fusion method to lift 2D predictions to 3D for scalable instance segmentation, achieving significant improvements without requiring an upper bound on the number of objects in the scene.

A Light Touch Approach to Teaching Transformers Multi-view Geometry
Yash Bhalgat, João Henriques, Andrew Zisserman
CVPR, 2023
An "Epipolar-guided training" method to incorporate multi-view geometric priors into Transformer models, which can be implemented in 150 lines of code. At test time, the Transformer implicitly estimates the epipolar geometry given two images and uses it for downstream predictions, e.g. for pose-invariant retrieval.

Dynamic Iterative Refinement for Efficient 3D Hand Pose Estimation
John Yang, Yash Bhalgat, Simyung Chang, Fatih Porikli, Nojun Kwak
WACV, 2022
We propose a tiny deep network in which a subset of layers is applied recursively to refine its previous estimates. During these iterative refinements, we employ learned gating criteria to decide whether to exit from the weight-sharing loop, allowing per-sample adaptation in our model. We also predict uncertainty estimates and exploit them in the gating mechanism.

Structured Convolutions for Efficient Neural Network Design
Yash Bhalgat, Yizhe Zhang, Jamie Lin, Fatih Porikli
NeurIPS, 2020
We introduce a neat trick to enable the execution of convolution operations in the form of efficient, scaled, sum-pooling components. We present a Structural Regularization loss that enables this decomposition with negligible performance loss. Our method is competitive with other tensor decomposition and structured pruning methods.

Data-driven Weight Initialization with Sylvester Solvers
Debasmit Das, Yash Bhalgat, Fatih Porikli
Practical Machine Learning for Developing Countries Workshop, ICLR, 2021
We propose a data-driven scheme to initialize the parameters of a neural network. The initialization is cast as an optimization problem, which is restructured into the well-known Sylvester equation that has fast and efficient gradient-free solutions. We show that our proposed method is especially effective in few-shot and fine-tuning settings.

LSQ+: Improving low-bit quantization through learnable offsets and better initialization
Yash Bhalgat, Jinwon Lee, Markus Nagel, Tijmen Blankevoort, Nojun Kwak
Efficient Deep Learning in Computer Vision Workshop, CVPR, 2020
We introduce a general asymmetric quantization scheme with trainable scale and offset parameters. LSQ+ shows SOTA results for EfficientNet and MixNet, outperforming LSQ for low-bit quantization.

Learned Threshold Pruning
Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort
arXiv, 2020
We propose an end-to-end differentiable method for learning layerwise pruning thresholds, which results in SOTA model compression ratios with AlexNet, ResNet and EfficientNet. Our method also generates a trail of checkpoints with different accuracy-efficiency operating points.

QKD: Quantization-aware Knowledge Distillation for Low-bit Quantization
Yash Bhalgat*, Jangho Kim*, Jinwon Lee, Chirag Patel, Nojun Kwak
arXiv, 2020
Low-bit quantization and knowledge distillation (KD) often don't go well together, but both are important approaches to reduce a model's memory footprint. We propose an effective way to combine the two and show results that outperform all existing quantization/KD approaches.

Teacher-Student Learning Paradigm for Tri-training: An Efficient Method for Unlabeled Data Exploitation
Yash Bhalgat, Zhe Liu, Pritam Gundecha, Jalal Mahmud, Amita Misra
KONVENS, 2019
Teacher-student tri-training is a semi-supervised learning method that uses three classifiers with adaptive teacher and student thresholds.

Annotation-cost Minimization for Medical Image Segmentation using Suggestive Mixed Supervision Fully Convolutional Networks
Yash Bhalgat*, Meet Shah*, Suyash Awate
Medical Imaging meets NeurIPS workshop, 2019
For medical image segmentation, we present a budget-based cost-minimization framework in a mixed-supervision setting via dense segmentations, bounding boxes, and landmarks.

CATSEYES: Categorizing Seismic structures with tessellated scattering wavelet networks
Yash Bhalgat, Jean Charlety, Laurent Duval
ICASSP, 2018
We use scattering wavelet transforms to extract sparse feature sets from seismic data. We show that using this method combined with simple PCA-based feature selection leads to promising classification performance in affordable computation time.

Invited Talks

Hall of Fame
