Dishani Lahiri

I am a second-year MS in Computer Vision (MSCV) student in the Robotics Institute at Carnegie Mellon University. I work on computer vision, natural language processing, and machine learning. At CMU, I work specifically on 3D reconstruction, scene understanding, and fine-tuning large language models for personalized, domain-specific use cases. I am currently advised by Prof. Kris Kitani and am building a low-power visual-inertial odometry system for Aria AR glasses that also works reliably in unseen environments.

During my summer internship at Slingshot AI, I worked in a very fast-paced environment with high code-quality standards, which enriched my research, software engineering, and product skills. I worked on optimizing the fine-tuning of personalized text-to-image models (you can see my results on the home page), improving the results of generative aging models, and fine-tuning LLaMA2-7B for personalized text style transfer (paper coming soon). I developed an interest in diffusion models and currently aim to work on text-to-video models.

Previously, I worked on impactful and profitable projects at Samsung R&D Institute, Bangalore. At Samsung, I was a key innovator in the development and deployment of AI Night mode in the Samsung flagship series and the Expert RAW application.

I completed my undergraduate studies in ECE at DTU in 2019. My Bachelor's thesis on a Neural Caption Generator was advised by Prof. S. Indu, ex-Head of Department, ECE, DTU. Owing to my interest in human activity recognition, I also worked with Prof. D.K. Vishwakarma.

Email / CV / Bio / Google Scholar / LinkedIn / Github

Projects & Publications

I'm interested in computer vision, natural language processing, and machine learning, especially in building personalized multi-modal solutions for edge devices.

S2RF: Semantically Stylized Radiance Fields
Dishani Lahiri*, Neeraj Panse*, Moneish Kumar*
ICCV, 2023 Workshop on AI for 3D Content Creation
paper | code | webpage
We present our method for transferring style from one or more arbitrary images to objects within a 3D scene. Our primary objective is to offer more control in 3D scene stylization, facilitating the creation of customizable, stylized scene images from arbitrary viewpoints. To achieve this, we propose a novel approach that incorporates a nearest neighborhood-based loss, allowing for flexible 3D scene reconstruction while effectively capturing intricate style details and ensuring multi-view consistency.

Abnormal human action recognition using average energy images
Dishani Lahiri*, Chhavi Dhiman, Dinesh Kumar Vishwakarma
IEEE, 2017 Conference on Information and Communication Technology (CICT)
paper
We propose a solution for detecting abnormal human actions in images, using Histogram of Oriented Gradients (HoG) as the feature descriptor, Principal Component Analysis (PCA) for dimensionality reduction, and a Support Vector Machine (SVM) for classification. We also release a dataset of abnormal human activities covering fainting, headache, and chest pain.

Teaching Experience

Awards and Recognition