Jun-Yan Zhu's Homepage
I am the Michael B. Donohue Assistant Professor of Computer Science and Robotics in the School of Computer Science at Carnegie Mellon University. I also hold affiliated faculty appointments in the Computer Science Department and the Machine Learning Department. I study computer vision, graphics, computational photography, and generative models.
Prior to joining CMU, I was a Research Scientist at Adobe Research. I did a postdoc at MIT CSAIL, working with William T. Freeman, Josh Tenenbaum, and Antonio Torralba. I obtained my Ph.D. from UC Berkeley, under the supervision of Alexei A. Efros. I received my B.E. from Tsinghua University, working with Zhuowen Tu, Shi-Min Hu, and Eric Chang.
Code & Events
- Copyright and compensation issues in generative models: GenDataAttribution and concept-ablation (see CMU News and the Quartz article).
- Check out the SVDQuant repo for 4-bit diffusion models with faster inference and lower memory usage. It supports FLUX, FLUX.1-Tools, and img2img-turbo.
- E-latent LPIPS code has been released. It computes the perceptual loss between two latents for many models (e.g., FLUX, SD 1.5/2.1/XL/3).
- Check out the img2img-turbo repo for pix2pix-turbo and CycleGAN-turbo: one-step image translation for both paired and unpaired settings.
- Modelverse: a platform that helps everyone share, discover, and study generative models more easily.
- Image-to-image translation repos: CycleGAN-and-pix2pix, pix2pixHD, BicycleGAN, vid2vid, GauGAN/SPADE, CUT.
- Image editing with diffusion models: SDEdit (used in Stable Diffusion Image-to-Image; see the sketch after this list), pix2pix-zero, and Rich-Text-to-Image.
- Model customization and editing: concept-ablation, custom-diffusion, domain-expansion, model-rewriting, GANSketching, and GANWarping.
- Image editing repos and demos: iGAN (GAN inversion), GANPaint, pix2latent, sam_inversion, SwappingAutoencoder, and interactive-deep-colorization.
- Neural tactile synthesis: Tactile DreamFusion, visual-tactile-synthesis, VisGel, and scalable tactile glove.
- GANs training and evaluation libraries: Vision-aided GANs (pip install vision-aided-loss), DiffAugment, and clean-fid (pip install clean-fid; see the example after this list).
- Synthetic data for computer vision: dataset-distillation, mtt-distillation, GLaD, CyCADA, and gan-ensembling.
- 3D synthesis code: Total-Recon, BlendNeRF, pix2pix3D, Depth-supervised NeRF, Editing NeRF, Visual Object Networks, and 3D scene de-rendering.
- Network visualization tools: GANDissect, GANSeeing, and Network Dissection.
- Efficient generative models: SIGE (for SDEdit w/ Stable Diffusion and GauGAN), gan-compression (for cGANs) and anycost-gan (for GANs).
- CVPR 2025 AI for Content Creation Workshop.
- CatPapers: Cool vision, learning, and graphics papers on Cats.
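As referenced above, SDEdit edits an image by adding partial noise to the input and then denoising it with a diffusion model. Below is a minimal sketch of this idea using the Hugging Face diffusers img2img pipeline; the checkpoint id, file names, and prompt are illustrative assumptions, not part of our released code.

```python
# Illustrative SDEdit-style editing via diffusers' img2img pipeline.
# The checkpoint and file names below are placeholder assumptions.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("rough_sketch.png").convert("RGB").resize((512, 512))

# `strength` sets how much noise is added before denoising: higher values
# follow the prompt more; lower values stay closer to the input image.
edited = pipe(
    prompt="a fantasy landscape, detailed oil painting",
    image=init_image,
    strength=0.75,
).images[0]
edited.save("edited.png")
```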
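For the evaluation libraries above, here is a minimal usage example of clean-fid; the folder paths are placeholders.

```python
# Compute FID between two image folders with clean-fid (pip install clean-fid).
# The paths below are placeholder assumptions.
from cleanfid import fid

score = fid.compute_fid("path/to/real_images", "path/to/generated_images")
print(f"FID: {score:.2f}")
```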
Generative Intelligence Lab
Our lab studies the collaboration between human creators and generative models, with the goal of building intelligent machines capable of helping everyone tell their visual stories. We are studying the following questions:
- Interaction between creators and generative models: How can we help creators control the model outputs more easily? We develop algorithms and interfaces for controllable visual synthesis (e.g., images, videos, 3D, visual+tactile).
- Rewriting and searching generative models: How can creators repurpose existing models for new tasks, concepts, and styles? How can they rewrite the rules of a model? Which model should they use as a starting point?
- Co-existence of creators and generative models: How can we allow creators to opt in or out of generative models at any time? If they opt in, how do we credit creators for contributing training data?
- Synthetic data generation with generative models: How can we use generative models to produce useful data for improving computer vision and robotics systems?
Our lab is part of Carnegie Mellon Graphics Lab and Carnegie Mellon Computer Vision Group.
Former members and visitors: Sean Liu (Postdoc, now Research Scientist at Autodesk Research), Richa Mishra (MSCV, now at HeyGen), Aniruddha Mahapatra (MSCV, now at Adobe), Or Patashnik (Visiting PhD from TAU), Songwei Ge (Visiting PhD from UMD), Chonghyuk (Andrew) Song (MSR, now PhD student at MIT), Muyang Li (MSR, now PhD student at MIT), Daohan (Fred) Lu (MSCV, now PhD student at NYU), Mia Tang (Undergrad, now MS student at Stanford), Bingliang Zhang (Undergrad, now PhD student at Caltech), Rohan Agarwal (MSCV, now at Runway ML), George Cazenavette (MSR, now PhD student at MIT)
Teaching
- 16-726: Learning-based Image Synthesis (Spring 2024, Spring 2023, Spring 2022, Spring 2021)
- 16-824: Visual Learning and Recognition (Fall 2023, Fall 2022, Fall 2021)
- Deep Learning at Udacity (Co-instructor)
Software
Landscape Mixer
Photoshop 2022's Landscape Mixer can transform landscape images in various ways. This feature is based on our work Swapping Autoencoder (NeurIPS 2020). Web | Video

NVIDIA Canvas: Turn Simple Brushstrokes into Realistic Images
Download the Windows 10 app, based on our work SPADE (CVPR 2019) and the GauGAN demo (SIGGRAPH 2019). Web | Video

Photoshop Neural Filters
Photoshop 2021 introduces "Neural Filters". Several features are partly built on our work iGAN (ECCV 2016), ideepcolor (SIGGRAPH 2017), and CycleGAN (ICCV 2017). Web | Video
Selected Publications
See the full list on Google Scholar
Holistic Evaluation of Text-to-Image Models
Tony Lee*, Michihiro Yasunaga*, Chenlin Meng*, Yifan Mai, Joon Sung Park, Agrim Gupta, Yunzhi Zhang, Deepak Narayanan, Hannah Benita Teufel, Marco Bellagente, Minguk Kang, Taesung Park, Jure Leskovec, Jun-Yan Zhu, Li Fei-Fei, Jiajun Wu, Stefano Ermon, Percy Liang
NeurIPS 2023
Paper | Project | Code

State of the Art on Neural Rendering
Ayush Tewari*, Ohad Fried*, Justus Thies*, Vincent Sitzmann*, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer
Eurographics 2020 (STAR Report)
Paper | Project | BibTex | CVPR Tutorial | Eurographics Tutorial

Learning to Synthesize and Manipulate Natural Images
December 2017
ACM SIGGRAPH Outstanding Doctoral Dissertation Award. David J. Sakrison Memorial Prize for outstanding doctoral research, by the UC Berkeley EECS Dept.
Thesis | Talk | News | Cover
Talks
SIGGRAPH Dissertation Award Talk (2018)
Unpaired Image-to-Image Translation
CVPR Tutorial on GANs (2018)
Learning to Synthesize and Manipulate Natural Photos
MIT, HKUST CSE Departmental Seminar, ICCV Tutorial on GANs, O'Reilly AI, AI with the best, Y Conf, DEVIEW, ODSC West (2017)
Stanford, MIT, Facebook, CUHK, SNU (2017)
SIGGRAPH, NVIDIA Innovation Theater, Global AI Hackathon (2017)
Visual Manipulation and Synthesis on the Natural Image Manifold
Facebook, MSR, Berkeley BAIR, THU, ICML workshop "Visualization for Deep Learning" (2016)
Mirror Mirror: Crowdsourcing Better Portraits
SIGGRAPH Asia (2014)
What Makes Big Visual Data Hard?
SIGGRAPH Asia invited course "Data-Driven Visual Computing" (2014)
AverageExplorer: Interactive Exploration and Alignment of Visual Data Collections
SIGGRAPH (2014)
Past Events
- SIGGRAPH 2023 Course on Diffusion Models.
- CVPR 2022 Workshop on AI for Content Creation.
- CVPR 2022 Workshop on Sketch-Oriented Deep Learning.
- SIGGRAPH 2021 Course on the Advances in Neural Rendering.
- SIGGRAPH 2021 Workshop on Measurable Creative AI.
- CVPR 2021 Workshop on Computational Measurements of Machine Creativity.
- CVPR 2020 Tutorial on Neural Rendering.
- Eurographics 2020 STAR Tutorial on Neural Rendering.
- IJCV Special Issue on Generative Adversarial Networks for Computer Vision (2019-2020).
- ICCV 2019 Workshop on Image and Video Synthesis.
- CVPR 2019 Tutorial on Map Synchronization.
- CVPR 2018 Tutorial on Generative Adversarial Networks.
- ICCV 2017 Tutorial on Generative Adversarial Networks.
- ICML 2017 Workshop on Visualization for Deep Learning.
- SIGGRAPH Asia 2014 Course on Data-Driven Visual Computing.
Misc