Kayvon Fatahalian - Stanford University
Our group creates computing systems that enable advanced computer graphics and AI applications. Recent research efforts include:
High-performance simulation of virtual environments for "AI training". Future virtual environments will be used for generating experience to train AI agents as much as, or more than, for generating frames for video games. These agents need to learn from massive amounts of experience, so we ask: how do we redesign a game engine to efficiently simulate thousands of virtual environments at a throughput of millions of frames per second? The Madrona Engine and BPS3D are experiments in this direction. We also pursue humanlike video game bots.
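A minimal sketch of the batch-simulation idea (Python with NumPy, not the actual Madrona API): keep the state of thousands of worlds in flat arrays and advance them all with one vectorized update per frame, so the learner sees batched tensors rather than thousands of separate game instances.

```python
# Conceptual sketch (not the Madrona API): step many simple worlds in
# lockstep with structure-of-arrays state, so one update touches every
# environment per frame instead of looping over per-engine objects.
import numpy as np

NUM_WORLDS = 4096          # thousands of environments simulated together
NUM_AGENTS = 8             # agents per world

# State for all worlds lives in flat arrays: [world, agent, xy]
pos = np.zeros((NUM_WORLDS, NUM_AGENTS, 2), dtype=np.float32)
vel = np.zeros_like(pos)

def step_all(actions, dt=1.0 / 60.0):
    """Advance every world by one frame with a single batched update."""
    vel[:] += actions * dt            # actions: (NUM_WORLDS, NUM_AGENTS, 2)
    pos[:] += vel * dt
    # Per-world observations go back to the learner as one tensor.
    return pos.reshape(NUM_WORLDS, -1)

# One training iteration: the RL code consumes a batch of observations.
obs = step_all(np.random.randn(NUM_WORLDS, NUM_AGENTS, 2).astype(np.float32))
```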
New applications based on analyzing big video data. What new applications are possible given access to large amounts of video? We've analyzed over a decade of Cable TV news video (250,000+ hours) to understand trends in the news, and turned broadcast professional tennis video into interactive, controllable characters that look and behave like star tennis players. (Have you seen Federer play himself at Wimbledon?) Along the way we've built systems to process and query video databases at scale, improved optimizing compilers for image processing, developed new techniques for efficient DNN inference on video, and improved machine analysis of sports video.
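For a flavor of the kinds of queries these systems run, here is a small self-contained sketch (plain Python, not our actual query API) that composes time intervals from two label streams, e.g. moments when a particular face is on screen while a keyword appears in the captions.

```python
# Minimal sketch (plain Python, not our query API): compose time intervals
# from different label streams over a video, e.g. "segments where person X
# is on screen while topic Y is mentioned in the captions."
from typing import List, Tuple

Interval = Tuple[float, float]   # (start_sec, end_sec)

def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portions of two sorted interval lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

face_on_screen = [(10.0, 95.0), (300.0, 420.0)]     # from a face detector
keyword_spoken = [(80.0, 130.0), (410.0, 440.0)]    # from caption alignment
print(intersect(face_on_screen, keyword_spoken))    # [(80.0, 95.0), (410.0, 420.0)]
```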
Human-in-the-loop, interactive AI. What can a domain expert do with an interactive supercomputer? We're interested in developing machine learning systems that put expert humans in the loop, whether that be for creative content creation workflows based on generative AI (like this or this) or for rapidly training and validating new models (here and here).
I'm always looking for students who wish to work on topics like these or who bring their own ideas.
TIPS, TALKS, AND POSTS
Here are a few tips on how to give clear research talks (or class project talks).
Experiences Building Online Events in Ohyay: I ran A LOT of virtual events during the pandemic (virtual parties, virtual classes, conferences, etc.). We've built up a lot of know-how about technical and non-technical ways to make virtual events better. This talk to the Stanford HCI group brain dumps some of my observations about video-based communication and events.
On Stanford's The Future of Everything podcast I gave takes on virtual work, virtual teaching, and why the pandemic should really light a fire under the graphics community.
Like many groups, we are interested in pushing the capabilities of machine learning. I'm particularly interested in interactive, expert-in-the-loop systems that allow human intuition to interoperate with machine learning. Here's me talking about "Keeping the Domain Expert in the Loop: Ideas to Models in Hours, Not Weeks" in Stanford's MLSys seminar (Fall 2020) and at CMU's AI Seminar (March 2021).
At the Arch 2030 workshop I talked about how visual computing applications will drive architectural innovation in the year 2030. The talk is on YouTube. An updated version of the slides is here.
I created this talk, Do Grades Matter, to challenge students to think bigger than just striving to get good grades in a bunch of hard classes.
Want to get outside but still sleep in your own bed at night? Try my favorite local Bay Area hikes.
TEACHING
- This quarter (Fall 2024) I am teaching CS149: Parallel Computing.
- CS149: Parallel Computing (Winter 2019, Fall 2019, 2020, 2021, 2022, 2023, 2024)
- CS248: Interactive Computer Graphics (2018, 2019, 2020, 2021, 2022)
- CS248A: Computer Graphics: Rendering, Geometry, and Image Manipulation (2023, 2024)
- CS348K: Visual Computing Systems (Winter 2018, Fall 2018, 2020, 2021, 2022, 2023, 2024)
- CS348B: Image Synthesis Techniques (Spring 2022)
Before moving to Stanford, I taught the following courses at CMU.
- 15-418/15-618: Parallel Computer Architecture and Programming (2012, 2013, 2014, 2015, 2016, 2017, and at Tsinghua in Summer 2017)
- 15-769: Visual Computing Systems (2013, 2014, 2016)
- 15-462/662: Computer Graphics (2015)
- 15-869: Graphics and Imaging Architectures (2011)
STUDENTS
- Dan Fu (Ph.D., co-advised with Chris Ré)
- Purvi Goel (Ph.D., co-advised with C. Karen Liu)
- Zander Majercik (Ph.D.)
- Vishnu Sarukkai (Ph.D., co-advised with Chris Ré)
- Brennan Shacklett (Ph.D.)
- William Wang (Ph.D.)
Graduated Ph.D. Students:
- David Durst (Stanford Ph.D. 2024, co-advised with Pat Hanrahan, [Thesis])
- James Hong (Stanford Ph.D. 2024, [Thesis])
- Haotian Zhang (Stanford Ph.D. 2023, [Thesis])
- Ravi Mullapudi (CMU Ph.D. 2021, co-advised with Deva Ramanan, [Thesis])
- Evan Shimizu (CMU Ph.D. 2020, [Thesis])
- Yong He (CMU Ph.D. 2018, [Thesis])
Graduated BS/MS Students:
- Chenxi Liu (CSD M.S., matriculated to Ph.D. at UBC)
- Krishna Kumar Singh (CMU M.S., matriculated to Ph.D. at UC Davis)
- Will Crichton (CMU B.S., matriculated to Stanford Ph.D., now faculty at Brown)
- Karima Ma (CMU M.S., matriculated to Ph.D. at MIT)
PUBLICATIONS
Zhiqiang Xie, Hao Kang, Ying Sheng, Tushar Krishna, Kayvon Fatahalian, Christos Kozyrakis
MLSys 2025
Purvi Goel, Haotian Zhang, C. Karen Liu, Kayvon Fatahalian
Eurographics 2025
Vishnu Sarukkai, Brennan Shacklett, Zander Majercik, Kush Bhatia, Christopher Ré, Kayvon Fatahalian
arXiv:2410.09187, Oct 2024
Luc Guy Rosenzweig, Brennan Shacklett, Warren Xia, Kayvon Fatahalian
SIGGRAPH Asia 2024
Vishnu Sarukkai, Lu Yuan, Mia Tang, Maneesh Agrawala, Kayvon Fatahalian
UIST 2024
David Durst, Feng Xi, Vishnu Sarukkai, Brennan Shacklett, Iuri Frosio, Chen Tessler, Joohwan Kim, Carly Taylor, Gilbert Bernstein, Sanjiban Choudhury, Pat Hanrahan, Kayvon Fatahalian
Symposium on Computer Animation (SCA) 2024
Purvi Goel, Kuan-Chieh Wang, C. Karen Liu, Kayvon Fatahalian
SIGGRAPH 2024
James Hong, Lu Yuan, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian
AAAI 2024
Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, Kayvon Fatahalian
Brennan Shacklett, Luc Guy Rosenzweig, Zhiqiang Xie, Bidipta Sarkar, Andrew Szot, Erik Wijmans, Vladlen Koltun, Dhruv Batra, Kayvon Fatahalian
Transactions on Graphics 2023
Haotian Zhang, Ye Yuan, Viktor Makoviychuk, Yunrong Guo, Sanja Fidler, Xue Bin Peng, Kayvon Fatahalian
Transactions on Graphics 2023
Analysis of Faces in a Decade of US Cable TV News
James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, Kayvon Fatahalian
KDD 2021
[Visit the Stanford Cable TV Analyzer website for more info.]
Analyzing Who and What Appears in a Decade of US Cable TV News
James Hong, Will Crichton, Haotian Zhang, Daniel Y. Fu, Jacob Ritchie, Jeremy Barenholtz, Ben Hannel, Xinwei Yao, Michaela Murray, Geraldine Moriba, Maneesh Agrawala, Kayvon Fatahalian
Paper on arXiv:2008.06007, Aug 2020
[Visit the Stanford Cable TV Analyzer website for more info.]
Multi-Resolution Weak Supervision for Sequential Data
Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James R. Priest, Christopher Ré
NeurIPS 2019
Rekall: Specifying Video Events using Compositions of Spatiotemporal Labels
Daniel Y. Fu, Will Crichton, James Hong, Xinwei Yao, Haotian Zhang, Anh Truong, Avanika Narayan, Maneesh Agrawala, Christopher Ré, Kayvon Fatahalian
Paper on arXiv:1910.02993, Oct 2019
Learning to Optimize Halide with Tree Search and Random Programs
Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michaël Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Frédo Durand, Jonathan Ragan-Kelley
SIGGRAPH 2019
A Closer Look at GPUs
Kayvon Fatahalian and Mike Houston
Communications of the ACM, Vol. 51, No. 10 (October 2008)
(also published as "GPUs: A Closer Look", ACM Queue, March/April 2008)
Sequoia: Programming the Memory Hierarchy
Kayvon Fatahalian, Timothy J. Knight, Mike Houston, Mattan Erez, Daniel R. Horn, Larkhoon Leem, Ji Young Park, Manman Ren, Alex Aiken, William J. Dally, Pat Hanrahan
Supercomputing 2006
OLD PROJECTS
Slang GPU Shading Language. Slang is a shading language that extends HLSL with new capabilities for building modular, extensible, and high-performance real-time shading systems. Slang is now the shading language for NVIDIA's Falcor research rendering system. See the Slang website or the SIGGRAPH 2018 paper for more.
Self-Refining Interactive Games (graphics with 100's of machines and a lot of latency)
How do we build platforms that take graphics applications from one user on a single GPU to 10,000 machines and one million users in the cloud? Even though computer graphics has always been at the vanguard of parallel computing, there has been little success using modern cloud-based computing resources to improve interactive experiences. In this project we asked the question, how could we leverage the massive storage and batch processing capabilities of the cloud to generate new forms of interactive worlds -- and we took a "precompute everything" approach to doing so. Since one cannot precompute everything about a complex interactive world, the challenge is to determine what is most important to precompute, so these parts can be presented to the user with the highest-quality graphics. We find that by recording statistics of users playing a game, we can build a model of user behavior, and then concentrate large-scale, cloud-based precomputation of graphics and physics around the states that users are most likely to encounter. The result is a self-refining game whose dynamics improve with play, ultimately providing realistically rendered, rich fluid dynamics in real time on a mobile device. For more detail, see our work applying these ideas to cloth simulation and fluid simulation.
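A hedged sketch of the core scheduling idea (toy Python, not the papers' actual systems): count state visits in logs of real play sessions and spend the offline, cloud-based precomputation budget on the states players are most likely to reach.

```python
# Toy sketch of prioritizing precomputation by observed user behavior
# (not the actual self-refining-games implementation).
from collections import Counter

def rank_states_to_precompute(play_logs, budget):
    """play_logs: iterable of state-id sequences from real play sessions.
    Returns the `budget` most frequently visited states."""
    visits = Counter()
    for session in play_logs:
        visits.update(session)
    return [state for state, _ in visits.most_common(budget)]

# Toy usage: three sessions over a discretized game state space.
logs = [["s0", "s1", "s2"], ["s0", "s1", "s3"], ["s0", "s2", "s1"]]
print(rank_states_to_precompute(logs, budget=2))   # ['s0', 's1']
```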
A Real-Time Micropolygon Rendering Pipeline (evolving the GPU pipeline for tiny triangles)
GPUs will soon have the compute horsepower to render scenes containing cinematic-quality surfaces in real-time. Unfortunately, if they render these subpixel polygons (micropolygons) using the same techniques as they do for large triangles today, GPUs will perform extremely inefficiently. Instead of trying to parallelize Pixar's Reyes micropolygon rendering system, we're taking a hard look at how the existing Direct3D 11 rendering pipeline, and GPU hardware implementations, must evolve to render micropolygon workloads efficiently in a high-throughput system. Changes to software interfaces, algorithms, and HW design are fair game! Slides describing what we've learned can be found in this SIGGRAPH course talk or in my dissertation: Evolving the Real-Time Graphics Pipeline for Micropolygon Rendering.
GRAMPS (a framework for heterogeneous parallel programming)
There are two ways to think about GRAMPS. Graphics folks should think of GRAMPS as a system for building custom graphics pipelines. We simply gave up on adding more and more configurable knobs to existing pipelines like OpenGL/Direct3D and instead allow the programmer to programmatically define a custom pipeline with an arbitrary number of stages connected by queues. To non-graphics folks, GRAMPS is a stream programming system that embraces heterogeneity in underlying architecture and anticipates streaming workloads that exhibit both regular and irregular (dynamic) behavior. The GRAMPS runtime dynamically schedules GRAMPS programs onto architectures containing a mixture of compute-optimized cores, generic CPU cores, and fixed-function processing units.
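To make the programming model concrete, here is a toy sketch (Python threads and queues, not the GRAMPS API): the application declares its own stages connected by queues, and a runtime is then free to map those stages onto CPU cores, GPU cores, or fixed-function units.

```python
# Conceptual sketch (not GRAMPS itself): a custom pipeline is a graph of
# stages connected by queues, rather than a fixed OpenGL/Direct3D pipeline
# with configurable knobs.
import queue
import threading

def stage(fn, q_in, q_out):
    """Run one pipeline stage: pull work from q_in, push results to q_out."""
    while True:
        item = q_in.get()
        if item is None:                 # sentinel: shut the stage down
            if q_out is not None:
                q_out.put(None)
            break
        result = fn(item)
        if q_out is not None and result is not None:
            q_out.put(result)

# A toy two-stage pipeline fed by the main thread: transform -> consume.
q1, q2 = queue.Queue(), queue.Queue()
threads = [
    threading.Thread(target=stage, args=(lambda x: x * x, q1, q2)),
    threading.Thread(target=stage, args=(print, q2, None)),
]
for t in threads:
    t.start()
for i in range(5):
    q1.put(i)
q1.put(None)
for t in threads:
    t.join()
```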
The Sequoia Programming Language ("Programming the Memory Hierarchy")
Sequoia is a hierarchical stream programming language that arose from the observation that expressing locality, not parallelism, is the most important responsibility of parallel application programmers in scientific/numerical domains. Sequoia presents a parallel machine as an abstract hierarchy of memories and gives the programmer explicit control over data locality and communication through this hierarchy using first-class language constructs (basically, Sequoia supports nested kernels and streams of streams). Sequoia programs have run on a variety of exposed-communication architectures such as clusters, the Cell processor, GPUs, and even supercomputing clusters at Los Alamos. The best way to learn about Sequoia is to read our SC06 paper.
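Here is the idea in miniature (Python/NumPy rather than Sequoia syntax, with made-up block sizes): a task recursively splits its working set into blocks sized for the next level of the memory hierarchy, and only leaf tasks touch the data directly.

```python
# Sketch of the programming idea behind Sequoia (not Sequoia syntax):
# recursively decompose work to match a memory hierarchy; leaves do the
# actual arithmetic on data that is now "close".
import numpy as np

# Hypothetical block sizes for two levels of a memory hierarchy.
LEVEL_BLOCK = [512, 64]

def matmul(C, A, B, level=0):
    """Blocked matrix multiply: inner tasks operate on sub-blocks sized for
    the next memory level; leaf tasks perform the multiplication."""
    n = C.shape[0]
    if level == len(LEVEL_BLOCK) or n <= LEVEL_BLOCK[-1]:
        C += A @ B                      # leaf kernel
        return
    b = LEVEL_BLOCK[level]
    for i in range(0, n, b):
        for j in range(0, n, b):
            for k in range(0, n, b):
                matmul(C[i:i+b, j:j+b], A[i:i+b, k:k+b], B[k:k+b, j:j+b],
                       level + 1)

n = 1024
A, B, C = np.random.rand(n, n), np.random.rand(n, n), np.zeros((n, n))
matmul(C, A, B)
assert np.allclose(C, A @ B)
```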
Brook/Merrimac (stream processing for scientific computing)
I helped out with the BrookGPU (abstracting the GPU as a stream processor for numerical computing) and Merrimac Streaming Supercomputer projects. Brook was the academic precursor to NVIDIA's CUDA.
SUPPORT
Our work has been supported by the National Science Foundation (IIS-1253530, IIS-1422767, IIS-1539069) and by Intel, NVIDIA, Qualcomm, Google, Adobe, Facebook, Activision, Apple, Amazon, the Internet Archive, and the Brown Institute for Media Innovation.