Jan Ackermann | Research & Engineering
News
Mar 2026 Our paper "Policy-based Foveated Imaging and Perception" is accepted to SIGGRAPH Conference 2026!
Feb 2026 Our paper "BulletTime: Decoupled Control of Time and Camera Pose for Video Generation" is accepted to CVPR 2026!
Dec 2025 We released two datasets and reports on information-seeking behavior in AI agents.
Jun 2025 Our paper "CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization" is accepted to ICCV 2025!
May 2025 We will present a short paper on pre-training transfer for digital garments at the MMFM workshop at CVPR 2025!
Feb 2025 Our paper "AIpparel: A Multimodal Foundation Model for Digital Garments" is accepted to CVPR 2025!
Sep 2024 Started my visit at Stanford University. See you there!
Asymmetric Flow Modeling
Hansheng Chen, Jan Ackermann, Minseo Kim, Gordon Wetzstein, Leonidas Guibas
In Submission
TL;DR: We propose AsymFlow, a rank-asymmetric velocity parameterization that predicts noise in a low-rank subspace while preserving full-dimensional data prediction. This analytically recovers full-dimensional velocity without changing architecture or training, setting new state-of-the-art performance for pixel-space image generation and enabling effective finetuning from pretrained latent flow models.
Policy-based Foveated Imaging and Perception
SIGGRAPH Conference, 2026
TL;DR: We introduce a real-time, task-aware foveated imaging system that learns a policy to allocate scarce sensor bandwidth to the most informative regions while preserving low-resolution global context. This closes the perception-acquisition loop at capture time and delivers strong perception performance under strict pixel budgets in both simulation and real-world 200MP sensor experiments.
GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation
Jan Ackermann, Shengqu Cai*, Boyang Deng*, Zhengfei Kuang*, Songyou Peng, Gordon Wetzstein
In Submission
TL;DR: We introduce a geometry-consistency reward that enforces physically plausible motion in generated videos. By decomposing flow into rigid camera motion and independent object motion, we turn geometric coherence into an explicit optimization objective via reinforcement fine-tuning, substantially reducing temporal artifacts while preserving quality.
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Yiming Wang, Qihang Zhang*, Shengqu Cai*, Tong Wu†, Jan Ackermann†, Zhengfei Kuang†, Yang Zheng†, Frano Rajic†, Siyu Tang, Gordon Wetzstein
Computer Vision and Pattern Recognition (CVPR), 2026
TL;DR: We propose a novel approach to video generation that decouples the control of time and camera pose. We introduce a new dataset and a model that together allow videos to be generated under both temporal and spatial conditioning.
CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization
Jan Ackermann, Jonas Kulhanek, Shengqu Cai, Haofei Xu, Marc Pollefeys, Gordon Wetzstein, Leonidas Guibas, Songyou Peng
International Conference on Computer Vision (ICCV), 2025
TL;DR: We explore how to adapt an existing 3DGS scene representation to incremental changes. We propose a novel and efficient way to identify changed regions and then optimize them locally. This not only produces more accurate scene updates but also enables new applications.
AIpparel: A Multimodal Foundation Model for Digital Garments
Kiyohiro Nakayama*, Jan Ackermann*, Timur L. Kesdogan*, Yang Zheng, Maria Korosteleva, Olga Sorkine-Hornung, Leonidas Guibas, Guandao Yang, Gordon Wetzstein
Computer Vision and Pattern Recognition (CVPR), 2025 Highlight
TL;DR: We introduce AIpparel, the first large-scale multimodal generative model designed specifically for digital garments. By extending LLaVA to incorporate a new garment modality, AIpparel enables the creation of sewing patterns from text, image, and garment inputs.
Do Efficient Transformers Really Save Computation?
Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang, Yunzhen Feng, Qiwei Ye, Di He, Liwei Wang
International Conference on Machine Learning (ICML), 2024
TL;DR: We explore the class of Linear and Sparse Transformers in a Chain-of-Thought (CoT) setting, finding that to match the performance of regular Transformers, their hidden dimensions must scale with the problem size.
Maskomaly: Zero-shot Mask Anomaly Segmentation
Jan Ackermann, Christos Sakaridis, Fisher Yu
British Machine Vision Conference (BMVC), 2023 Oral
TL;DR: We show that pretrained mask-based segmentation models can predict anomalies without further tuning. Additionally, we introduce a metric for anomaly segmentation that favors models with confident predictions.
Mentored Students
I have had the pleasure of working with the following students:
- Howard Xiao (2025-2026): PhD Student at Stanford University
- Emir Can (2025): MSc Student at Stanford University
- Yiming Wang (2025): Direct Doctorate Student at ETH Zurich