Perception Through Structured Generative Models @ ECCV 2020
Overview
The highly structured nature of the visual world has inspired many computer vision researchers to pursue an analysis-by-synthesis approach: to understand an image, one should be able to reproduce it with a model. A good model should also be able to extrapolate into unseen space or time: given a 2D or 2.5D image of a partially occluded object, what is the full 3D extent? Given a fragment of a video, how does the remainder play out? Generative models of images, video, and 3D data have made great strides in recent years, but their utility as causal or interpretable models has not always advanced in step. For example, while GANs can currently generate beautiful images, they do not necessarily learn a latent space of graphics-like or semantically interpretable elements. In this workshop, we aim to explore how generative models can facilitate perception, and in particular, how to design and use structured generative models (of images, video, and 3D data) for computer vision inference applications.
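To make the analysis-by-synthesis idea concrete, here is a minimal sketch (illustrative only, not taken from any paper or talk at this workshop): given a frozen pretrained generator, "understanding" an image amounts to searching the latent space for a code that reproduces it. The `generator` callable, the latent dimensionality, and the optimization settings below are all assumptions for illustration.

```python
# Minimal analysis-by-synthesis sketch: infer a latent code by optimizing it
# so that a frozen generative model reproduces the observed image.
# `generator` is a hypothetical pretrained decoder (latent -> image tensor).
import torch
import torch.nn.functional as F

def infer_latent(generator, target, latent_dim=128, steps=500, lr=0.05):
    """Recover a latent code whose synthesis matches `target` (a CxHxW tensor)."""
    z = torch.zeros(1, latent_dim, requires_grad=True)  # latent code to optimize
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        reconstruction = generator(z)                   # synthesize an image from z
        loss = F.mse_loss(reconstruction, target.unsqueeze(0))
        loss.backward()                                 # gradients flow through the frozen generator
        optimizer.step()
    return z.detach()                                   # the recovered explanation of the image
```

If the generator is structured (e.g., its latent variables correspond to pose, shape, and lighting), the recovered code doubles as an interpretable scene description; if it is unstructured, it typically does not, which is precisely the gap this workshop targets.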
Recordings of Invited Talks
Accepted papers
- Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva. "3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis". PDF link.
- Sarthak Bhagat, Vishaal Udandarao, Shagun Uppal, Saket Anand. "DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors". PDF link.
- Weiyu Du, Oleh Rybkin, Lingzhi Zhang, Jianbo Shi. "Toward Continuous-Time Representations of Human Motion". PDF link.
Call for papers
We are soliciting original contributions in computer vision, robotics, and machine learning relating to the following topics:
- Inverse graphics
- Generative models for images, video, 3D data
- Reconstruction or prediction as objectives for representation learning
- Learning disentangled and/or interpretable representations
- Novel methods for structured generative modelling
- Generation for prediction, anomaly detection, compression, search, etc.
- Managing and leveraging visual stochasticity
- Incorporating hierarchy and graphics-like elements into machine learning
- Causal and forward models of visual data
Submission deadline: August 20. We encourage authors to submit earlier, anytime between August 1 and August 20, to help us spread out the reviewing work.
Submit your paper to our OpenReview site, using the ECCV 2020 "final copy" LaTeX kit.
Submissions should be 4 pages long, including references. The 4-page limit helps eliminate dual-submission conflicts with ECCV and other conferences. (E.g., even papers accepted to ECCV may be dual-submitted here, provided that they are shortened to 4 pages.)
The workshop organizers will review submissions in a single-blind fashion. All accepted papers will be presented in a poster session and published in the proceedings.
Program schedule
Morning/afternoon session
09:00 Pittsburgh (EDT) / 14:00 London (UTC+1)
Max Welling
Combining Generative and Discriminative Models
09:40 Pittsburgh (EDT) / 14:40 London (UTC+1)
J. Kevin O'Regan
Thinking about vision in a different way: the world as an outside memory
10:20 Pittsburgh (EDT) / 15:20 London (UTC+1)
Peter Battaglia
Structured understanding and interaction with the world
11:00 Pittsburgh (EDT) / 16:00 London (UTC+1)
Poster session 1
Afternoon/evening session
17:00 Pittsburgh (EDT) / 22:00 London (UTC+1)
Sanja Fidler
AI for 3D Content Generation
17:40 Pittsburgh (EDT) / 22:40 London (UTC+1)
Ruslan Salakhutdinov
Geometric Capsule Autoencoders for 3D Point Clouds
18:20 Pittsburgh (EDT) / 23:20 London (UTC+1)
Carl Vondrick
Data and Task Generalization
19:00 Pittsburgh (EDT) / 00:00 London (UTC+1)
Poster session 2
Access the Zoom links through the ECCV virtual platform.
Organizers
Contact: Adam Harley (aharley@cmu.edu).
Website theme based on Scene Graph Representation and Learning.