Computer Vision: Fall 2024

Computer Vision (CMU 16-385)

This course provides a comprehensive introduction to computer vision. Major topics include image processing, detection and recognition, geometry-based and physics-based vision, and video analysis. Students will learn the basic concepts of computer vision and gain hands-on experience solving real-life vision problems.

Basic Info

Mon/Wed 11:00am-12:20pm

Tepper 1403

See the Course Info page for more info on policies and logistics.

Getting Started

To get started with the class you need to do just three things:

Sign up for the course Piazza.
Sign up for an account on this webpage. (The signup code is on Canvas.)
Carefully read through the Course Info.

Fall 2024 Schedule

Aug 26 (Mon)	Course Introduction
Aug 28 (Wed)	Image Filtering
Sep 2 (Mon)	NO CLASS (Labor Day)	Assignment 1 out
Sep 4 (Wed)	Image Pyramids and Frequency Domain
Sep 9 (Mon)	Hough Transform
Sep 11 (Wed)	Detecting Corners
Sep 16 (Mon)	Feature Detectors and Descriptors
Sep 18 (Wed)	2D Transformations	Assignment 1 DUE; Assignment 2 out
Sep 23 (Mon)	Image Homographies
Sep 25 (Wed)	Geometric Camera Models
Sep 30 (Mon)	Geometric Camera Models (cont.)
Oct 2 (Wed)	Two-View Geometry	Assignment 2 DUE; Assignment 3 out
Oct 7 (Mon)	Stereo
Oct 9 (Wed)	Image Classification
Oct 14 (Mon)	NO CLASS (Fall Break)
Oct 16 (Wed)	NO CLASS (Fall Break)
Oct 21 (Mon)	Image Classification (cont.)
Oct 23 (Wed)	Neural Networks	Assignment 3 DUE; Assignment 4 out
Oct 28 (Mon)	Neural Networks (cont.)
Oct 30 (Wed)	Convolutional Neural Networks
Nov 4 (Mon)	Optical Flow
Nov 6 (Wed)	Alignment and Tracking	Assignment 4 DUE; Assignment 5 out
Nov 11 (Mon)	Alignment and Tracking (cont.)
Nov 13 (Wed)	Deep Tracking
Nov 18 (Mon)	Deep Optical Flow / Deep Stereo; Generative Adversarial Networks (GANs)
Nov 20 (Wed)	Generative Adversarial Networks (GANs), Part 2	Assignment 5 DUE; Assignment 6 out
Nov 25 (Mon)	Variational AutoEncoders (VAEs) / Diffusion / Societal Impacts
Nov 27 (Wed)	NO CLASS (Thanksgiving)
Dec 2 (Mon)	Advanced architectures: Batch Norm, ResNet, RNN, Transformer
Dec 4 (Wed)	Deep 3D Point Cloud Processing
Dec 6 (Fri)	Assignment 6 DUE

Assignments

Assignments will be released via Piazza. A list of assignments is available below. Reference material is available on the Lectures page.

(Due Sep 18)	Programming Assignment 1: Image Filtering and Hough Transform

(Due Oct 2)	Programming Assignment 2: Augmented Reality with Planar Homographies

(Due Oct 23)	Programming Assignment 3: 3D Reconstruction

(Due Nov 6)	Programming Assignment 4: Scene Recognition with Bag of Words

(Due Nov 20)	Programming Assignment 5: Neural Networks for Recognition

(Due Dec 6)	Programming Assignment 6: Video Tracking

Acknowledgments

The lecture notes have been pieced together from many different people and places. Special thanks to colleagues for sharing their slides: Matt O'Toole, Kris Kitani, Bob Collins, Srinivasa Narasimhan, Martial Hebert, Alyosha Efros, Ali Farhadi, Deva Ramanan, Yaser Sheikh, and Todd Zickler. Many thanks also to the following people for making their lecture notes and materials available online: Steve Seitz, Richard Szeliski, Larry Zitnick, Noah Snavely, Lana Lazebnik, Kristen Grauman, Yung-Yu Chuang, Tinne Tuytelaars, Fei-Fei Li, Antonio Torralba, Rob Fergus, David Claus, and Dan Jurafsky.