CS 294-5 Object Recognition
CS 294-6 Recognizing People, Objects and Actions
Jitendra Malik
Spring 2004
405 Soda Hall
Tu 4-6
Course Content
This course will be designed around the challenge problem of making
computers aware of the everyday visual world, that is, able to process
images or video and recognize categories such as cars, buses, tigers,
zebras, rooms, doors, telephones, faces, arms and hands, as well as
actions such as running, jumping and kicking. Topics will include a
survey of human visual recognition: perception and physiology,
recognition in the presence of transformations, local matching
techniques, global matching techniques, segmentation as a front end,
motion descriptors for action recognition, as well as case studies of
recognition in different domains. I have a specific list of about 300
visual categories to focus our thoughts.
Lecture Topics
- Introduction: Characteristics of visual recognition. Prototypes and affordances. Basic, superordinate and subordinate categories (reference: Palmer, Chapter 9)
- Human visual system
  - Basic computations in retina, LGN, V1, V2
  - Models of receptive fields: center-surround, oriented, simple/complex
  - Cortical magnification factor, log-polar mapping
- Five approaches to handwritten digit recognition
- Template matching using distance transform variants
  - Chamfer distance
  - Hausdorff distance
- Discussion of transformations in general
  - D'Arcy Thompson, Fischler and Elschlager, Grenander
  - Similarity and affine transforms
  - Smooth diffeomorphisms, thin plate splines
- Local scale-invariant keypoint features
- Pose estimation, pose clustering, geometric hashing, basis views
- Multiple view approaches to 3D objects - aspects, k-medoids
- Perceptual Organization - Grouping, figure/ground
- The Human Body
- Human Movement
- Scenes
- Project presentations
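To give a flavor of the template matching topics in the list above, here is a minimal sketch of the chamfer and Hausdorff distances between two point sets, such as edge points from a template and from an image. It is an illustration under stated assumptions, not course material: the function names and the toy point sets are hypothetical, the chamfer distance is taken here as a mean of nearest-neighbor distances (other normalizations are in use), and nearest neighbors are found by brute force rather than by the distance transform that a practical implementation would precompute.

```python
import math


def nearest_distance(p, points):
    """Euclidean distance from point p to the closest point in points."""
    return min(math.dist(p, q) for q in points)


def directed_chamfer(a, b):
    """Mean distance from each point of a to its nearest neighbor in b."""
    return sum(nearest_distance(p, b) for p in a) / len(a)


def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest neighbor in b."""
    return max(nearest_distance(p, b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance: the worse of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical toy data: a three-point template and image edge points
# that contain the template plus one outlier.
template = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
image_edges = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 3.0)]

print(directed_chamfer(template, image_edges))  # template fully explained
print(directed_chamfer(image_edges, template))  # outlier raises the mean
print(hausdorff(template, image_edges))         # outlier dominates the max
```

In this toy case the template-to-image chamfer distance is zero because every template point has an exact match, while the single outlier sets the Hausdorff distance on its own. That sensitivity to outliers is one reason the two measures behave differently in cluttered images.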
There is no required text for this course. Steve Palmer's Vision Science and Forsyth and Ponce's Computer Vision: A Modern Approach have useful source material.
We will use a scribe system to make course notes available throughout the semester. For each lecture, one or two students will take notes and type them up. I'll edit the notes and make them available on the web.
The grade will be determined by a combination of home assignments, scribe notes, and a final project. The project could be a mathematical or statistical analysis of a visual task, an implementation of an interesting algorithm, or a psychophysical experiment.
You'll be encouraged to work in teams for the projects and for the home assignments.
I hope you enjoy the course!