CMU Advanced Perception Seminar, Spring 1999
Table of Contents
- Class Format
- What Should be in a Critique?
- Grading Policy
- Computer Vision Resources
- Overview of topics, by week
- Week 1. Introduction and Explanation
- Week 2. Edge Extraction
- Week 3. Region/Volume Segmentation
- Week 4. Active Contours
- Week 5. Object Recognition
- Week 6. Volumetric Registration
- Week 7. Projective Geometry
- Week 8. Symmetry and Perception
- Week 9. Stabilization And Mosaicing
- Week 10. Egomotion and Structure from Motion
- Week 11. New View Synthesis
- Week 12. Range Imaging
- Week 13. Auditory Sensing
Class Format
The Advanced Perception course is a graduate reading seminar, meeting once a week to discuss a set of papers covering a specific topic in computer vision and perception. We will look at historically important papers in the field, as well as current papers from recent conferences and journals. By reading a mixture of both types of papers, we will be able to trace the development of some of the fundamental ideas that make up current-day research.
Each week, two papers on a particular topic will be assigned. After reading them, you must find a third paper on your own that is relevant to the topic (for example, in Week 2 you will find a paper on edge extraction, published in a conference proceedings or archival journal). Finally, you will write a short critique/essay (3-4 pages) on the topic area based on the three papers you have read. This essay will be handed in for grading. During class, each of the two assigned papers will be presented by one of the students (one student per paper, assigned the week before). This is expected to be a formal 20-minute presentation in front of the class, using transparencies. The presentation will then evolve into a class discussion on the topic covered in the paper. The instructors are responsible for keeping the discussion in a fruitful vein and making sure all students get a chance to participate. The instructors are also responsible for making sure that the important points are touched upon during the discussion, which will sometimes mean asking questions of the class, and for making sure that each paper is covered (which sometimes means cutting off discussion and moving on).
At the end of the class, we will go around the room asking each of you to cite the third paper you have personally chosen for that week, very briefly describe it (1 minute), tell us why you picked it (i.e. how does it relate to the topic area and the two assigned papers), and finally whether or not you would recommend that paper for others to read.
What Should be in a Critique?
The critiques you write will provide a short summary and analysis of the technical papers you have read each week. Critique writing is an important component of the class, and serves several goals: to give you practice in technical writing, to concretely organize your ideas in preparation for class discussion, and to develop the skills necessary to become a good conference/journal paper referee. Furthermore, getting in the habit of writing critiques of the papers you have read will help you do better research - a good critique provides a concise summary that you can refer to later without having to dig out and read the original work, and can provide a written starting point for the obligatory literature review section of your own papers/thesis. To help provide you with a sense for what goes into a critique, see the handout `The Task of the Referee,' by Alan Jay Smith, particularly the section entitled `Evaluating a Research Paper.'
We have found that it is helpful to us, when grading critiques, to have them all follow a consistent format. We ask you to hand in critiques with roughly the following sections (in this order):
- Reviewer: your name and the date
- Citation: the title, author, year, and publication citation of the three papers you are reviewing
- A one paragraph summary (abstract) of the topic area. Why is it important?
- A short overview of each paper including a) key ideas, b) technical approaches and c) results.
- Comparison of the papers, including strong points and weak points of each. How would you rank each paper relative to the others?
- Questions and issues
We will grade critiques on a three-level scale: check-minus, check, check-plus. Above-average resourcefulness, initiative, creativity and depth of analysis will get a check-plus. Missing any of the six required sections listed above, or an obvious lack of effort on any of them, results in a check-minus.
Pay attention to your speling and grammar of English. :-)
Grading Policy
You will be graded on the following items:
| Item | Weight |
|---|---|
| 1. Written Critiques | 40% |
| 2. Oral Presentations | 20% |
| 3. Class Participation | 20% |
| 4. Take-Home Final | 20% |
| 5. Extra Credit | 10% |
| Total | 110% |
Written critiques form the highest-weighted category, as they represent the bulk of the work that you will be performing (aside from reading the papers themselves). Each critique will be graded based on your demonstration that you know what that week's papers are about and have carefully considered their technical approaches and reported results. We are particularly interested in how well you compare and contrast the three papers that you read that week.
Oral presentation refers to the formal presentation of a paper in front of the class. Depending on class size, you will give roughly two or three oral paper presentations during the semester. To make it more like a real conference presentation, your talk will be strictly timed to be 20 minutes long. We suggest you carefully organize and prioritize what you want to say, and maybe even practice it once with a watch.
Class participation is rather hard to judge objectively (but we are going to try). We highly encourage you to participate in class discussion, and indeed, this type of class will be a complete failure if people don't speak up with their opinions. On the other hand, we don't wish to penalize folks who aren't naturally talkative. We will try to ensure that even soft-spoken people get a chance to air their opinions, and will attempt to grade based on the insightfulness of your comments, rather than the frequency or volume.
There will be a take-home final exam. It will involve writing!
The extra credit category will reflect both objective evidence and subjective impressions we receive that indicate you are genuinely putting in a lot of effort. Anything you do (of a professional nature, related to this class) that makes us like you better, will increase your extra credit score.
Computer Vision Resources
There are many places to go to look for computer vision papers, ranging from archival journals to on-line web sites. Here is a list of our favorite sources of material:
Archival Journals
- International Journal of Computer Vision (IJCV)
- Computer Vision and Image Understanding (CVIU)
  - formerly Computer Vision, Graphics and Image Processing (CVGIP)
- IEEE Trans on Pattern Analysis and Machine Intelligence (PAMI)
- Image and Vision Computing (IVC)
- Pattern Recognition (PR)
Conference Proceedings
- International Conference on Computer Vision (ICCV)
- Computer Vision and Pattern Recognition (CVPR)
- European Conference on Computer Vision (ECCV)
- DARPA Image Understanding Workshop (IUW)
WWW Resources
- Computer Vision Homepage
http://www.cs.cmu.edu/afs/cs/project/cil/ftp/html/vision.html
Overview of Topics by Week (Selections subject to change)
Week 1. Introduction and Explanation
Introduction; explanation of class format and logistics. Instructors talk about computer vision resources, and why particular papers were selected for this course. Discussion of how to write a critique, give a presentation, and find relevant research papers.
Week 2. Feature Extraction I: Edge Extraction
(Reminder: read these two and also find a third related paper on your own.)
- E.C.Hildreth, `The Detection of Intensity Changes by Computer and Biological Vision Systems,' Computer Vision, Graphics and Image Processing, Vol. 27, 1983, pp.1-27.
- J.F.Canny, `A Computational Approach to Edge Detection,' _IEEE Trans. on Pattern Analysis and Machine Intelligence,_ Vol. 8(6), November 1986, pp. 679-698.
Week 2 Third Papers (selected by the students):
- C. Harris, B. Buxton, `Low-level Edge Detection Using Genetic Programming: Performance, Specificity, and Application to Real-World Signals,' June 1997, University College London Tech Report RN/97/34. -- This paper describes how genetic programming can be used to evolve a set of edge detectors specific to a training dataset. These detectors are shown to outperform both theoretically optimal detectors and other evolved detectors.
- P. Perona and J. Malik, `Scale-Space and Edge Detection Using Anisotropic Diffusion,' IEEE Trans. on PAMI, Vol. 12(7), July 1990. -- This paper presents a global approach to edge detection which formulates edge detection as a diffusion process and attempts to find edges via global deformation of the image rather than local sliding-window operations. Second summary: Instead of detecting edges locally, this paper approaches the problem globally. It views convolution with a Gaussian as similar to the solution of the heat conduction/diffusion equation. The approach fixes many of the shortcomings of convolution-based and Canny edge detectors; however, the computational cost is higher on a sequential machine.
- Asada et al., `Edge and Depth from Focus,' IJCV, 26(2), 1998, pp. 153-163. -- Edges are extracted by observing the blurring in an image when a series of de-focussing operations is deliberately introduced.
- Y. Lu and R. C. Jain, `Reasoning about Edges in Scale Space,' IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14(4), April 1992. -- RESS is a method of integrating edges from multiple scales of the LoG edge operator using a knowledge base of the behavior of edges at different scales.
- D. Demigny and T. Kamle, `A Discrete Expression of Canny's Criteria for Step Edge Detector Performances Evaluation,' IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19(11), November 1997, pp. 1199-1211. -- Since all filters are implemented in the discrete domain, this paper proposes three criteria (similar to Canny's three criteria) to optimize filters directly in the discrete domain; the paper also shows that optimizing the three discrete-domain criteria yields better results than sampling the optimized Canny filter.
- MIT AI lab memo: AI memo 773, April 1984.
- J.B. Burns, A.R.Hanson and E.M.Riseman, `Extracting Straight Lines,' IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 8(4), July 1986, pp. 425-455. -- This paper presents an approach for the extraction of straight lines in intensity images. It starts at the level of lines directly without going through the intermediate stage of first detecting local edges. The authors argue that this overcomes the difficulties encountered in aggregation when using local operators.
- A. A. Farag and E. J. Delp, `Edge Linking By Sequential Search,' Pattern Recognition, Vol. 28(5), 1995, pp. 611-633. -- If edge detection is treated as a two-stage process (edge enhancement followed by edge linking), edge linking deserves more attention than Canny gave it in his detector. Farag and Delp use a Laplacian of Gaussian operator for edge enhancement and A* (or stack) search with a mathematically sound heuristic for edge linking.
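The diffusion view of edge detection summarized in the Perona and Malik entry above can be illustrated with a small sketch. This is not the authors' implementation: it is a 1-D simplification, and the function name `perona_malik_1d` and the parameters `kappa` and `step` are assumptions chosen for illustration. The conduction coefficient used here is an exponential function of the local difference, so smoothing proceeds freely inside regions and almost stops across large jumps.

```python
import math

def perona_malik_1d(signal, iterations=20, kappa=10.0, step=0.2):
    """Smooth a 1-D signal while discouraging diffusion across large jumps.

    Illustrative sketch only: kappa (edge threshold) and step (time step,
    kept <= 0.5 for stability in 1-D) are assumed parameters.
    """
    u = [float(v) for v in signal]
    for _ in range(iterations):
        nxt = u[:]
        for i in range(len(u)):
            # Differences to the right and left neighbours (zero flux at the ends).
            d_right = u[i + 1] - u[i] if i + 1 < len(u) else 0.0
            d_left = u[i - 1] - u[i] if i > 0 else 0.0
            # Conduction coefficient exp(-(d / kappa)^2): close to 1 for small
            # differences, close to 0 across strong edges.
            c_right = math.exp(-(d_right / kappa) ** 2)
            c_left = math.exp(-(d_left / kappa) ** 2)
            nxt[i] = u[i] + step * (c_right * d_right + c_left * d_left)
        u = nxt
    return u

if __name__ == "__main__":
    noisy_step = [0, 1, 0, 1, 0, 1, 100, 101, 100, 101, 100, 101]
    print(perona_malik_1d(noisy_step, iterations=50))
```

On the noisy step in the usage example, the small ripples on each side flatten out while the large jump in the middle is left almost untouched, which is the behaviour the summaries above describe.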
Week 3. Feature Extraction II: Region/Volume Segmentation
- T.Kapur, W.E.Grimson, W.Wells and R.Kikinis, `Segmentation of Brain Tissue from Magnetic Resonance Images,' Medical Image Analysis, Vol.1(2), 1996, pp. 109-127.
- B.Maxwell and S.Shafer, `Physics-Based Segmentation of Complex Objects using Multiple Hypotheses of Image Formation,' Computer Vision and Image Understanding, Vol.65(2), Feb 1997, pp.269-295.
Week 3 Third Papers (selected by the students):
- V. Rehrmann and L. Priese, `Fast and Robust Segmentation of Natural Color Scenes,' Proceedings of the Third Asian Conference on Computer Vision, Hong Kong, Jan 1998. -- This paper describes the CSC algorithm, Color Structure Code, for performing real-time segmentation of color images. Images are represented with hexagonal connectivity using a hierarchical tree structure. Regions are created by color similarity comparisons of local elements, with provision for later splitting regions that prove to be dissimilar at a global level of analysis.
- M.A. Gonzalez Ballester, A. Zisserman, and J.M. Brady, `Measurement of Brain Structures based on Statistical and Geometrical 3D Segmentation,' MICCAI'98. To appear. -- This paper presents a method for three-dimensional segmentation and measurement of volumetric data based on the combination of statistical and geometrical information. The shape of complex three-dimensional structures, such as the cortex, is represented by combining a discrete 3D simplex mesh with the construction of a smooth surface using triangular Gregory-Bezier patches. Confidence bounds are produced for all the measurements, thus obtaining bounds on the position of the surface segmenting the image.
- T. Uchiyama and M. A. Arbib, `Color Image Segmentation Using Competitive Learning,' IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.16(12), Dec.1994, pp.1197-1206. -- This paper deals with the problem of colour image segmentation; clusters of the same colour are identified using competitive learning, thereby producing the least sum of squares solution.
- T. Leung and J. Malik, `Contour Continuity in Region-Based Image Segmentation,' _Fifth Euro. Conf. on Computer Vision,_ Freiburg, Germany, June 1998. -- The paper takes into account contour continuity, in addition to intensity, color and texture, to determine the partitioning of an image. The `soft' contours of the image are first detected using elongated filters and the Hilbert transform, which yields an `orientation energy' measure. The orientation energy is used as a basis for propagating contours. Afterward, the regions are segmented using the normalized cut approach.
- B. Leroy, I.L. Herlin, L.D. Cohen, `Multi-Resolution Algorithms for Active Contour Models,' Proceedings of the 12th International Conference on Analysis and Optimization of Systems, Images, Wavelets and PDE'S, Rocquencourt (France), 1996. -- The paper attempts to speed up active contour models (balloons) by going multi-resolution in two separate ways. The first uses multi-resolution data; the second incorporates multi-resolution into the model itself (by using elliptic Fourier harmonics).
Week 4. Feature Extraction III: Active Contours
- M.Kass, A.Witkin, D.Terzopoulos, `Snakes: Active Contour Models,' International Journal of Computer Vision, Vol.1(4), January 1988, pp. 321-331.
- A.Pentland and S.Sclaroff, `Closed-Form Solutions for Physically Based Shape Modeling and Recognition,' IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 13, no. 7, July 1991, pp. 715-729.
Week 4 Third Papers (selected by the students):
- F. Leymarie and M. Levine, `Tracking Deformable Objects in the Plane Using an Active Contour Model,' IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 6, June 1993, pp. 617-634. -- This paper suggests improvements on the original snake active contour model (Kass, Witkin, Terzopoulos): 1) a different terminating criterion to improve convergence, 2) selection of bounds on parameters to prevent oscillation, and 3) initialization using a sequence of hierarchical discrete correlations (Burt and Adelson Laplacian pyramid). Active contours, along with the proposed modifications, are used to track the movement of cells on microscope slides.
- S.C. Zhu, T.S. Lee, A.L. Yuille, `Region Competition: Unifying Snakes, Region Growing, Energy/Bayes/MDL for Multi-band Image Segmentation,' Proceedings of the Fifth ICCV, pp. 416-425, 1995. -- Using statistical properties, the new Region Competition algorithm combines the best features of snakes/balloons, region growing, and Bayes/MDL. The algorithm allows pixels inside regions to compete for pixels along boundaries. The likelihood of membership in a region is determined using statistical properties.
- Michael Isard and Andrew Blake, `Contour tracking by stochastic propagation of conditional density,' _Proc. European Conference on Computer Vision,_ vol. 1, pp. 343-356, Cambridge UK, (1996). -- The paper proposes a stochastic algorithm (the Condensation algorithm) for tracking curves in dense visual clutter. It uses `factored sampling', a method previously applied to interpretation of static images, in which the distribution of possible interpretations is represented by a randomly generated set of representatives. The algorithm combines factored sampling with learned dynamical models to propagate an entire probability distribution for object position and shape over time. The result is highly robust real-time tracking of agile motion in clutter. This is a clearly written paper with a good explanation of the proposed technique; it contains experimental results and a complexity analysis. Not surprisingly it won the best paper award.
- A. Hoover, D. Goldgof, K. W. Bowyer, `Extracting a Valid Boundary Representation from a Segmented Range Image,' _IEEE Trans. on Pattern Analysis and Machine Intelligence,_ vol. 17, no. 9, September 1995, pp. 920-925. -- This paper addresses the problem of creating boundary representations (b-rep) of polyhedral shapes by using topological and geometric information, and also including a hypothetical representation of the non-visible section of the object.
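The factored sampling idea described in the Isard and Blake entry above (a probability distribution represented by a weighted set of random samples and propagated over time) can be sketched for a single scalar state. This is a toy illustration under stated assumptions, not the paper's tracker: the function name `condensation_step`, the constant-drift dynamics, and the Gaussian observation likelihood are all hypothetical choices made to keep the example short.

```python
import math
import random

def condensation_step(particles, weights, observation, rng,
                      drift=0.0, process_noise=1.0, obs_noise=1.0):
    """One select/predict/measure cycle on a scalar state (illustrative only)."""
    n = len(particles)
    # 1. Select: resample the set with probability proportional to weight.
    selected = rng.choices(particles, weights=weights, k=n)
    # 2. Predict: apply the assumed dynamics (constant drift plus Gaussian noise).
    predicted = [x + drift + rng.gauss(0.0, process_noise) for x in selected]
    # 3. Measure: reweight each sample by an assumed Gaussian likelihood
    #    of the observation given that sample.
    raw = [math.exp(-0.5 * ((observation - x) / obs_noise) ** 2) for x in predicted]
    total = sum(raw)
    if total == 0.0:
        # Every sample is far from the observation; fall back to uniform weights.
        return predicted, [1.0 / n] * n
    return predicted, [w / total for w in raw]

if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    weights = [1.0 / len(particles)] * len(particles)
    for _ in range(10):
        particles, weights = condensation_step(particles, weights, 5.0, rng)
    # Weighted mean of the sample set as a point estimate of the state.
    print(sum(p * w for p, w in zip(particles, weights)))
```

The sample set carries the whole distribution forward, so several competing hypotheses can survive from one step to the next. That is the property the summary credits for robustness in clutter.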
Week 5. Object Recognition
- D.P.Huttenlocher and S.Ullman, `Recognizing Solid Objects by Alignment with an Image,' Int'l Journal of Computer Vision, vol. 5(2), 1990, pp. 195-212.
- H.Murase, and S.K.Nayar, `Visual Learning and Recognition of 3-D Objects from Appearance,' Int'l Journal of Computer Vision, vol. 14, 1995, pp. 5-24.
Week 5 Third Papers (selected by the students):
- C.S. Chua and R. Jarvis. `Point Signatures: A New Representation for 3D Object Recognition.' International Journal of Computer Vision, 25(1), 63-85 (1997). -- A point signature is a 1D feature curve that describes the undulation of the 3D object surface local to a point of interest, a collection of which facilitates the recognition of 3D free form objects.
- P. Viola, `Complex Feature Recognition: A Bayesian Approach for Learning to Recognize Objects,' MIT AI Labs Tech Report 1591. -- This paper describes a Bayesian approach for extracting complex object features that are less affected by illumination and pose changes. Since each feature captures a greater area of a scene, the correspondence problem between model and image is reduced as well.
- C.F. Olson and D.P.Huttenlocher, `Automatic Target Recognition by Matching Oriented Edge Pixels,' IEEE Trans. on Image Processing, 6(1):103-113, January 1997. -- The paper defines oriented edge pixels by taking x, y, and delta (which is either the direction of the gradient, edge normal or tangent). A modified Hausdorff measure, which measures the maximum distance and orientation of nearest points, is used as a closeness measure. Only K of the pixels (not all) are matched, to account for occlusion and noise. The 3-D models (and multiple models) are organized hierarchically based on similarity (so if you have two similar models, you create a parent holding the intersection of the models). Recognition is done by computing the Hausdorff distance between the image and the models. Additionally, the probability of a false alarm is computed using a Markov process, for both predicted and observed false alarms.
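The idea of matching only K of the model pixels, mentioned in the Olson and Huttenlocher entry above, can be illustrated with a directed partial Hausdorff distance on plain 2-D points. This is a simplified sketch: the orientation term of the paper's modified measure is omitted, the brute-force nearest-neighbour search is for clarity only, and the function name `partial_hausdorff` is an assumption.

```python
import math

def partial_hausdorff(model, image, k):
    """K-th smallest distance from a model point to its nearest image point.

    Illustrative sketch: points are (x, y) tuples and edge orientation
    is ignored.
    """
    if not 1 <= k <= len(model):
        raise ValueError("k must be between 1 and len(model)")
    # Distance from every model point to its closest image point.
    nearest = sorted(min(math.dist(m, p) for p in image) for m in model)
    # Ranking instead of taking the maximum lets the worst-matching
    # len(model) - k points (occluded or noisy) be ignored.
    return nearest[k - 1]

if __name__ == "__main__":
    model = [(0, 0), (1, 0), (2, 0), (50, 50)]   # last point acts as an outlier
    image = [(0, 0), (1, 0), (2, 1)]
    print(partial_hausdorff(model, image, 3))    # 1.0: outlier ignored
    print(partial_hausdorff(model, image, 4))    # large: outlier dominates
```

With `k` equal to the number of model points this reduces to the ordinary directed Hausdorff distance, which a single outlier can dominate. Lowering `k` trades strictness for tolerance to occlusion and noise.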
Week 6. Volumetric Registration
- R.Bajcsy and S.Kovacic, `Multiresolution Elastic Matching,' Computer Vision, Graphics and Image Processing, Vol 46, 1989, pp.1-21.
- P.A.Viola and W.Wells, `Alignment by Maximization of Mutual Information,' International Journal of Computer Vision, Vol.24(2), September 1997, pp. 137-154.
Week 7. Projective Geometry
- J.B.Burns, R.S.Weiss and E.M.Riseman, `The Non-Existence of General-Case View-Invariants,' Geometrical Invariance in Computer Vision, ed. J. Mundy and A.Zisserman, MIT Press, Cambridge, 1992, pp.120-131.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.C.Longuet-Higgins, `A Computer Algorithm for Reconstructing a Scene from Two Projections,' Nature, vol 293, 1981, pp. 133-135.
- R.Hartley, `In Defense of the 8-point Algorithm,' IEEE Trans on Pattern Analysis and Machine Intelligence, 19(6), June 1997, pp. 580-593.
Week 8. Symmetry and Perception
- F.Ulupinar and R.Nevatia, `Constraints for Interpretation of Line Drawings under Perspective Projection,' CVGIP: Image Understanding, Vol. 53(1), 1991, pp.88-96.
The following two papers will be treated as one, for the purposes of critiquing/presenting:
- H.Zabrodsky, S.Peleg and D.Avnir, `Symmetry as a Continuous Feature,' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol 17(12), 1995, pp.1154-1165.
- K.Kanatani, `Comments on "Symmetry as a Continuous Feature",' IEEE Transactions on Pattern Analysis and Machine Intelligence, vol 19(3), 1997, pp. 246-247.
Week 9. Stabilization And Mosaicing
- J.Bergen et al., `Hierarchical Model-Based Motion Estimation,' in Proceedings of European Conference on Computer Vision, 1992, pp. 237-252.
- H.Shum and R.Szeliski, `Construction and Refinement of Panoramic Mosaics with Global and Local Alignment,' International Conference on Computer Vision, Bombay, India, Jan.1998, pp. 953-958.
Week 10. Egomotion and Structure from Motion
- C.Tomasi and T.Kanade, `Shape and Motion from Image Streams under Orthography: a Factorization Method,' Int'l Journal of Computer Vision, Vol. 9(2), 1992, pp. 137-154.
- J.L.Barron, D.J.Fleet, and S.S.Beauchemin, `Performance of Optical Flow Techniques,' Int'l Journal of Computer Vision, vol. 12, no. 1, Jan. 1994, pp. 43-77.
Week 11. New View Synthesis
- L.McMillan and G.Bishop, `Plenoptic Modeling: An Image-Based Rendering System,' Proc. SIGGRAPH, 1995, pp.39-46.
- S.Gortler, R.Grzeszczuk, R.Szeliski and M.Cohen, `The Lumigraph,' Proc. SIGGRAPH, 1996, pp.43-54.
Week 12. Range Imaging
- A.Johnson and M.Hebert, `Surface Matching for Object Recognition in Complex Three-Dimensional Scenes,' Image and Vision Computing, Vol.16, 1998, pp.635-651.
- P.Besl and N.McKay, `A Method for Registration of 3-D Shapes,' IEEE Trans on Pattern Analysis and Machine Intelligence (PAMI), Vol. 14(2), 1992, pp.239-256.
Week 13. Auditory Sensing
- To be announced