CIS 639
Statistical Approaches to Natural Language Processing
Spring 2002
Syllabus and Overview
Time: Tues-Thurs 1:30-3:00pm
Place: Moore 224
Instructor: Mitch Marcus
Office: Moore 461a
Phone: 215-898-2538
Prerequisites: CIS 530 - Computational Linguistics
This course will extend the introduction to Statistical NLP begun as part of CIS 530. It is intended to give participants sufficient background to read and understand the current research literature independently and to carry out intermediate-level research projects in Statistical NLP.
This year the course will focus on standard and recent statistical methods applied to three problems in grammatical processing: part-of-speech tagging, NP chunking, and grammatical parsing. Methods investigated will include Hidden Markov Models, Maximum Entropy, probabilistic CFGs and other generative statistical models, Support Vector Machines, Memory-Based Learning, and voting methods. (Brill learning will be reviewed.)
The class will interleave three modes:
- Lectures on the contents of Section III of Manning & Schütze, Foundations of Statistical Natural Language Processing.
- Student-led discussions of recent papers on NP chunking, drawn from the group of papers at Erik Tjong Kim Sang's web site on NP Chunking, and of the methods behind them, including:
  - A Tutorial on Support Vector Machines for Pattern Recognition (Burges), and
  - IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms (Daelemans, van den Bosch, Weijters).
- Group discussion of the details of the maximum entropy and generative probabilistic models for statistical NLP presented in Michael Collins's Ph.D. dissertation and Adwait Ratnaparkhi's Ph.D. dissertation.
Required work will include leading a discussion of selected papers, a final paper or course project, and two or three exercises during the semester.
(The syllabus for CIS 639 for Spring 2000 can be found here.)