CSEP 517 - Natural Language Processing
Instructor: Yejin Choi (yejin at cs dot washington dot edu) Office hours: TBD at CSE 578 (and by appointment) | TA: Ignacio Cano (icano at cs dot washington dot edu) Office hours: TBD at CSE 370 (and by appointment) TA: James Ferguson (jfferg at cs dot washington dot edu) Office hours: TBD at CSE 410-5 (and by appointment) |
---|
Schedule (subject to change)
Week | Dates | Topics & Lecture Slides | Notes (Required) | Textbook | Supplementary Readings |
---|---|---|---|---|---|
1 | Oct 6 | Introduction [Slides]; Language Models (LM) [Slides] | LM Notes | J&M 4.1-4; M&S 6 | [Large LMs] [Berkeley LM] |
2 | Oct 13 | Sequences: Language Models and Smoothing; Hidden Markov Models (HMMs) [Slides] | HMM Notes | J&M 4.5-7; M&S 6 | [Smoothing] |
3 | Oct 20 | Hidden Markov Models (HMMs) [Slides] & Part-Of-Speech Tagging [Slides] | | J&M 5.1-5.3; 6.1-6.4; M&S 9, 10.1-10.3 | [TnT Tagger] [Stanford Tagger] [SOTA POS] |
4 | Oct 27 | Trees: Probabilistic Context Free Grammars (PCFG) and Parsing [Slides] | PCFG Notes, Lexicalized PCFGs | J&M 13-14; M&S 11-12 | [Syntax Intro] [Incremental] [Best First] [A* Parsing] [Lexicalized] [Unlexicalized] [Split Merge] |
5 | Nov 3 | More Parsing [Slides]; Expectation Maximization (EM) [Slides] | EM Notes, Forward-backward, Inside-outside | J&M 6.5; M&S 9.3-4; 11.3-4 | [Semi-supervised Naive Bayes] [EM Tutorial] [EM for Feature-Rich] |
6 | Nov 10 | Machine Translation (MT): Word Alignment [Slides] | IBM Models 1 and 2 | J&M 25.1-6; M&S 13 | [IBM Models] [HMM Model] [MERT Training] |
7 | Nov 17 | Log-Linear / Feature-Rich Models: Conditional Random Fields (CRFs) [Slides-nov17] | Log-linear models, CRF Notes | J&M 6.6-6.8; M&S 16.2-16.3 | [MaxExt] [CRF Tutorial] [CRF LM] [CRF Parsing] |
8 | Nov 24 | More Machine Translation (MT): Phrase-based MT [Slides]; Syntax-based MT [Slides I] [II] | Phrase-based Notes | J&M 25.6-10; M&S 13 | [SCFG Tutorial] [Hiero] [Tree-to-String] [Tree-to-Tree] |
9 | Dec 1 | Knowledge & Semantic Relations: Information Extraction; Entailment [Slides] | | J&M 22 | [Entailment Graphs] [Paraphrasing w/ MT] [Paraphrasing and Entailment] |
10 | Dec 8 | Neural Models | | | |
Textbooks
- Recommended: D. Jurafsky & James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, Second Edition, 2009. (J&M)
- Optional: C.D. Manning & H. Schuetze, Foundations of Statistical Natural Language Processing, Cambridge: MIT Press, 1999 (available online, free if accessed from UW computers) (M&S)
Contact
- Please feel free to email the course staff (addresses above) and come to office hours. If you need to meet outside of the scheduled hours, let us know and we will do our best to accommodate.
- We also have a GoPost discussion board. Please consider posting your questions there; everyone will benefit. We also encourage you to try to answer questions, which will count as class participation. We will monitor the board daily and contribute as long as it is being used.
- Grades: Assignment grades are posted in the online CSEP 517 Gradebook. Please let us know if you see any errors.
Homeworks
We will have 4 programming-based homework assignments (80% of grade). Data, code, and instructions are linked on Dropbox.
- Assignment 1: Language Models (Due Oct 22nd Thu 11pm)
- Assignment 2: HMMs (Due Nov 5th Thu 11pm)
- Assignment 3: PCFGs (Due Nov 23rd Mon 11pm)
- Assignment 4: MT (Due Dec 13 Sun 11pm)
Please submit all your assignments to the online DropBox.
Final Mini-project
Students may replace one homework project with a final mini-project (20% of grade). While students must work individually for the homework projects, students are encouraged to work as a group for the final mini-project.
- Final report (Due Dec 13 Sun, 11pm)
Grading
The final grade will consist of programming-based homeworks (80%), non-programming assignments (10%) and course/discussion board participation (10%). No midterm or final exam.
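The weighting above amounts to a simple weighted sum. The following is a minimal sketch only: the function name `final_grade` and its parameters are hypothetical, scores are assumed to be on a 0-100 scale, and the four homeworks are assumed to be weighted equally (a mini-project score would take the place of one homework score).

```python
# Weights in percent, as stated in the Grading section.
HOMEWORK_WEIGHT = 80
NON_PROGRAMMING_WEIGHT = 10
PARTICIPATION_WEIGHT = 10


def final_grade(homework_scores, non_programming, participation):
    """Combine component scores (each 0-100) into a final percentage.

    Assumption: all entries in homework_scores count equally.
    """
    homework_average = sum(homework_scores) / len(homework_scores)
    weighted = (
        HOMEWORK_WEIGHT * homework_average
        + NON_PROGRAMMING_WEIGHT * non_programming
        + PARTICIPATION_WEIGHT * participation
    )
    return weighted / 100


# Example: four homework scores, then the two 10% components.
print(final_grade([90, 80, 100, 70], 100, 50))
```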
Course Administration and Policies
- Assignments must be done individually unless otherwise specified. You may discuss the subject matter with other students in the class, but all final answers must be your own work. You are expected to maintain the utmost level of academic integrity in the course.
- Each assignment may be handed in up to three days late, at a penalty of 20% of the maximum grade per day. You have 5 penalty-free late day credits that you can use at any time during the quarter. The 20% deduction above will apply only after you have used all your late day credits. Being late by a partial day (e.g., 1 hour) will be rounded up to 1 full day. This late day policy does not apply to the final project submission due to the tight grading schedule at the end of the quarter.
- Comments can be sent to the instructor or TA using this anonymous feedback form.
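As a rough illustration of the late-day arithmetic in the policy above, here is a minimal Python sketch. The function name `late_penalty` and its arguments are hypothetical. It assumes that credits are spent before any deduction applies and that the three-day limit holds even when credits are used; the course staff's actual bookkeeping may differ. The final project, to which the policy does not apply, is not modeled.

```python
import math

MAX_LATE_DAYS = 3      # an assignment may be handed in up to three days late
PENALTY_PER_DAY = 20   # percent of the maximum grade deducted per late day
INITIAL_CREDITS = 5    # penalty-free late day credits for the quarter


def late_penalty(hours_late, credits_left):
    """Return (penalty_percent, credits_remaining) for one assignment.

    Assumption: credits are applied first, one per late day, and only
    the late days not covered by credits are penalized.
    """
    if hours_late <= 0:
        return 0, credits_left
    # A partial day (e.g., 1 hour) is rounded up to one full day.
    days_late = math.ceil(hours_late / 24)
    if days_late > MAX_LATE_DAYS:
        raise ValueError("submission is more than three days late")
    credits_used = min(days_late, credits_left)
    penalized_days = days_late - credits_used
    return PENALTY_PER_DAY * penalized_days, credits_left - credits_used


# One hour late with all credits available: no deduction, one credit used.
print(late_penalty(1, INITIAL_CREDITS))
```

Under these assumptions, a submission 25 hours late with one credit left counts as two late days: one is covered by the credit and the other costs 20%.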
Department of Computer Science & Engineering University of Washington Box 352350 Seattle, WA 98195-2350 (206) 543-1695 voice, (206) 543-2969 FAX |