Stanford CS 224N | Natural Language Processing with Deep Learning

Natural language processing (NLP) is a crucial part of artificial intelligence (AI), modeling how people share information. In recent years, deep learning approaches have obtained very high performance on many NLP tasks. In this course, students gain a thorough introduction to cutting-edge neural networks for NLP.

Content

What is this course about?

Natural language processing (NLP), or computational linguistics, is one of the most important technologies of the information age. Applications of NLP are everywhere because people communicate almost everything in language: web search, advertising, emails, customer service, language translation, virtual agents, medical reports, politics, etc. In the 2010s, deep learning (or neural network) approaches obtained very high performance across many different NLP tasks, using single end-to-end neural models that did not require traditional, task-specific feature engineering. In the 2020s, further amazing progress was made through the scaling of Large Language Models, such as ChatGPT. In this course, students will gain a thorough introduction to both the basics of Deep Learning for NLP and the latest cutting-edge research on Large Language Models (LLMs). Through lectures, assignments, and a final project, students will learn the necessary skills to design, implement, and understand their own neural network models, using the PyTorch framework.
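To give a flavor of the implementation skills involved, here is a minimal, hypothetical PyTorch sketch (illustrative only, not course material; all names are ours) of the kind of model students learn to build: a bag-of-words text classifier trained with backpropagation.

```python
# Minimal, hypothetical sketch of a PyTorch NLP model (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BagOfWordsClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int, num_classes: int):
        super().__init__()
        # EmbeddingBag averages the word vectors of each input sentence.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
        pooled = self.embedding(token_ids, offsets)  # (batch, embed_dim)
        return self.classifier(pooled)               # unnormalized class scores

# Toy usage: two "sentences" packed into one flat tensor of word indices.
model = BagOfWordsClassifier(vocab_size=1000, embed_dim=64, num_classes=2)
tokens = torch.tensor([1, 5, 9, 2, 7])  # sentence 1 = first 3 ids, sentence 2 = last 2
offsets = torch.tensor([0, 3])          # where each sentence starts
loss = F.cross_entropy(model(tokens, offsets), torch.tensor([0, 1]))
loss.backward()                         # gradients computed by backpropagation
```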

“Take it. CS221 taught me algorithms. CS229 taught me math. CS224N taught me how to write machine learning models.” – A CS224N student on Carta

Previous offerings

Below you can find archived websites and student project reports from previous years. Disclaimer: assignments change from year to year; please do not do assignments from previous years!

Prerequisites

Reference Texts

The following texts are useful, but none are required. All of them can be read for free online.

If you have no background in neural networks but would like to take the course anyway, you might well find one of these books helpful to give you more background:

Coursework

Disclaimer: Coursework is tentative and subject to change!

Assignments (48%)

There are four weekly assignments, which will improve both your theoretical understanding and your practical skills. All assignments contain both written questions and programming parts. In office hours, TAs may look at students’ code for assignments 1 and 2, but not for assignments 3 and 4.

Final Project (49%)

The Final Project offers you the chance to apply your newly acquired skills towards an in-depth application. Students have two options: the Default Final Project (in which students tackle a predefined task, namely implementing a minimalist version of GPT-2) or a Custom Final Project (in which students choose their own project involving human language and deep learning). Examples of both can be seen on last year's website. Note: TAs may not look at students' code for either the default or custom final projects.
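For a sense of the default project's scope: a minimalist GPT-2 is essentially a stack of Transformer decoder blocks over a token embedding table. The sketch below is a hedged illustration of one such block with causal masking in PyTorch, under our own naming; it is not the actual project scaffold.

```python
# Illustrative sketch of one GPT-2-style decoder block (not the project code).
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """Masked self-attention + MLP, each in a pre-LayerNorm residual branch."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: True marks positions a token may NOT attend to (its future).
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                     device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out               # residual connection around attention
        x = x + self.mlp(self.ln2(x))  # residual connection around the MLP
        return x

# A full model stacks roughly a dozen such blocks between an embedding
# layer and an output projection over the vocabulary.
x = torch.randn(2, 16, 128)            # (batch, sequence, d_model)
out = DecoderBlock(d_model=128, n_heads=4)(x)
```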

Important information

Practicalities

Participation (3%)

We appreciate everyone being actively involved in the class! There are several ways of earning participation credit, which is capped at 3%:

Late Days

Regrade Requests

If you feel you deserved a better grade on an assignment, you may submit a regrade request on Gradescope within 3 days after the grades are released. Your request should briefly summarize why you feel the original grade was unfair. Your TA will reevaluate your assignment as soon as possible, and then issue a decision. If you are still not happy, you can ask for your assignment to be regraded by an instructor.

Credit/No credit enrollment

If you take the class credit/no credit then you are graded in the same way as those registered for a letter grade. The only difference is that, provided your work reaches a C- standard, it will simply be graded as CR.

All students welcome

We are committed to doing what we can to work for equity and to create an inclusive learning environment that actively values the diversity of backgrounds, identities, and experiences of everyone in CS224N. We also know that we will sometimes make missteps. If you notice some way that we could do better, we hope that you will let someone in the course staff know about it.

Well-Being and Mental Health

If you are experiencing personal, academic, or relationship problems and would like to talk to someone with training and experience, reach out to Counseling and Psychological Services (CAPS) on campus. CAPS is the university's counseling center dedicated to student mental health and wellbeing. Phone assessment appointments can be made at CAPS by calling 650-723-3785, or by accessing the Vaden Patient portal through the Vaden website.

Auditing the course

In general we are happy to have auditors if they are members of the Stanford community (registered student, official visitor, staff, or faculty). If you are interested, email us at cs224n-win2425-staff@lists.stanford.edu. If you want to master the material of the class, we very strongly recommend doing all the assignments. However, due to high enrollment, we cannot grade the work of any students who are not officially enrolled in the class.

Students with Documented Disabilities

We assume that all of us learn in different ways, and that the organization of the course must accommodate each student differently. We are committed to ensuring the full participation of all enrolled students in this class. If you need an academic accommodation based on a disability, you should initiate the request with the Office of Accessible Education (OAE). The OAE will evaluate the request, recommend accommodations, and prepare a letter for faculty. Students should contact the OAE as soon as possible, and in any case well in advance of assignment deadlines, since timely notice is needed to coordinate accommodations. Students should also send their accommodation letter to the staff mailing list (cs224n-win2425-staff@lists.stanford.edu) or make a private post on Ed, as soon as possible.

OAE accommodations for group projects: OAE accommodations will not be extended to collaborative assignments.

AI Tools Policy

Students are required to independently submit their solutions for CS224N homework assignments. Collaboration with generative AI tools such as Copilot and ChatGPT is allowed; students may treat them as collaborators in the problem-solving process. However, directly soliciting answers or copying solutions, whether from peers or external sources, is strictly prohibited.

Employing AI tools to substantially complete assignments or exams will be considered a violation of the Honor Code. For additional details, please refer to the Generative AI Policy Guidance here.

Sexual violence

Academic accommodations are available for students who have experienced or are recovering from sexual violence. If you would like to talk to a confidential resource, you can schedule a meeting with the Confidential Support Team or call their 24/7 hotline at: 650-725-9955. Counseling and Psychological Services also offers confidential counseling services. Non-confidential resources include the Title IX Office, for investigation and accommodations, and the SARA Office, for healing programs. Students can also speak directly with the teaching staff to arrange accommodations. Note that university employees – including professors and TAs – are required to report what they know about incidents of sexual or relationship violence, stalking and sexual harassment to the Title IX Office. Students can learn more at https://vaden.stanford.edu/sexual-assault.

Schedule

Updated lecture slides will be posted here shortly before each lecture. Other links contain last year's slides, which are mostly similar.

Lecture notes will be uploaded a few days after most lectures. The notes (which cover approximately the first half of the course content) give supplementary detail beyond the lectures.

Disclaimer: Schedule is tentative and subject to change!
Disclaimer: Assignments change; please do not do old assignments. We will give no points for doing last year's assignments.

Date | Description | Course Materials | Events | Deadlines
Week 1, Tue Jan 7 | Word Vectors [slides] [notes] | Suggested Readings: Efficient Estimation of Word Representations in Vector Space (original word2vec paper); Distributed Representations of Words and Phrases and their Compositionality (negative sampling paper) | Assignment 1 out [code]
Thu Jan 9 | Word Vectors and Language Models [slides] [notes] [code] | Suggested Readings: GloVe: Global Vectors for Word Representation (original GloVe paper); Improving Distributional Similarity with Lessons Learned from Word Embeddings; Evaluation methods for unsupervised word embeddings. Additional Readings: A Latent Variable Model Approach to PMI-based Word Embeddings; Linear Algebraic Structure of Word Senses, with Applications to Polysemy; On the Dimensionality of Word Embedding
Fri Jan 10 | Python Review Session [[slides](slides%5Fw25/2024%20CS224N%20Python%20Review%20Session%20Slides.pptx.pdf)] [colab] | Time: 1:30pm-2:20pm, Location: Gates B01
Week 2, Tue Jan 14 | Backpropagation and Neural Network Basics [slides] [notes] | Suggested Readings: matrix calculus notes; Review of differential calculus; CS231n notes on network architectures; CS231n notes on backprop; Derivatives, Backpropagation, and Vectorization; Learning Representations by Backpropagating Errors (seminal Rumelhart et al. backpropagation paper). Additional Readings: Yes you should understand backprop; Natural Language Processing (Almost) from Scratch | Assignment 2 out [code] [handout] [latex template] | Assignment 1 due
Thu Jan 16 | Dependency Parsing [slides] [notes] | Suggested Readings: Incrementality in Deterministic Dependency Parsing; A Fast and Accurate Dependency Parser using Neural Networks; Dependency Parsing; Globally Normalized Transition-Based Neural Networks; Universal Stanford Dependencies: A cross-linguistic typology; Universal Dependencies website; Jurafsky & Martin Chapter 19
Fri Jan 17 | PyTorch Tutorial Session [colab] | Time: 1:30pm-2:20pm, Location: Gates B01
Week 3, Tue Jan 21 | Basic Sequence Models to RNNs [slides] [notes (lectures 5 and 6)] | Suggested Readings: N-gram Language Models (textbook chapter); The Unreasonable Effectiveness of Recurrent Neural Networks (blog post overview); Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.1 and 10.2); On Chomsky and the Two Cultures of Statistical Learning
Thu Jan 23 | Advanced Variants of RNNs, Attention [slides] [notes (lectures 5 and 6)] | Suggested Readings: Learning long-term dependencies with gradient descent is difficult (one of the original vanishing gradient papers); On the difficulty of training Recurrent Neural Networks (proof of vanishing gradient problem); Vanishing Gradients Jupyter Notebook (demo for feedforward networks); Attention Is All You Need | Assignment 3 out [code] [handout] [latex template] | Assignment 2 due
Week 4, Tue Jan 28 | Final Projects: Custom and Default; Practical Tips [slides] | Suggested Readings: Practical Methodology (Deep Learning book chapter)
Thu Jan 30 | Transformers [slides] [Custom project tips] [notes] | Suggested Readings: Attention Is All You Need; The Illustrated Transformer; Transformer (Google AI blog post); Layer Normalization; Image Transformer; Music Transformer: Generating music with long-term structure; Jurafsky and Martin Chapter 9 (The Transformer) | Project Proposal out [handout]; Default Final Project out [handout]
Week 5, Tue Feb 4 | Pretraining [slides] | Suggested Readings: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Contextual Word Representations: A Contextual Introduction; The Illustrated BERT, ELMo, and co.; Jurafsky and Martin Chapter 11 (Masked Language Models) | Assignment 4 out [code] [handout] [overleaf] [colab run script] | Assignment 3 due
Thu Feb 6 | Post-training (RLHF, SFT, DPO) [slides] | Suggested Readings: Aligning language models to follow instructions; Scaling Instruction-Finetuned Language Models; AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback; How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources; Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Fri Feb 7 | Hugging Face Transformers Tutorial Session [colab] | Time: 1:30pm-2:20pm, Location: Gates B01
Week 6, Tue Feb 11 | Efficient Adaptation (Prompting + PEFT) [slides] | Suggested Readings: Language Models are Few-Shot Learners; Chain-of-Thought Prompting Elicits Reasoning in Large Language Models; The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks; LoRA: Low-Rank Adaptation of Large Language Models; Parameter-Efficient Transfer Learning for NLP | Project Proposal due
Thu Feb 13 | Benchmarking and Evaluation [slides] | Suggested Readings: Challenges and Opportunities in NLP Benchmarking; Measuring Massive Multitask Language Understanding; Holistic Evaluation of Language Models; AlpacaEval | Project Milestone out [handout] | Assignment 4 due
Week 7, Tue Feb 18 | Question Answering and Knowledge [slides] | Suggested Readings: SQuAD: 100,000+ Questions for Machine Comprehension of Text; Dense Passage Retrieval for Open-Domain Question Answering; Bidirectional Attention Flow for Machine Comprehension; Reading Wikipedia to Answer Open-Domain Questions; REALM: Retrieval-Augmented Language Model Pre-Training; Lost in the Middle: How Language Models Use Long Contexts | Final Project Proposals Returned
Thu Feb 20 | Guest Lecture (Alexander Rush)
Week 8, Tue Feb 25 | Guest Lecture: A Retrieval-based LM at Scale (Sewon Min) | Suggested Readings: Dense Passage Retrieval for Open-Domain Question Answering; In-Context Retrieval-Augmented Language Models; Scaling Retrieval-Based Language Models with a Trillion-Token Datastore; SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore | Final Project Milestone due
Thu Feb 27 | Guest Lecture: The Second Half: Model Evaluation and Benchmarking (Shunyu Yao) | Suggested Readings: WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents; Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena; SWE-bench: Can Language Models Resolve Real-World GitHub Issues?; Tau-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains | Final Project Report Instructions out [Instructions]
Fri Feb 28 | Course Withdrawal Deadline
Week 9, Tue Mar 4 | Guest Lecture: Model Analysis and Interpretability (Jing Huang) [slides] | Suggested Readings: Adversarial Examples for Evaluating Reading Comprehension Systems; BERT Rediscovers the Classical NLP Pipeline; Axiomatic Attribution for Deep Networks; Investigating Gender Bias in Language Models Using Causal Mediation Analysis; Faithful, Interpretable Model Explanations via Causal Abstraction | Final Project Milestones Returned
Thu Mar 6 | Guest Lecture: Reasoning (Noam Brown) | Suggested Readings: Learning to Reason with LLMs; OpenAI o1 System Card; DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Week 10, Tue Mar 11 | Open Questions in NLP 2025
Thu Mar 13 | Final Project Emergency Assistance (No Lecture) | Final project due
Finals Week | Final Project Poster Session | Time: Mar 18th, 12:15pm-3:15pm, Location: AOERC. On-campus students must attend in person! [Printing guide]