CS 330 Deep Multi-Task and Meta Learning
Fall 2021, Class: Mon, Wed 11:30am-1:00pm, NVIDIA Auditorium
Description:
While deep learning has achieved remarkable success in supervised and reinforcement learning problems, such as image classification, speech recognition, and game playing, these models are, to a large degree, specialized for the single task they are trained for. This course will cover the setting where there are multiple tasks to be solved, and study how the structure arising from multiple tasks can be leveraged to learn more efficiently or effectively. This includes:
- goal-conditioned reinforcement learning techniques that leverage the structure of the provided goal space to learn many tasks significantly faster
- meta-learning methods that aim to learn efficient learning algorithms that can learn new tasks quickly
- curriculum and lifelong learning, where the problem requires learning a sequence of tasks, leveraging their shared structure to enable knowledge transfer
This is a graduate-level course. By the end of the course, students will be able to understand and implement state-of-the-art multi-task learning and meta-learning algorithms and will be ready to conduct research on these topics.
Format:
The course will include in-person lectures (also livestreamed and recorded over zoom), three graded homework assignments, two optional homework assignments, and a course project. The lectures will discuss the fundamentals of topics required for understanding and designing multi-task and meta-learning algorithms in both supervised learning and reinforcement learning domains. The assignments will focus on coding problems that emphasize these fundamentals. Finally, students will present their projects at the poster session at the end of the quarter.
Prerequisites:
CS 229 or an equivalent introductory machine learning course is required. CS 221 or an equivalent introductory artificial intelligence course is recommended but not required.
Lecture Videos:
If you are looking for publicly available lecture videos from the Fall 2019 offering, they are here. Materials from the Fall 2019 offering are here, and materials from the Fall 2020 offering are here. Lecture videos from this Fall 2021 offering will be processed and made publicly available after the course. For students enrolled in the course, recorded lecture videos will be posted to Canvas after each lecture.
Staff
Name | Role | Office Hours |
---|---|---|
Prof. Chelsea Finn (Webpage) | Instructor | Wed 1-2:30pm |
Dr. Karol Hausman (Webpage) | Instructor | |
Amelie Byun | Course Coordinator | |
Rafael Rafailov | Head Teaching Assistant | Thur 5:00-7:00pm |
Dilip Arumugam | Teaching Assistant | Fri 12:30-2:30pm |
Archit Sharma | Teaching Assistant | Thur 1:00-3:00pm |
Evan Liu | Teaching Assistant | Tue 3:00-5:00pm |
Kevin Yu | Teaching Assistant | Sun 3:00-5:00pm |
Kyle Hsu | Teaching Assistant | Tue 11:00am-1:00pm |
Siyi Tang | Teaching Assistant | Mon 1:00-3:00pm |
Kamil Ali | Teaching Assistant | Sun 1:00-3:00pm |
Edwin Pan | Teaching Assistant | Sat 5:00-7:00pm |
Timeline
Date | Lecture | Deadlines | Optional reading |
---|---|---|---|
Week 1 Mon, Sep 20 | Lecture Course introduction & start of multi-task learning (Chelsea Finn & Karol Hausman) | Optional Homework 0 out [PDF][Zip] | |
Week 1 Wed, Sep 22 | Lecture Supervised multi-task learning, transfer learning (Chelsea Finn) | | P1: Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Kendall et al. (2018) P2: Universal Language Model Fine-tuning for Text Classification. Howard et al. (2018) |
Week 1 Thur, Sep 23 | TA Session PyTorch tutorial | | Tutorial is at 6pm over zoom and will be recorded. |
Week 2 Mon, Sep 27 | Lecture Meta-learning problem statement, black-box meta-learning (Chelsea Finn) | Due Optional Homework 0; Homework 1 out [PDF][Zip] | P1: One-shot Learning with Memory-Augmented Neural Networks. Santoro et al. (2016) |
Week 2 Wed, Sep 29 | Lecture Optimization-based meta-learning (Chelsea Finn) | | P1: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Finn et al. (2017) P2: Meta-Learning with Differentiable Convex Optimization. Lee et al. (2019) |
Week 3 Mon, Oct 4 | Lecture Few-shot learning via metric learning (Chelsea Finn) | | P1: Matching Networks for One Shot Learning. Vinyals et al. (2017) P2: Prototypical Networks for Few-shot Learning. Snell et al. (2017) |
Week 3 Wed, Oct 6 | Lecture Advanced meta-learning topics (Chelsea Finn) | Due Homework 1; Homework 2 out [PDF][Zip] | P1: Meta-Learning without Memorization. Yin et al. (2020) |
Week 3 Thur, Oct 7 | TA Session [Variational inference tutorial](material/Variational Inference Tutorial.pdf) | | Tutorial is at 4pm over zoom and will be recorded. |
Week 4 Mon, Oct 11 | Lecture Bayesian meta-learning (Chelsea Finn) | | P1: Conditional Neural Processes. Garnelo et al. (2018) P2: Meta-Learning Probabilistic Inference For Prediction. Gordon et al. (2019) |
Week 4 Tue, Oct 12 | TA Session Reinforcement learning tutorial | | Tutorial is at 5pm over zoom and will be recorded. |
Week 4 Wed, Oct 13 | Lecture RL review (Karol Hausman) | Due Project proposal | |
Week 5 Mon, Oct 18 | Lecture Multi-task RL and goal-conditioned RL (Karol Hausman) | Due Homework 2; Homework 3 out [PDF][Zip] | P1: Hindsight Experience Replay. Andrychowicz et al. (2018) P2: MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale. Kalashnikov et al. (2021) |
Week 5 Wed, Oct 20 | Lecture Model-based RL for multi-task learning (Chelsea Finn) | | P1: Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control. Ebert et al. (2018) P2: Deep Dynamics Models for Learning Dexterous Manipulation. Nagabandi et al. (2019) |
Week 6 Mon, Oct 25 | Lecture Meta-RL 1: Adaptable models and policies (Chelsea Finn) | | P1: RL2: Fast Reinforcement Learning via Slow Reinforcement Learning. Duan et al. (2016) |
Week 6 Wed, Oct 27 | Lecture Meta-RL 2: Learning to explore (Chelsea Finn) | Due Homework 3; Optional Homework 4 out [PDF][Zip] | P1: Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables. Rakelly et al. (2019) P2: Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices. Liu et al. (2021) |
Week 7 Mon, Nov 1 | Lecture Offline RL and offline MTRL (Karol Hausman) | | P1: Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. Levine et al. (2020) P2: Conservative Q-Learning for Offline Reinforcement Learning. Kumar et al. (2020) P3: Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills. Chebotar et al. (2021) |
Week 7 Wed, Nov 3 | Lecture Hierarchical RL and skill discovery (Karol Hausman) | | P1: Data-Efficient Hierarchical Reinforcement Learning. Nachum et al. (2018) P2: Diversity is All You Need: Learning Skills without a Reward Function. Eysenbach et al. (2018) P3: Dynamics-Aware Unsupervised Discovery of Skills. Sharma et al. (2019) |
Week 8 Mon, Nov 8 | Guest Lecture What do language models learn from language modeling? (Colin Raffel (UNC, Hugging Face)) | Due Optional Homework 4 | Lecture is from 12-1pm (virtual). |
Week 8 Wed, Nov 10 | Guest Lecture [Learned optimizers: why they're the future, why they're hard, and what they can do now](slides/Stanford CS330 Jascha Sohl-Dickstein guest lecture on Learned optimizers.pdf) (Jascha Sohl-Dickstein (Google)) | Due Project milestone | |
Week 9 Mon, Nov 15 | Lecture Lifelong learning: problem statements, forward & backward transfer (Karol Hausman) | | P1: Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning. Julian et al. (2020) P2: Online Meta-Learning. Finn et al. (2019) Lecture will be over zoom and recorded. |
Week 9 Wed, Nov 17 | Lecture Frontiers and open problems (Karol Hausman) | | Guest speaker slides: Learning to edit pre-trained models. (Eric Mitchell) [BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning.](slides/BC-Z CS330 Lightning Talk.pdf) (Eric Jang) [Language Conditioned Imitation Learning Over Unstructured Data.](slides/Stanford LangLfP talk.pdf) (Corey Lynch) Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation. (Suraj Nair) [Autonomous Reinforcement Learning.](slides/Autonomous RL %28CS330%29.pdf) (Archit Sharma) Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos. (Annie Chen) [Reinforcement Learning as Sequence Modeling.](slides/Stanford Short Talk.pdf) (Igor Mordatch) [Lifelong Robotic Reinforcement Learning by Retaining Experiences.](slides/retaining experiences cs330 slides.pdf) (Annie Xie) Lecture will be over zoom and recorded. |
Week 10 Mon, Nov 22 | Thanksgiving Break | | |
Week 10 Wed, Nov 24 | Thanksgiving Break | | |
Week 11 Mon, Nov 29 | No Class | | |
Week 11 Tue, Nov 30 | Presentations Project poster session | | The poster session will be held in McCaw Hall at the Alumni Center from 11:30am-1:30pm. |
Week 12 Wed, Dec 8 | No Class | Due Final project report | |
Grading and Course Policies
Homeworks (15% each): There are three graded homework assignments (Homeworks 1-3), each worth 15% of the grade. Assignments will require training neural networks in PyTorch. Homework 0 is optional and is worth 2% of extra credit. Homework 4 is optional and will either replace one prior homework grade or part of the project grade, whichever is better for your grade. All assignments are due on Gradescope at 11:59 pm Pacific Time on the respective due date.
Project (55%): The course includes a research-level project of your choice. You may form groups of 1-3 students to complete the project, and you are encouraged to start early! Detailed guidelines on the project can be found here.
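To make the weighting above concrete, here is a minimal sketch of how the stated percentages could combine. The function name, the score scale, and the assumption that optional Homework 4 stands in for a 15% slice of either a homework or the project are illustrative choices, not the official grading formula.

```python
# Illustrative sketch only -- not the official grading formula.
# Weights from the policy above: three homeworks at 15% each, project at 55%,
# optional Homework 0 worth up to 2% extra credit, and optional Homework 4
# able to replace one prior homework grade or part of the project grade,
# whichever helps more. The size of the "part of the project" slice is not
# specified in the policy; a 15% slice is assumed here for illustration.
# All scores are fractions in [0, 1]; None means the optional work was skipped.

def course_grade(hw1, hw2, hw3, project, hw0=None, hw4=None):
    slice_weight = 0.15
    total = slice_weight * (hw1 + hw2 + hw3) + 0.55 * project

    if hw4 is not None:
        # Swap HW4 in for the lowest-scoring 15% slice if that raises the total.
        lowest = min(hw1, hw2, hw3, project)
        if hw4 > lowest:
            total += slice_weight * (hw4 - lowest)

    if hw0 is not None:
        total += 0.02 * hw0  # up to 2% extra credit

    return total

# Example: a weak Homework 2 gets patched by a strong optional Homework 4.
print(course_grade(hw1=0.90, hw2=0.60, hw3=0.85, project=0.92, hw0=1.0, hw4=0.95))
```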
Late Days: You have 6 total late days across homeworks and project proposal submissions. You may use a maximum of 2 late days for any single assignment. Late days are not allowed for the final project report.
Lecture Attendance: While we do not require lecture attendance, students are encouraged to join the live lecture. To accommodate various circumstances, we will be live-streaming the in-person lecture via a zoom link on Canvas. Students joining the zoom live stream will be able to ask questions. For those who cannot join the live lectures, lecture recordings will also be available on Canvas shortly after the lecture.
Honor Code: Students are free to form study groups and may discuss homework in groups. However, each student must write down the solutions and code from scratch independently, without referring to any written notes from the joint session. When debugging code together, you are only allowed to look at the input-output behavior of each other's programs and not the code itself. In other words, each student must understand the solution well enough to reconstruct it independently. It is an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as on a public git repo. For more details about the honor code, see The Stanford Honor Code and The Stanford Honor Code Pertaining to CS Courses.
Health and Safety Expectation (COVID-19 Protocols)
Following Stanford’s policies, everyone is required to wear a mask indoors, regardless of vaccination status. This includes any in-person lectures or office hour sessions. Some community members may have preferences that go beyond the requirements; it is important that we treat each other's preferences with respect and care. You can find the most current policies on campus masking requirements on the COVID-19 Health Alerts site.
Academic Accommodations
If you need an academic accommodation based on a disability, please register with the Office of Accessible Education (OAE). Professional staff will evaluate your needs, support appropriate and reasonable accommodations, and prepare an Academic Accommodation Letter for faculty. To get started, or to re-initiate services, please visit oae.stanford.edu. If you already have an Academic Accommodation Letter, please send your letter to cs330-aut2122-staff@lists.stanford.edu. OAE Letters should be sent to us at the earliest possible opportunity so that the course staff can partner with you and OAE to make the appropriate accommodations.
Note on Financial Aid
All students should retain receipts for books and other course-related expenses, as these may be qualified educational expenses for tax purposes. If you are an undergraduate receiving financial aid, you may be eligible for additional financial aid for required books and course materials if these expenses exceed the aid amount in your award letter. For more information, review your award letter or visit the Student Budget website.