CS 330 Deep Multi-Task and Meta Learning
Fall 2021, Class: Mon, Wed 11:30am-1:00pm, NVIDIA Auditorium
Description:
While deep learning has achieved remarkable success in supervised and reinforcement learning problems, such as image classification, speech recognition, and game playing, these models are, to a large degree, specialized for the single task they are trained for. This course will cover the setting where there are multiple tasks to be solved, and study how the structure arising from multiple tasks can be leveraged to learn more efficiently or effectively. This includes:
- goal-conditioned reinforcement learning techniques that leverage the structure of the provided goal space to learn many tasks significantly faster
- meta-learning methods that aim to learn efficient learning algorithms that can learn new tasks quickly
- curriculum and lifelong learning, where the problem requires learning a sequence of tasks, leveraging their shared structure to enable knowledge transfer
This is a graduate-level course. By the end of the course, students will be able to understand and implement state-of-the-art multi-task learning and meta-learning algorithms and will be ready to conduct research on these topics.
Format:
The course will include in-person lectures (also livestreamed and recorded over zoom), three graded homework assignments, two optional homework assignments, and a course project. The lectures will discuss the fundamentals of topics required for understanding and designing multi-task and meta-learning algorithms in both supervised learning and reinforcement learning domains. The assignments will focus on coding problems that emphasize these fundamentals. Finally, students will present their projects at the poster session at the end of the quarter.
Prerequisites:
CS 229 or an equivalent introductory machine learning course is required. CS 221 or an equivalent introductory artificial intelligence course is recommended but not required.
Lecture Videos:
If you are looking for publicly available lecture videos from the Fall 2019 offering, they are here. Materials from the Fall 2019 offering are here, and materials from the Fall 2020 offering are here. Lecture videos from this Fall 2021 offering will be processed and made publicly available after the course. For students enrolled in the course, recorded lecture videos will be posted to Canvas after each lecture.
Staff
Name | Role | Office Hours |
---|---|---|
Prof. Chelsea Finn (Webpage) | Instructor | Wed 1-2:30pm |
Dr. Karol Hausman (Webpage) | Instructor | |
Amelie Byun | Course Coordinator | |
Rafael Rafailov | Head Teaching Assistant | Thur 5:00-7:00pm |
Dilip Arumugam | Teaching Assistant | Fri 12:30-2:30pm |
Archit Sharma | Teaching Assistant | Thur 1:00-3:00pm |
Evan Liu | Teaching Assistant | Tue 3:00-5:00pm |
Kevin Yu | Teaching Assistant | Sun 3:00-5:00pm |
Kyle Hsu | Teaching Assistant | Tue 11:00am-1:00pm |
Siyi Tang | Teaching Assistant | Mon 1:00-3:00pm |
Kamil Ali | Teaching Assistant | Sun 1:00-3:00pm |
Edwin Pan | Teaching Assistant | Sat 5:00-7:00pm |
Timeline
Date | Lecture | Deadlines | Optional reading |
---|---|---|---|
Week 1 Mon, Sep 20 | Lecture Course introduction & start of multi-task learning (Chelsea Finn & Karol Hausman) | Optional Homework 0 out [PDF][Zip] | |
Week 1 Wed, Sep 22 | Lecture Supervised multi-task learning, transfer learning (Chelsea Finn) | | P1: Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. Kendall et al. (2018) P2: Universal Language Model Fine-tuning for Text Classification. Howard et al. (2018) |
Week 1 Thur, Sep 23 | TA Session PyTorch tutorial | | Tutorial is at 6pm over zoom and will be recorded. |
Week 2 Mon, Sep 27 | Lecture Meta-learning problem statement, black-box meta-learning (Chelsea Finn) | Due Optional Homework 0; Homework 1 out [PDF][Zip] | P1: One-shot Learning with Memory-Augmented Neural Networks. Santoro et al. (2016) |
Week 2 Wed, Sep 29 | Lecture Optimization-based meta-learning (Chelsea Finn) | | P1: Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Finn et al. (2017) P2: Meta-Learning with Differentiable Convex Optimization. Lee et al. (2019) |
Week 3 Mon, Oct 4 | Lecture Few-shot learning via metric learning (Chelsea Finn) | | P1: Matching Networks for One Shot Learning. Vinyals et al. (2017) P2: Prototypical Networks for Few-shot Learning. Snell et al. (2017) |
Week 3 Wed, Oct 6 | Lecture Advanced meta-learning topics (Chelsea Finn) | Due Homework 1; Homework 2 out [PDF][Zip] | P1: Meta-Learning without Memorization. Yin et al. (2020) |
Week 3 Thur, Oct 7 | TA Session [Variational inference tutorial](material/Variational Inference Tutorial.pdf) | | Tutorial is at 4pm over zoom and will be recorded. |
Week 4 Mon, Oct 11 | Lecture Bayesian meta-learning (Chelsea Finn) | | P1: Conditional Neural Processes. Garnelo et al. (2018) P2: Meta-Learning Probabilistic Inference For Prediction. Gordon et al. (2019) |
Week 4 Tue, Oct 12 | TA Session Reinforcement learning tutorial | | Tutorial is at 5pm over zoom and will be recorded. |
Week 4 Wed, Oct 13 | Lecture RL review (Karol Hausman) | Due Project proposal | |
Week 5 Mon, Oct 18 | Lecture Multi-task RL and goal-conditioned RL (Karol Hausman) | Due Homework 2; Homework 3 out [PDF][Zip] | P1: Hindsight Experience Replay. Andrychowicz et al. (2018) P2: MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale. Kalashnikov et al. (2021) |
Week 5 Wed, Oct 20 | Lecture Model-based RL for multi-task learning (Chelsea Finn) | | P1: Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control. Ebert et al. (2018) P2: Deep Dynamics Models for Learning Dexterous Manipulation. Nagabandi et al. (2019) |
Week 6 Mon, Oct 25 | Lecture Meta-RL 1: Adaptable models and policies (Chelsea Finn) | | P1: RL2: Fast Reinforcement Learning via Slow Reinforcement Learning. Duan et al. (2016) |
Week 6 Wed, Oct 27 | Lecture Meta-RL 2: Learning to explore (Chelsea Finn) | Due Homework 3; Optional Homework 4 out [PDF][Zip] | P1: Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables. Rakelly et al. (2019) P2: Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices. Liu et al. (2021) |
Week 7 Mon, Nov 1 | Lecture Offline RL and offline MTRL (Karol Hausman) | | P1: Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. Levine et al. (2020) P2: Conservative Q-Learning for Offline Reinforcement Learning. Kumar et al. (2020) P3: Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills. Chebotar et al. (2021) |
Week 7 Wed, Nov 3 | Lecture Hierarchical RL and skill discovery (Karol Hausman) | | P1: Data-Efficient Hierarchical Reinforcement Learning. Nachum et al. (2018) P2: Diversity is All You Need: Learning Skills without a Reward Function. Eysenbach et al. (2018) P3: Dynamics-Aware Unsupervised Discovery of Skills. Sharma et al. (2019) |
Week 8 Mon, Nov 8 | Guest Lecture What do language models learn from language modeling? (Colin Raffel (UNC, Hugging Face)) | Due Optional Homework 4 | Lecture is from 12-1pm (virtual). |
Week 8 Wed, Nov 10 | Guest Lecture [Learned optimizers: why they're the future, why they're hard, and what they can do now](slides/Stanford CS330 Jascha Sohl-Dickstein guest lecture on Learned optimizers.pdf) (Jascha Sohl-Dickstein (Google)) | Due Project milestone | |
Week 9 Mon, Nov 15 | Lecture Lifelong learning: problem statements, forward & backward transfer (Karol Hausman) | | P1: Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning. Julian et al. (2020) P2: Online Meta-Learning. Finn et al. (2019) Lecture will be over zoom and recorded. |
Week 9 Wed, Nov 17 | Lecture Frontiers and open problems (Karol Hausman) | | Guest speaker slides: Learning to edit pre-trained models. (Eric Mitchell) [BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning.](slides/BC-Z CS330 Lightning Talk.pdf) (Eric Jang) [Language Conditioned Imitation Learning Over Unstructured Data.](slides/Stanford LangLfP talk.pdf) (Corey Lynch) Learning Language-Conditioned Robot Behavior from Offline Data and Crowd-Sourced Annotation. (Suraj Nair) [Autonomous Reinforcement Learning.](slides/Autonomous RL %28CS330%29.pdf) (Archit Sharma) Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos. (Annie Chen) [Reinforcement Learning as Sequence Modeling.](slides/Stanford Short Talk.pdf) (Igor Mordatch) [Lifelong Robotic Reinforcement Learning by Retaining Experiences.](slides/retaining experiences cs330 slides.pdf) (Annie Xie) Lecture will be over zoom and recorded. |
Week 10 Mon, Nov 22 | Thanksgiving Break | | |
Week 10 Wed, Nov 24 | Thanksgiving Break | | |
Week 11 Mon, Nov 29 | No Class | | |
Week 11 Tue, Nov 30 | Presentations Project poster session | | The poster session will be held in McCaw Hall at the Alumni Center from 11:30am-1:30pm. |
Week 12 Wed, Dec 8 | No Class | Due Final project report | |
Grading and Course Policies
Homeworks (15% each): There are three graded homework assignments (Homeworks 1-3), each worth 15% of the grade. Assignments will require training neural networks in PyTorch. Homework 0 is optional and is worth 2% of extra credit. Homework 4 is optional and will either replace one prior homework grade or part of the project grade, whichever is better for your grade. All assignments are due on Gradescope at 11:59 pm Pacific Time on the respective due date.
Project (55%): The course includes a research-level project of your choice. You may form groups of 1-3 students to complete the project, and you are encouraged to start early! Detailed guidelines on the project can be found here.
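To make the weighting above concrete, here is a minimal sketch of how the stated percentages could combine. The function name, the score scale, and the assumption that optional Homework 4 stands in for a 15% slice of either a homework or the project are illustrative choices, not the official grading formula.

```python
# Illustrative sketch only -- not the official grading formula.
# Weights from the policy above: three homeworks at 15% each, project at 55%,
# optional Homework 0 worth up to 2% extra credit, and optional Homework 4
# able to replace one prior homework grade or part of the project grade,
# whichever helps more. The size of the "part of the project" slice is not
# specified in the policy; a 15% slice is assumed here for illustration.
# All scores are fractions in [0, 1]; None means the optional work was skipped.

def course_grade(hw1, hw2, hw3, project, hw0=None, hw4=None):
    slice_weight = 0.15
    total = slice_weight * (hw1 + hw2 + hw3) + 0.55 * project

    if hw4 is not None:
        # Swap HW4 in for the lowest-scoring 15% slice if that raises the total.
        lowest = min(hw1, hw2, hw3, project)
        if hw4 > lowest:
            total += slice_weight * (hw4 - lowest)

    if hw0 is not None:
        total += 0.02 * hw0  # up to 2% extra credit

    return total

# Example: a weak Homework 2 gets patched by a strong optional Homework 4.
print(course_grade(hw1=0.90, hw2=0.60, hw3=0.85, project=0.92, hw0=1.0, hw4=0.95))
```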
Late Days: You have 6 total late days across homeworks and project proposal submissions. You may use a maximum of 2 late days for any single assignment. Late days are not allowed for the final project report.
Lecture Attendance: While we do not require lecture attendance, students are encouraged to join the live lecture. To accommodate various circumstances, we will be live-streaming the in-person lecture via a zoom link on Canvas. Students joining the zoom live stream will be able to ask questions. For those who cannot join the live lectures, lecture recordings will also be available on Canvas shortly after the lecture.
Honor Code: Students are free to form study groups and may discuss homework in groups. However, each student must write down the solutions and code from scratch independently, without referring to any written notes from the joint session. When debugging code together, you are only allowed to look at the input-output behavior of each other's programs and not the code itself. In other words, each student must understand the solution well enough to reconstruct it independently. It is an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as on a public git repo. For more details about the honor code, see The Stanford Honor Code and The Stanford Honor Code Pertaining to CS Courses.
Health and Safety Expectation (COVID-19 Protocols)
Following Stanford’s policies, everyone is required to wear a mask indoors, regardless of vaccination status. This includes any in-person lectures or office hour sessions. Some community members may have preferences that go beyond the requirements; it is important that we treat each other's preferences with respect and care. You can find the most current policies on campus masking requirements on the COVID-19 Health Alerts site.
Academic Accommodations
If you need an academic accommodation based on a disability, please register with the Office of Accessible Education (OAE). Professional staff will evaluate your needs, support appropriate and reasonable accommodations, and prepare an Academic Accommodation Letter for faculty. To get started, or to re-initiate services, please visit oae.stanford.edu. If you already have an Academic Accommodation Letter, please send your letter to cs330-aut2122-staff@lists.stanford.edu. OAE Letters should be sent to us at the earliest possible opportunity so that the course staff can partner with you and OAE to make the appropriate accommodations.
Note on Financial Aid
All students should retain receipts for books and other course-related expenses, as these may be qualified educational expenses for tax purposes. If you are an undergraduate receiving financial aid, you may be eligible for additional financial aid for required books and course materials if these expenses exceed the aid amount in your award letter. For more information, review your award letter or visit the Student Budget website.