10701 Introduction to Machine Learning

Syllabus and (tentative) Course Schedule

Date Lecture Topics Readings and useful links Announcements
Module 1: Supervised Learning
Thu 9/3 Lecture 1 (Eric, Ziv): Intro to probability, MLE -Slides Introduction of the course Basic probability Maximum likelihood estimate
Tue 9/8 No class
Thu 9/10 Lecture 2 (Ziv): Classification, kNN -Slides,Video Optimal decision using Bayes rule Types of classifiers Effect of values of k on kNN classifiers Probabilistic interpretation of kNN
Tue 9/15 No class
Thu 9/17 Lecture 3 (Ziv): Decision trees -Slides (updated),Video Discriminative classifiers Entropy Information gain Building decision trees PS1 out
Tue 9/22 Lecture 4 (Ziv): Naïve Bayes -Slides,Quiz,Video Problems with estimating full joints Advantages of Naïve Bayes assumptions Applications to discrete and continuous cases Problems with Naïve Bayes classifiers Mitchell, 6.1-6.10
Thu 9/24 Lecture 5 (Ziv): Linear regression -Slides,Video Basic model Solving linear regression Error in linear regression Advanced regression models Bishop, 3.1
Tue 9/29 Lecture 6 (Ziv): Logistic regression -Slides,Video Logistic regression vs. linear regression Sigmoid function MLE via gradient ascent Regularization Logistic regression for multiple classes Bishop, 4.2-4.3
Thu 10/1 Lecture 7 (Eric): Perceptron, Neural networks -Slides,Video Perceptron Multilayer Perceptron Backpropagation "Deep" Learning Convolutional Neural Networks Layer-wise Pre-training Bishop, Ch. 3 Bengio, Learning Deep Architectures for AI Krizhevsky et al., ImageNet Classification with Deep Convolutional Neural Networks PS2 out,PS1 due (10/2)
Tue 10/6 Lecture 8 (Eric): Deep learning, SVM -Slides1,Slides2,Video "Deep" Learning Convolutional Neural Networks Support Vector Machines Deep Learning: Salakhutdinov, Learning Deep Generative Models Ioffe and Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift Bahdanau et al., Neural machine translation by jointly learning to align and translate SVM: Burges, A Tutorial on Support Vector Machines for Pattern Recognition Schölkopf and Smola, Learning with Kernels
Thu 10/8 Lecture 9 (Eric): SVM -Slides,Annotated,Video Duality, KKT condition Kernel trick Sequential Minimal Optimization (SMO) Platt, Fast Training of Support Vector Machines using Sequential Minimal Optimization Chang and Lin, LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology Fan et al., LIBLINEAR: A Library for Large Linear Classification Proposal due
Tue 10/13 Lecture 10 (Eric): Evaluating classifiers, Bias-variance decomposition -Slides,Video Bias-variance decomposition Structural risk minimization Ways to avoid overfitting Vapnik, V. (1999) An Overview of Statistical Learning Theory, IEEE transactions on Neural Networks, Vol. 10, pp. 988-99. E.P. Xing, M.I. Jordan and R.M. Karp (2001), Feature selection for high-dimensional genomic microarray data, Proceedings of the Eighteenth International Conference on Machine Learning. Andrew Y Ng. On Feature Selection: Learning with Exponentially many Irrelevant Features as Training Examples. In Proceedings of the Fifteenth International Conference on Machine Learning, 1998.
Thu 10/15 Lecture 11 (Eric): Ensemble learning - Boosting, Random Forests -Slides,Video Combining weak learners Bagging and random forest AdaBoost, algorithm and generalization bounds Gradient boosting Freund and Schapire, A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting Schapire et al., Boosting the margin: A new explanation for the effectiveness of voting methods Breiman, Random Forests Friedman, Greedy Function Approximation: A Gradient Boosting Machine Chen, XGBoost, an efficient implementation of gradient boosting PS3 out,PS2 due (10/16)
Module 2: Unsupervised Learning
Tue 10/20 Lecture 12, 13 (Ziv): Unsupervised learning - clustering -Slides,Video Hierarchical clustering K-means and Gaussian mixture models Number of clusters Bishop, Ch. 9 Optional: Mitchell, 6.12
Thu 10/22
Tue 10/27 Lecture 14 (Ziv): Semi-supervised learning -Slides,Video Re-weighting EM, data augmentation Co-training Detect overfitting
Thu 10/29 Lecture 15 (Eric): Learning theory -Slides,Annotated,Video Realizable vs agnostic PAC learning in finite concept class Sample complexity Mitchell, Ch. 7 Vapnik, V. (1999) An Overview of Statistical Learning Theory, IEEE transactions on Neural Networks, Vol. 10, pp. 988-99. PS4 out,PS3 due (10/30)
Tue 11/3 Lecture 16 (Eric): VC dimension -Slides,Video Sample complexity for infinite concept classes VC dimension as a complexity measure Structural risk minimization Ch. 3, An Introduction to Computational Learning Theory, M. Kearns and U. Vazirani
Module 3: Probabilistic Representation and Modeling
Thu 11/5 Lecture 17 (Eric): Graphical models, Bayes nets -Slides,Video An introduction to graphical models, M. I. Jordan Midway report due
Tue 11/10 Lecture 18 (Eric): Bayes nets -Slides,Video Bishop, 8.4
Thu 11/12 Lecture 19 (Eric): Undirected graphical models -Video PS4 due (11/13)
Sun 11/15 Midterm review at Doherty 2315, 4-6pm
Tue 11/17 No class, Midterm at DH 2315, 5pm
Thu 11/19 Lecture 20 (Ziv): HMM -Slides,Video PS5 out
Tue 11/24 Lecture 21 (Ziv): HMM inference -Slides
Thu 11/26 No class
Tue 12/1 Lecture 22 (Eric): MDPs, Reinforcement learning -Slides Sutton, Richard and Barto, Andrew: Reinforcement Learning: an introduction, MIT Press, Cambridge, MA, 1998 PS5 due
Thu 12/3 Lecture 23 (Eric): Topic models -Slides D. Blei et al., Latent Dirichlet Allocation T. Griffiths and M. Steyvers, Finding Scientific Topics Eric's ACL 2012 Tutorial. Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus
Fri 12/4 Poster session, 2:30pm at NSH Atrium
Module 4: Applications of ML
Tue 12/8 Lecture 24 (Ziv): Computational biology -Slides Final report due (12/11)

© 2012 Eric Xing @ School of Computer Science, Carnegie Mellon University