Ming-Yu Liu

02:00pm-02:50pm
Introduction to deep learning in computer vision
We will introduce standard supervised deep learning models such as CNNs and RNNs. The introduction will cover a brief history of the development of both models, their architectures and the intuition behind them, training methods, and tips for learning a good model.

02:50pm-03:10pm
Coffee Break

03:10pm-04:00pm
Deep learning for vision and language
We will introduce a number of recent works on image captioning, image question answering, video captioning, and video question answering. The introduction covers new deep learning models that combine vision and language, popular datasets, and future challenges.

04:10pm-05:00pm
Deep learning for image generation
We will introduce a number of recent deep learning works on image generation and discuss various practical algorithm design issues. In particular, we will highlight generative adversarial networks (GANs).

References

Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio, Generative Adversarial Networks, NIPS 2014

Alec Radford, Luke Metz, Soumith Chintala, Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR 2016

Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen, Improved Techniques for Training GANs, NIPS 2016

Ming-Yu Liu and Oncel Tuzel, Coupled Generative Adversarial Networks, NIPS 2016

Kuo-Hao Zeng, Tseng-Hung Chen, Juan Carlos Niebles, and Min Sun, Video Title Generation, ECCV 2016