
Material for the Deep Learning Course

On-Line Material from Other Sources

  • A quick overview of some of the material covered in the course is available in my ICML 2013 tutorial on Deep Learning
  • Q&A about deep learning (Spring 2013 course on large-scale ML)
  • 2012 IPAM Summer School on deep learning and representation learning
  • 2014 International Conference on Learning Representations (ICLR 2014)

Week 1

2014-01-27 Lecture

* Intro to Deep Learning

2014-01-29 Lab

* Roy Lowrance's tutorial on Lua

Week 2

2014-02-03 Lecture

* Modular Learning, Neural Nets and Backprop

  • Slides: PDF | DjVu
  • Topics: Backprop, modular models (see the sketch after the readings)
  • Reading Material:
    • Gradient-Based Learning Applied to Document Recognition (Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, 1998): pages 1-5 (part I) PDF | DjVu
    • Additional readings (tentative): ICML 2013 tutorial, pp. 34-53
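
As a concrete illustration of the modular fprop/bprop pattern from this lecture, here is a minimal sketch of a linear module in Python/NumPy (my own illustration, not the course's Torch/Lua code; all names are mine):

  import numpy as np

  class Linear:
      """y = W x + b; bprop applies the chain rule."""
      def __init__(self, n_in, n_out, seed=0):
          rng = np.random.default_rng(seed)
          self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_out, n_in))
          self.b = np.zeros(n_out)

      def fprop(self, x):
          self.x = x                       # cache the input for bprop
          return self.W @ x + self.b

      def bprop(self, dy):
          # dy = dL/dy; store parameter gradients, return dL/dx
          self.dW = np.outer(dy, self.x)
          self.db = dy
          return self.W.T @ dy
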
2014-02-05 Lab

* Clement Farabet's tutorial on the Torch ML library

Week 3

2014-02-10 Lecture

* Mixture of experts, recurrent nets, intro to ConvNets

  • Slides: PDF | DjVu
  • Topics: Discussion of some modules: Sum/Branch, Switch, LogSum; RBF Net; MAP/MLE loss; Parameter Space Transforms; Convolutional Module (see the sketch after the readings)
  • Reading Material:
    • Gradient-Based Learning Applied to Document Recognition (Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, 1998): pages 5-16 (part II and III) PDF | DjVu
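
To make the Convolutional Module concrete, here is a hedged sketch of its fprop on a single 2-D input plane, written as a 'valid'-mode cross-correlation (the operation ConvNets actually compute); Python/NumPy, names are my own:

  import numpy as np

  def conv2d_valid(x, k):
      """Slide kernel k over image x; output is (H-kh+1, W-kw+1)."""
      H, W = x.shape
      kh, kw = k.shape
      out = np.empty((H - kh + 1, W - kw + 1))
      for i in range(out.shape[0]):
          for j in range(out.shape[1]):
              out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
      return out
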
2014-02-12 Lab

* Unscheduled

Week 4

2014-02-17 Lecture
2014-02-19 Lab

* Unscheduled

Week 5

2014-02-24 Lecture

* Guest Lecture by Rob Fergus on ConvNets

  • Topics:
  • Reading Material:
    • Yann LeCun CVPR talk on scene understanding
    • Sermanet et al. ICLR 2014: “OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks” arXiv
    • LeCun. ECCV 2012 “Learning Invariant Feature Hierarchies”: PDF, DjVu
    • Sermanet et al. ICPR 2012 “Convolutional Neural Networks Applied to House Numbers Digit Classification”: PDF, DjVu
    • Farabet et al. PAMI 2013 “Learning Hierarchical Features for Scene Labeling”: PDF, DjVu
    • Sermanet et al. CVPR 2013 “Pedestrian Detection with Unsupervised Multi-Stage Feature Learning”: PDF, DjVu
2014-02-26 Lab

* Unscheduled

Week 6

2014-03-03 Lecture

* Energy-Based Models for Supervised Learning

  • Slides: PDF | DjVu
  • Topics: Energy for inference, objective for learning, loss functionals (toy inference sketch below)
  • Reading Material:
    • Yann LeCun, Sumit Chopra, Raia Hadsell, Marc'Aurelio Ranzato and Fu-Jie Huang: A Tutorial on Energy-Based Learning, in Bakir, G. and Hofman, T. and Schölkopf, B. and Smola, A. and Taskar, B. (Eds), Predicting Structured Data, MIT Press, 2006 PDF, DjVu
  • Other On-Line Material:
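
As a toy illustration of the energy/inference split: classification amounts to picking the label whose energy E(x, y) is lowest. The linear per-class energy below is my assumption for the sketch, not the tutorial's model:

  import numpy as np

  def energy(x, y, W):
      # one linear "energy head" per class; lower energy = better match
      return -W[y] @ x

  def infer(x, W):
      # inference = energy minimization over the label set
      return min(range(W.shape[0]), key=lambda y: energy(x, y, W))

Learning then consists of shaping W so that the correct label ends up with the lowest energy, via one of the loss functionals from the lecture.
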
2014-03-05 Lab

* Optimization Tricks for Deep Learning and Computer Vision

  • Topics: Aspect ratio, randomization, mean/std normalization, channel decorrelation (sketch below)
  • Reading Material:
    • Y. LeCun, L. Bottou, G. Orr and K. Muller: Efficient BackProp, in Orr, G. and Muller K. (Eds), Neural Networks: Tricks of the trade, Springer, 1998. PDF | DjVu
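
A hedged sketch of the normalization tricks above: per-feature mean/std normalization followed by channel decorrelation through an eigendecomposition of the covariance (PCA whitening); the function and parameter names are mine:

  import numpy as np

  def normalize(X, eps=1e-8):
      # rows of X are samples; make each feature zero-mean, unit-std
      X = (X - X.mean(axis=0)) / (X.std(axis=0) + eps)
      cov = np.cov(X, rowvar=False)
      vals, vecs = np.linalg.eigh(cov)          # principal directions
      return (X @ vecs) / np.sqrt(vals + eps)   # rotate, then rescale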

Week 7

2014-03-10 Lecture

* Energy-Based Models for Unsupervised Learning

  • Slides: PDF | DjVu
  • Topics: Learning an energy function is hard; several strategies were discussed: PCA; NLL (the partition function makes it intractable); contrastive divergence; shaping the energy surface only around the data points; denoising autoencoders (sketch below); sparse coding
  • Reading Material:
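
To make the denoising-autoencoder strategy concrete, here is a hedged sketch of a single training step with tied weights and a squared reconstruction loss (the architecture, sizes, and names are my assumptions, not the lecture's code):

  import numpy as np

  rng = np.random.default_rng(0)

  def dae_step(x, W, noise=0.3, lr=0.01):
      x_tilde = x + noise * rng.normal(size=x.shape)  # corrupt the input
      h = np.tanh(W @ x_tilde)                        # encode
      x_hat = W.T @ h                                 # decode (tied weights)
      err = x_hat - x                                 # reconstruction error
      dh = (W @ err) * (1.0 - h**2)                   # backprop through tanh
      dW = np.outer(dh, x_tilde) + np.outer(h, err)   # tied-weight gradient
      return W - lr * dW, 0.5 * np.sum(err**2)

Pulling reconstructions of corrupted points back toward the data keeps the energy (reconstruction error) low on the samples and higher around them.
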
2014-03-12 Lab

* Optimization for Deep Learning

  • Slides: PDF | DjVu
  • Topics: Importance of normalization (avoid elongated, elliptical error contours); Newton's algorithm and Hessian-estimation methods (diagonal sketch below); one-hidden-layer neural nets
  • Notes from Lab session: PDF | DjVu
  • Notes from the Blackboard: 1 and 2
  • Video (2013 part 1)
  • Video (2013 part 2)
  • Reading Material:
    • Y. LeCun, L. Bottou, G. Orr and K. Muller: Efficient BackProp, in Orr, G. and Muller K. (Eds), Neural Networks: Tricks of the trade, Springer, 1998. PDF | DjVu
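
A hedged sketch of the diagonal Newton-style update advocated in "Efficient BackProp": scale each weight's gradient by an estimate of the corresponding diagonal second derivative. The finite-difference curvature estimate and all names are my assumptions; mu guards against near-zero curvature:

  import numpy as np

  def diag_newton_step(w, grad_fn, lr=0.1, mu=1e-2, eps=1e-4):
      g = grad_fn(w)                    # gradient at w
      # crude finite-difference estimate of diag(Hessian)
      h = np.array([(grad_fn(w + eps * np.eye(len(w))[i])[i] - g[i]) / eps
                    for i in range(len(w))])
      return w - lr * g / (np.abs(h) + mu)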

Spring Break 03-17 to 03-23

Week 8

2014-03-24 Lecture
2014-03-26 Lab

* Metric Learning and Optimization / DrLIM

  • Topics: NCA; DrLIM (contrastive-loss sketch below)
  • Reading Material:
    • Not relevant
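
For reference, a hedged sketch of the DrLIM contrastive loss (Hadsell, Chopra & LeCun, 2006): pull the embeddings of matched pairs together and push mismatched pairs apart, up to a margin m:

  import numpy as np

  def contrastive_loss(z1, z2, same, m=1.0):
      d = np.linalg.norm(z1 - z2)        # distance in embedding space
      if same:
          return 0.5 * d**2              # attract similar pairs
      return 0.5 * max(0.0, m - d)**2    # repel dissimilar pairs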

Week 9

2014-03-31 Lecture

* Latent Factor Graphs

  • Topics: Latent Variable Models, probabilistic LVMs, loss functions; example: handwriting recognition (latent-variable sketch below)
  • Reading Material:
    • Slides (tentative): PDF | DjVu
    • This material is covered in the Energy-Based Learning tutorial (see Week 6)
    • Video (2013)
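
A toy illustration of the latent-variable construction E(X,Y,Z) → F(X,Y): eliminate the latent variable by minimizing over it (one can also marginalize with a soft min). The quadratic energy and the grid search below are stand-ins of my own:

  import numpy as np

  def E(x, y, z):
      return (x - z)**2 + (y - 2.0*z)**2            # toy energy

  def F(x, y, z_grid=np.linspace(-5, 5, 1001)):
      return min(E(x, y, z) for z in z_grid)        # min over latent z
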
2014-04-02 Lab

* Unscheduled

Week 10

2014-04-07 Lecture

* Restricted Boltzmann Machines

2014-04-09 Lab

* Optimization for Deep Learning (tentative)

Week 11

2014-04-14 Lecture

* Guest Lecture by Antoine Bordes on NLP

2014-04-16 Lab

* Unscheduled

Week 12

2014-04-21 Lecture

* Energy-Based Models for Unsupervised Learning

2014-04-23 Lab

* Recurrent Networks Lab

Week 13

2014-04-28 Lecture

* Speech Recognition / Structured Prediction

  • Topics: FFT/DFT (spectrogram sketch below), Time-Delay ConvNets, Acoustic Modeling
  • Reading Material:
    • Not relevant
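
As a small illustration of the FFT/DFT front end mentioned above, here is a hedged sketch of a log-magnitude spectrogram, the standard input representation for acoustic modeling; the frame and hop sizes are arbitrary choices of mine:

  import numpy as np

  def spectrogram(signal, frame=256, hop=128):
      # windowed frames, then log magnitude of the real FFT per frame
      frames = [signal[i:i+frame] * np.hanning(frame)
                for i in range(0, len(signal) - frame, hop)]
      return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-8)
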
2014-04-30 Lab

* Discussion of Project Topics

Week 14

2014-05-05 Lecture

* Backpropagation and the History of Deep Learning

  • Topics: Lagrangian derivation of backpropagation; the development of neural networks and deep learning since the 1940s
  • Reading Material:
    • Not relevant
2014-05-07 Lab

* Sparse Coding

  • Topics: ISTA, FISTA, LISTA (ISTA sketch below)
  • Reading Material:
    • Not relevant
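
A hedged sketch of plain ISTA for sparse coding, minimizing 0.5*||x - D z||^2 + lam*||z||_1 by alternating a gradient step with soft thresholding (FISTA adds momentum and LISTA learns the iteration; neither is shown here):

  import numpy as np

  def ista(x, D, lam=0.1, n_iter=100):
      L = np.linalg.norm(D, 2)**2                  # Lipschitz const of D^T D
      z = np.zeros(D.shape[1])
      for _ in range(n_iter):
          g = z - (D.T @ (D @ z - x)) / L          # gradient step
          z = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrink
      return z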

Week 15

* Final Exam Period May 12 to May 19

  • Final Project May 16
  • If you are not graduating and need an extension, talk to the TA, Liu Hao: haoliu [ at ] nyu.edu
  • Final Exam May 19

Final Exam Topics

  • the reasons for deep learning
  • fprop/bprop: given the fprop function for a module, write the bprop (a worked example follows this list)
  • modules you should know about:
    • linear, point-wise non-linearity, max,
    • Y branch, square distance, log-softmax
  • loss functions: least square, cross-entropy, hinge
  • energy-based supervised learning: energy/inference - objective function/learning
  • loss functionals: energy loss, negative log likelihood, perceptron, hinge
  • metric learning, siamese nets
  • DrLIM, WSABIE criteria
  • network architectures:
    • shared weights and other weight space transformations
    • recurrent nets: basic algorithm for backprop-through-time
  • mixture of experts
  • convolutional nets:
    • architecture, usage, for image and speech recognition and detection of objects in images
  • optimization:
    • SGD
    • tricks to make learning efficient: data normalization and such.
    • computing 2nd derivatives (diagonal terms)
  • deep learning + structured prediction
  • inference through energy minimization and marginalization
  • latent variables E(X,Y,Z) → F(X,Y)
  • learning using a loss functional
  • applications to sequence processing (e.g. speech and handwriting recognition)
  • applications:
    • speech and audio (temporal convnets)
    • image (spatial convnets)
    • text (see Jason Weston and Antoine Bordes' lectures)
  • unsupervised learning:
    • basic idea of energy-based unsupervised learning
    • the 7 methods to make the energy low on/near the samples and high everywhere else
    • sparse coding and sparse auto-encoders
    • ISTA/FISTA, LISTA
    • group sparsity
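
A worked instance of the fprop/bprop exercise above (my own example, not an actual exam question): given the fprop of a pointwise tanh module, the bprop follows from the chain rule:

  import numpy as np

  def fprop(x):
      return np.tanh(x)

  def bprop(x, dy):
      # dL/dx = dL/dy * dy/dx, with d tanh(x)/dx = 1 - tanh(x)^2
      return dy * (1.0 - np.tanh(x)**2)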

Final Exam Sample
