CSE 705: Deep Learning

For beginners, start here! Neural Networks and Deep Learning (Michael Nielsen -- ongoing book -- very good introductory material!)
Yingbo and Devansh Learning deep architectures for AI (Yoshua Bengio -- Foundations and Trends in ML)
Good overview! Representation Learning: A Review and New Perspectives, Yoshua Bengio, Aaron Courville, Pascal Vincent, Arxiv, 2012.
Deep learning, methods and applications (NOW book, Li Deng and Dong Yu, good overview for people who already know the basics)
A recent deep learning course at CMU (with links to many classic papers in the field)
Deep learning, Yoshua Bengio, Ian Goodfellow and Aaron Courville (sketchy on-going online book)
"Deep Machine Learning: A New Frontier in Artificial Intelligence Research", Itamar Arel, Derek C. Rose, and Thomas P. Karnowski.
Deep Learning in Neural Networks: An Overview, Schmidhuber, J. (2014).
Yann LeCun's lecture video and slides. (Thanks Buddhika for these links.)

Start here! Ben Recht's talk at Simons (two short lectures, totally worth watching).
Mahmoud and Hung The Zen of gradient descent (a short blog post).
Mahmoud and Hung Theory of convex optimization for ML (on-line monograph by Sebastien Bubeck).

Sijia Liu Cybenko, G. (1989) "Approximation by superpositions of a sigmoidal function", Mathematics of Control, Signals, and Systems, 2(4), 303-314. [ pdf ]
Kurt Hornik, Maxwell B. Stinchcombe, Halbert White: Multilayer feedforward networks are universal approximators. Neural Networks 2(5): 359-366 (1989) [ pdf ]
Someone please present this: Andrew R. Barron, Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory 39(3): 930-945 (1993) [ pdf ]
Kurt Hornik (1991) "Approximation Capabilities of Multilayer Feedforward Networks", Neural Networks, 4(2), 251–257
Someone please present this: Hava T. Siegelmann, Eduardo D. Sontag: On the Computational Power of Neural Nets. J. Comput. Syst. Sci. 50(1): 132-150 (1995) [ pdf ]
Qi Olivier Delalleau and Yoshua Bengio, Shallow vs. Deep Sum-Product Networks, NIPS 2011. [ pdf ]

Xiaowei D. H. Ackley, G. E. Hinton, T. J. Sejnowski, "A learning algorithm for Boltzmann machines", Cognitive Science, 9, 147-169, 1985. [ pdf ]
Tutorial on RBM.

Xiaowei Salakhutdinov, Ruslan, and Geoffrey E. Hinton. "Deep Boltzmann machines." Proceedings of the international conference on artificial intelligence and statistics. Vol. 5. No. 2. Cambridge, MA: MIT Press, 2009. [ pdf ]
Zhen Xu Geoffrey E. Hinton, Simon Osindero, and Yee-Whye Teh. 2006. "A fast learning algorithm for deep belief nets." Neural Comput. 18, 7 (July 2006), 1527-1554. [ pdf ]

Qi Sanjeev Arora, Aditya Bhaskara, Rong Ge, and Tengyu Ma. Provable Bounds for Learning Some Deep Representations. ICML 2014. [ pdf ]
Laknath James Martens, Ilya Sutskever: Training Deep and Recurrent Networks with Hessian-Free Optimization. Neural Networks: Tricks of the Trade (2nd ed.) 2012: 479-535. Also ICML 2012. [ pdf ]
Ying Dumitru Erhan, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, and Samy Bengio. Why Does Unsupervised Pre-training Help Deep Learning? JMLR 2010 [ pdf ]
Duc Luong Ian J. Goodfellow, Quoc V. Le, Andrew M. Saxe, Honglak Lee and Andrew Y. Ng. Measuring invariances in deep networks. NIPS 2009. [ pdf ]

Rohit Guillaume Alain and Yoshua Bengio, "What Regularized Auto-Encoders Learn from the Data Generating Distribution", [ pdf ]

CSE 705: Deep Learning (Spring 2015)