CSE 703: Deep Learning for Visual Recognition with Applications to Medical Imaging Analysis

Fall 2017

Basic Information

Description

Deep learning is delivering strong solutions to visual recognition problems and is widely seen as a key method for future applications. This course is a deep dive into state-of-the-art convolutional neural network (CNN) architectures and algorithms for image classification, object detection, and segmentation. Medical imaging analysis, a distinctive application of computer vision that provides computer-aided analysis, will also be covered. The course will benefit students who want to navigate the large body of literature in computer vision and deep learning.

Prerequisites

Students should have fundamental knowledge of machine learning, computer vision, and deep learning. Prior knowledge of medical imaging is not necessary. Preference will be given to current PhD students and Master's students planning to apply for a PhD. I hope that some students from this seminar will join my research group as PhD students starting in spring 2018.

Grading

Only satisfactory (S) and unsatisfactory (U) grades will be given. Each student is expected to present 2-3 papers and lead the discussions. Students taking more than one credit must also complete a course project. Please come to the instructor’s office hours or make an appointment to discuss the project topic. Project presentations will take place in the final week.

Course Topics and Schedule


Date | Topics and Papers | Presenters
Week 1 (Aug 31st) | Introduction and Overview | Mingchen Gao
Week 2 (Sep 7th) | AlexNet [1], VGG [2]; SPPNet [3] | Mohammad Abuzar; Jun Zhuang
Week 3 (Sep 14th) | GoogLeNet [4], ResNet [5]; Batch Normalization [10], Dropout [11] | Aditya Walke; Vivek Bheda
Week 4 (Sep 21st) | Inception-ResNet [6], MXNet [8]; Rectifiers [12], ResNeXt [9] | Avishkar Zanje; Jay Bakshi
Week 5 (Sep 28th) | CRF-RNN [21], DeconvNet [23]; DeepLab [22], FCN [20] | Nagadeesh Nagaraja; Chirag Yeole
Week 6 (Oct 5th) | OverFeat [13], R-CNN [14]; Fast R-CNN [15], Faster R-CNN [16] | Gaurav Nadkar; Mihir Chauhan
Week 7 (Oct 12th) | Mask R-CNN [19], YOLO [17]; SSD [18], JAMA retinal [45] | Jayashree Chandrasekaran; Surabhi Singh Ludu
Week 8 (Oct 19th) | Deep Compression [40], SqueezeNet [41]; PointNet [28], Super-resolution [31] | Debanjan Paul; Yuhao Du
Week 9 (Oct 26th) | Transfer Learning [36], Object Localization [35]; CAM [38], NetVLAD [37] | Rimi Das; Shubham Sharma
Week 10 (Nov 2nd) | HCP [39], Visualizing and Understanding [32]; HED [25], DeepEdge [26] | Lakshmi Prasanna Ethiraj; Vipin Kumar
Week 11 (Nov 9th) | GAN [27], Video Classification [34]; Deformable CNN [7], Skin Cancer Nature paper [42] | Ronak Panchal; Shivam Gupta
Week 12 (Nov 16th) | Image Captioning [24], Survey [44]; CNN for Graph [30], DenseNet [29] | Sunil Vasu; Vijit Singhal
Thanksgiving | No class |
Week 13 (Nov 30th) | Style Transfer [33], U-Net [43] | Zhenggang Xue
Week 14 (Dec 7th) | Final Project Presentation |

 


Reading List

[1] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," Adv. Neural Inf. Process. Syst., pp. 1097-1105, 2012.
[2] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556, pp. 1-14, Sep. 2014.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," arXiv preprint arXiv:1406.4729, pp. 1-14, Jun. 2014.
[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1-9, 2015.
[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 770-778, 2016.
[6] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," AAAI, pp. 4278-4284, 2017.
[7] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei, "Deformable Convolutional Networks," ICCV, Mar. 2017.
[8] T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang, "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems," pp. 1-6, 2015.
[9] S. Xie, R. Girshick, P. Dollar, Z. Tu, and K. He, "Aggregated Residual Transformations for Deep Neural Networks," 2016.
[10] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," ICML, pp. 448-456, 2015.
[11] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," J. Mach. Learn. Res., vol. 15, pp. 1929-1958, 2014.
[12] K. He, X. Zhang, S. Ren, and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," Int. Conf. Comput. Vis., 2015.
[13] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," arXiv preprint arXiv:1312.6229, Dec. 2013.
[14] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 580-587, 2014.
[15] R. Girshick, "Fast R-CNN," 2015.
[16] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Adv. Neural Inf. Process. Syst., pp. 1-10, Jun. 2015.
[17] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[18] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," ECCV, 2016.
[19] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," Mar. 2017.
[20] J. Long, E. Shelhamer, and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," IEEE Conf. Comput. Vis. Pattern Recognit., 2015.
[21] S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. S. Torr, "Conditional Random Fields as Recurrent Neural Networks," ICCV, 2015.
[22] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," arXiv preprint, pp. 1-14, 2016.
[23] H. Noh, S. Hong, and B. Han, "Learning Deconvolution Network for Semantic Segmentation," ICCV, 2015.
[24] A. Karpathy and L. Fei-Fei, "Deep Visual-Semantic Alignments for Generating Image Descriptions," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2015.
[25] S. Xie and Z. Tu, "Holistically-Nested Edge Detection," 2015.
[26] G. Bertasius, J. Shi, and L. Torresani, "DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection," CVPR, 2015.
[27] I. Goodfellow, J. Pouget-Abadie, and M. Mirza, "Generative Adversarial Networks," 2014.
[28] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," CVPR, 2017.
[29] G. Huang, Z. Liu, and K. Q. Weinberger, "Densely Connected Convolutional Networks," arXiv Prepr., pp. 1-12, 2016.
[30] M. Niepert, M. Ahmed, and K. Kutzkov, "Learning Convolutional Neural Networks for Graphs," ICML, 2016.
[31] C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a Deep Convolutional Network for Image Super-Resolution," Comput. Vision-ECCV 2014, vol. 8689, pp. 184-199, 2014.
[32] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," arXiv preprint arXiv:1311.2901, 2013.
[33] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual Losses for Real-Time Style Transfer and Super-Resolution," ECCV, Lect. Notes Comput. Sci., vol. 9906, pp. 694-711, 2016.
[34] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, "Large-Scale Video Classification with Convolutional Neural Networks," CVPR, pp. 1725-1732, 2014.
[35] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Is Object Localization for Free? - Weakly-Supervised Learning with Convolutional Neural Networks," CVPR, pp. 685-694, 2015.
[36] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks," CVPR, pp. 1717-1724, 2014.
[37] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic, "NetVLAD: CNN Architecture for Weakly Supervised Place Recognition," CVPR, pp. 5297-5307, 2016.
[38] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, "Learning Deep Features for Discriminative Localization," CVPR, pp. 2921-2929, 2016.
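[39] Y. Wei, W. Xia, M. Lin, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan, "HCP: A Flexible CNN Framework for Multi-Label Image Classification," IEEE Trans. Pattern Anal. Mach. Intell., vol. 38, no. 9, pp. 1901-1907, 2016.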
[40] S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," pp. 1-14, 2015.
[41] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, "SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5MB Model Size," arXiv preprint arXiv:1602.07360, pp. 1-13, 2016.
[42] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks," Nature, vol. 542, no. 7639, pp. 115-118, Jan. 2017.
[43] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," Med. Image Comput. Comput. Interv. (MICCAI), pp. 234-241, 2015.
[44] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sanchez, "A Survey on Deep Learning in Medical Image Analysis," Med. Image Anal., vol. 42, pp. 60-88, 2017.
[45] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, R. Kim, R. Raman, P. C. Nelson, J. L. Mega, and D. R. Webster, "Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs," JAMA, vol. 316, no. 22, pp. 2402-2410, 2016.