Description
Deep learning provides exciting solutions to visual recognition problems and is seen as a key method for future applications. This course is a deep dive into state-of-the-art convolutional neural network (CNN) architectures and algorithms for image classification, object detection, and segmentation. Medical image analysis, a distinctive application of computer vision that provides computer-aided analysis, will also be covered. The course will benefit students who want to understand the large body of literature in computer vision and deep learning.
Prerequisites
Students are required to have fundamental knowledge of machine learning, computer vision, and deep learning. Prior knowledge of medical imaging is not necessary. Preference will be given to current PhD students and master's students planning to apply for a PhD. I hope that some students from this seminar will join my research group as PhD students starting in spring 2018.
Grading
Only satisfactory (S) and unsatisfactory (U) grades will be given. Each student is expected to present 2-3 papers and lead the discussions. Students taking more than one credit must also complete a course project. Please come to the instructor’s office hours or make an appointment to discuss the project topic. Project presentations will take place in the final week.
Date | Topics and Papers | Presenters |
---|---|---|
Week 1 (Aug 31st) | Introduction and Overview | Mingchen Gao |
Week 2 (Sep 7th) | AlexNet [1], VGG [2]; SPPNet [3] | Mohammad Abuzar, Jun Zhuang |
Week 3 (Sep 14th) | GoogLeNet [4], ResNet [5]; Batch Normalization [10], Dropout [11] | Aditya Walke, Vivek Bheda |
Week 4 (Sep 21st) | Inception-ResNet [6], MXNet [8]; Rectifiers [12], ResNeXt [9] | Avishkar Zanje, Jay Bakshi |
Week 5 (Sep 28th) | CRF-RNN [21], DeconvNet [23]; DeepLab [22], FCN [20] | Nagadeesh Nagaraja, Chirag Yeole |
Week 6 (Oct 5th) | OverFeat [13], R-CNN [14]; Fast R-CNN [15], Faster R-CNN [16] | Gaurav Nadkar, Mihir Chauhan |
Week 7 (Oct 12th) | Mask R-CNN [19], YOLO [17]; SSD [18], JAMA retinal [45] | Jayashree Chandrasekaran, Surabhi Singh Ludu |
Week 8 (Oct 19th) | Deep Compression [40], SqueezeNet [41]; PointNet [28], Super-resolution [31] | Debanjan Paul, Yuhao Du |
Week 9 (Oct 26th) | Transfer Learning [36], Object Localization [35]; CAM [38], NetVLAD [37] | Rimi Das, Shubham Sharma |
Week 10 (Nov 2nd) | HCP [39], Visualizing and Understanding [32]; HED [25], DeepEdge [26] | Lakshmi Prasanna Ethiraj, Vipin Kumar |
Week 11 (Nov 9th) | GAN [27], Video Classification [34]; Deformable CNN [7], Skin Cancer Nature paper [42] | Ronak Panchal, Shivam Gupta |
Week 12 (Nov 16th) | Image Captioning [24], Survey [44]; CNN for Graph [30], DenseNet [29] | Sunil Vasu, Vijit Singhal |
Thanksgiving (no class) | | |
Week 13 (Nov 30th) | Style Transfer [33], U-Net [43] | Zhenggang Xue |
Week 14 (Dec 7th) | Final Project Presentation | |
Reading List
[1] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097-1105.
[2] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," pp. 1-14, Sep. 2014.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," pp. 1-14, Jun. 2014.
[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going Deeper with Convolutions," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 1-9, 2015.
[5] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," Proc. IEEE Conf. Comput. Vis. Pattern Recognit., pp. 770-778, 2016.
[6] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning," pp. 4278-4284, 2016.
[7] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei, "Deformable Convolutional Networks," ICCV, Mar. 2017.
[8] T. Chen, M. Li, Y. Li, M. Lin, N. Wang, M. Wang, T. Xiao, B. Xu, C. Zhang, and Z. Zhang, "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems," pp. 1-6, 2015.
[9] S. Xie, R. Girshick, P. Dollar, Z. Tu, and K. He, "Aggregated Residual Transformations for Deep Neural Networks," 2016.
[10] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," ICML, pp. 448-456, 2015.
[11] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," J. Mach. Learn. Res., vol. 15, pp. 1929-1958, 2014.
[12] K. He, X. Zhang, S. Ren, and J. Sun, "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification," Int. Conf. Comput. Vis., 2015.
[13] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," arXiv preprint arXiv:1312.6229, Dec. 2013.
[14] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 580-587, 2014.
[15] R. Girshick, "Fast R-CNN," 2015.
[16] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," Adv. Neural Inf. Process. Syst., pp. 1-10, Jun. 2015.
[17] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," 2015.
[18] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector," ECCV, 2016.
[19] K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask R-CNN," Mar. 2017.
[20] J. Long, E. Shelhamer, and T. Darrell, "Fully Convolutional Networks for Semantic Segmentation," IEEE Conf. Comput. Vis. Pattern Recognit., 2015.
[21] S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. H. S. Torr, "Conditional Random Fields as Recurrent Neural Networks."
[22] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," ICLR, pp. 1-14, 2016.
[23] H. Noh, S. Hong, and B. Han, "Learning Deconvolution Network for Semantic Segmentation."
[24] A. Karpathy and L. Fei-Fei, "Deep Visual-Semantic Alignments for Generating Image Descriptions," Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2015.
[25] S. Xie and Z. Tu, "Holistically-Nested Edge Detection," 2015.
[26] G. Bertasius and L. Torresani, "DeepEdge: A Multi-Scale Bifurcated Deep Network," CVPR, 2015.
[27] I. Goodfellow, J. Pouget-Abadie, and M. Mirza, "Generative Adversarial Networks," 2014.
[28] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation."
[29] G. Huang, Z. Liu, and K. Q. Weinberger, "Densely Connected Convolutional Networks," arXiv Prepr., pp. 1-12, 2016.
[30] M. Niepert, M. Ahmed, and K. Kutzkov, "Learning Convolutional Neural Networks for Graphs," vol. 1, 2016.
[31] C. Dong, C. C. Loy, K. He, and X. Tang, "Learning a Deep Convolutional Network for Image Super-Resolution," Comput. Vision-ECCV 2014, vol. 8689, pp. 184-199, 2014.
[32] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," arXiv preprint arXiv:1311.2901, 2013.
[33] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual losses for real-time style transfer and super-resolution," Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 9906 LNCS, pp. 694-711, 2016.
[34] A. Karpathy and G. Toderici, "Large-scale video classification with convolutional neural networks," pp. 1725-1732, 2014.
[35] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Is Object Localization for Free? - Weakly-Supervised Learning with Convolutional Neural Networks," CVPR, pp. 685-694, 2015.
[36] M. Oquab, L. Bottou, I. Laptev, and J. Sivic, "Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks," CVPR, pp. 1717-1724, 2014.
[37] R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic, "NetVLAD: CNN architecture for weakly supervised place recognition," IEEE Trans. Pattern Anal. Mach. Intell., pp. 5297-5307, 2017.
[38] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, "Learning Deep Features for Discriminative Localization," arXiv1512.04150 [cs], pp. 2921-2929, 2015.
[39] Y. Wei, W. Xia, M. Lin, J. Huang, B. Ni, J. Dong, Y. Zhao, and S. Yan, "HCP: A Flexible CNN Framework for Multi-Label Image Classification," IEEE Trans. Pattern Anal. Mach. Intell., 2016.
[40] S. Han, H. Mao, and W. J. Dally, "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," pp. 1-14, 2015.
[41] F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, "SqueezeNet: AlexNet-Level Accuracy with 50x Fewer Parameters and <0.5MB Model Size," pp. 1-13, 2017.
[42] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks," Nature, vol. 542, no. 7639, pp. 115-118, Jan. 2017.
[43] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional Networks for Biomedical Image Segmentation," Med. Image Comput. Comput. Interv. -- MICCAI 2015, pp. 234-241, 2015.
[44] G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sanchez, "A Survey on Deep Learning in Medical Image Analysis," Med. Image Anal., vol. 42, pp. 60-88, 2017.
[45] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, R. Kim, R. Raman, P. C. Nelson, J. L. Mega, and D. R. Webster, "Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs," JAMA, vol. 316, no. 22, pp. 2402-2410, 2016.