To answer the question of why we should have a course on the mathematics of deep learning, it is first instructive to consider why we should have a course on deep learning at all. To many of you, the answer to this question may seem clear, or the question itself rhetorical. Indeed, MSU already offers several courses on deep learning, or that address deep learning as part of their content, including courses in CSE, ECE, and MTH. Nevertheless, let us consider this question for a moment.

Deep learning refers to a class of algorithms in machine learning and, more generally, artificial intelligence. One of the hallmarks of deep learning algorithms is that they compose a sequence of many simple functions, often alternating between linear or affine functions, pointwise nonlinear functions, and pooling operations. Figure 1 gives an illustration of the VGG16 network [1], a powerful and very popular convolutional neural network that consists of 16 layers of linear/nonlinear pairs of operations, with pooling operations (where the length and width of the image stacks shrink) every few layers. Note that all of the linear functions are learned from the given training data and the associated task, which for VGG16 was image classification on the ImageNet database [2] (more on this later). Thus the input to the VGG16 network is an RGB image, and the output is a class label.

The compositional structure illustrated in the VGG network, and used in all of deep learning (this is where the “deep” comes from), has been incredibly successful in machine learning and artificial intelligence tasks over the last decade. Indeed, deep learning is now used in a multitude of different contexts, from computer vision to natural language processing to playing games to biology to physics and more. One of the most striking examples of the success of deep learning (and, in this case, reinforcement learning) is the success of AlphaGo