4.2 Multilayer Neural Networks (Part 1-3)

Slides

References

Multilayer perceptrons with enough hidden units can approximate any continuous function on a bounded input domain to arbitrary accuracy: Hornik, Stinchcombe, and White (1989), Multilayer Feedforward Networks are Universal Approximators, https://www.cs.cmu.edu/~epxing/Class/10715/reading/Kornick_et_al.pdf

What we covered in this video lecture

In this lecture, we discussed the limitations of the models we covered earlier in this course: the perceptron and logistic regression. Multilayer networks help us overcome these limitations. (If you wonder what limitations we are talking about, you get to answer this in the quiz!)

We then discussed the advantages and disadvantages of designing wide versus deep neural networks. Here, width refers to the number of hidden units in each hidden layer, and depth refers to the number of layers.
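To make the width versus depth distinction concrete, here is a minimal sketch that builds a wide, shallow network and a narrow, deep one and compares their parameter counts. The helper names (`make_mlp`, `count_parameters`) and all layer sizes are illustrative assumptions, not code from the lecture.

```python
import torch


def make_mlp(num_features, num_classes, hidden_sizes):
    """Build an MLP with one hidden layer per entry in hidden_sizes.

    len(hidden_sizes) controls the depth; each entry controls the width
    of the corresponding hidden layer. (Hypothetical helper.)
    """
    layers = []
    in_size = num_features
    for out_size in hidden_sizes:
        layers.append(torch.nn.Linear(in_size, out_size))
        layers.append(torch.nn.ReLU())
        in_size = out_size
    # Output layer: no activation, returns one logit per class.
    layers.append(torch.nn.Linear(in_size, num_classes))
    return torch.nn.Sequential(*layers)


def count_parameters(model):
    """Total number of weights and biases in the model."""
    return sum(p.numel() for p in model.parameters())


# Wide and shallow: one hidden layer with 64 units.
wide = make_mlp(num_features=2, num_classes=2, hidden_sizes=[64])

# Narrow and deep: four hidden layers with 8 units each.
deep = make_mlp(num_features=2, num_classes=2, hidden_sizes=[8, 8, 8, 8])

print("wide parameters:", count_parameters(wide))
print("deep parameters:", count_parameters(deep))
```

Both networks accept the same inputs and produce the same output shape, so they can be swapped in an existing training loop to compare how width and depth affect training behavior on a given dataset.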

Lastly, we discussed several architecture design considerations, such as the choice of nonlinear activation function (or using none at all) and the importance of random weight initialization.
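One way to see why the nonlinear activation matters: two linear layers stacked without an activation in between compute exactly the same function as a single linear layer. The sketch below checks this numerically. The layer sizes and variable names are arbitrary assumptions chosen for illustration.

```python
import torch

torch.manual_seed(123)

# Two linear layers with NO activation function in between.
layer1 = torch.nn.Linear(4, 8)
layer2 = torch.nn.Linear(8, 3)
stacked = torch.nn.Sequential(layer1, layer2)

# Collapse both layers into one:
#   W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
merged = torch.nn.Linear(4, 3)
with torch.no_grad():
    merged.weight.copy_(layer2.weight @ layer1.weight)
    merged.bias.copy_(layer2.weight @ layer1.bias + layer2.bias)

x = torch.randn(10, 4)

with torch.no_grad():
    # Without an activation, the stacked and the merged model agree.
    same_without_activation = torch.allclose(stacked(x), merged(x), atol=1e-5)

    # With a ReLU in between, a single linear layer no longer matches.
    with_relu = torch.nn.Sequential(layer1, torch.nn.ReLU(), layer2)
    same_with_relu = torch.allclose(with_relu(x), merged(x), atol=1e-5)

print("no activation matches single layer:", same_without_activation)
print("ReLU network matches single layer:", same_with_relu)
```

In other words, adding depth without a nonlinearity adds parameters but no extra expressive power over a single linear layer.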

Additional resources if you want to learn more

If you are interested in learning more about the different activation functions, as teased in 4.2 Part 2, I recommend the paper A Comprehensive Survey and Performance Analysis of Activation Functions in Deep Learning.

Note that it is possible to override PyTorch’s default weight initialization scheme using the following code:

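def weights_init(m):
    # Only reinitialize fully connected (linear) layers
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.*(m.weight)
        torch.nn.init.*(m.bias)

# apply() calls weights_init on every submodule of the model
model.apply(weights_init)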

The * above is a placeholder for a weight initialization function in PyTorch. Which initialization function to use depends on the activation function. For example, a common choice for ReLU activations is Kaiming (He) initialization:

torch.nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
torch.nn.init.constant_(m.bias, 0)
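Putting the two snippets above together, a self-contained version might look like the following sketch. The `Sequential` model and its layer sizes are hypothetical stand-ins for whatever model you are training; only `weights_init` and the `model.apply` call come from the text above.

```python
import torch


def weights_init(m):
    # model.apply() calls this function on every submodule,
    # so we only touch the fully connected layers.
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
        torch.nn.init.constant_(m.bias, 0)


torch.manual_seed(123)

# Hypothetical model; the layer sizes are arbitrary.
model = torch.nn.Sequential(
    torch.nn.Linear(100, 50),
    torch.nn.ReLU(),
    torch.nn.Linear(50, 25),
    torch.nn.ReLU(),
    torch.nn.Linear(25, 10),
)

# Override PyTorch's default initialization for all linear layers.
model.apply(weights_init)

first_layer = model[0]
weight_std = first_layer.weight.std().item()
bias_abs_sum = first_layer.bias.abs().sum().item()

# With the default fan_in mode, the weight standard deviation should be
# roughly sqrt(2 / fan_in), where fan_in = 100 for the first layer.
print("weight std of first layer:", weight_std)
print("sum of absolute biases:", bias_abs_sum)
```

Inspecting the weight statistics after calling `model.apply` is a quick sanity check that the custom initialization was actually applied.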

You can find out more about Kaiming initialization in the following paper:

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, https://arxiv.org/abs/1502.01852v1.

