Lecture 33: Neural Nets and the Learning Function
Description
This lecture focuses on the construction of the learning function \(F\), which is optimized by stochastic gradient descent and applied to the training data to minimize the loss. Professor Strang also begins his review of distance matrices.
Summary
Each training sample is given by a vector \(v\).
The next layer of the net is \(F_1(v) = \text{ReLU}(A_1 v + b_1)\).
\(w_1 = A_1 v + b_1\), with weights in \(A_1\) and biases in \(b_1\) to be optimized
ReLU\((w)\) = nonlinear activation function \(= \max(0, w)\)
Minimize the loss function by optimizing the weights \(x\) = the \(A\)'s and \(b\)'s
Distance matrix given between points: Find the points!
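The layer map in the summary above can be sketched in a few lines of NumPy. The sizes here (3 inputs, 4 outputs) are illustrative choices, not from the lecture:

```python
import numpy as np

def relu(w):
    """Nonlinear activation: ReLU(w) = max(0, w), applied elementwise."""
    return np.maximum(0, w)

def layer(v, A, b):
    """One layer of the net: F1(v) = ReLU(A1 v + b1)."""
    return relu(A @ v + b)

# Illustrative sizes (an assumption, not from the lecture): 3 inputs -> 4 outputs.
rng = np.random.default_rng(0)
A1 = rng.standard_normal((4, 3))   # weight matrix A1
b1 = rng.standard_normal(4)        # bias vector b1
v = rng.standard_normal(3)         # one training sample v

w1 = A1 @ v + b1                   # pre-activation w1 = A1 v + b1
F1 = layer(v, A1, b1)              # F1(v) = ReLU(w1): zeros wherever w1 < 0
print(F1)
```

In training, gradient descent would adjust the entries of \(A_1\) and \(b_1\) to reduce the loss over all samples; this sketch only evaluates the forward map once.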
Related sections in textbook: VII.1 and IV.10
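The distance-matrix problem mentioned above ("find the points!") has a classical solution: from the squared distances, form the Gram matrix by double centering, then factor it with an eigendecomposition. This is the classical multidimensional scaling construction; the points below are a hypothetical example, and the recovered set matches the original only up to rotation and translation:

```python
import numpy as np

def points_from_distances(D, dim):
    """Recover point coordinates (up to a rigid motion) from the
    matrix D of pairwise Euclidean distances, via classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix J = I - (1/n) 1 1^T
    G = -0.5 * J @ (D ** 2) @ J              # Gram matrix G = X X^T of centered points
    vals, vecs = np.linalg.eigh(G)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:dim]       # keep the top `dim` eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))

# Hypothetical points in the plane, their distance matrix, and the reconstruction.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = points_from_distances(D, dim=2)
D2 = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
print(np.allclose(D, D2))                    # the recovered points reproduce D
```

The reconstruction is exact whenever \(D\) really comes from points in \(\mathbb{R}^{\text{dim}}\); for noisy or non-Euclidean data, the clipped eigenvalues give a best low-rank approximation.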
Instructor: Prof. Gilbert Strang
As Taught In: Spring 2018
Level: Undergraduate
Course Features
AV lectures - Video
Assignments - problem sets (no solutions)
AV special element audio - Podcast