
Lecture 27: Backpropagation: Find Partial Derivatives

Description

In this lecture, Professor Strang presents Professor Sra's theorem proving the convergence of stochastic gradient descent (SGD). He then reviews backpropagation, a method for computing derivatives quickly using the chain rule.
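
The SGD update the convergence result refers to fits in a few lines. Below is a minimal sketch, not from the lecture, assuming a simple least-squares loss \(F(w) = \frac{1}{2n}\sum_i (a_i w - b_i)^2\); the function name `sgd` and its parameters are illustrative.

```python
# Minimal sketch (illustrative): stochastic gradient descent on the
# least-squares loss F(w) = (1/2n) * sum_i (a_i*w - b_i)^2.
# One randomly chosen sample i is used per step instead of the full gradient.
import random

def sgd(a, b, w0=0.0, step=0.1, iters=1000, seed=0):
    rng = random.Random(seed)
    w = w0
    n = len(a)
    for _ in range(iters):
        i = rng.randrange(n)                # pick one sample at random
        grad_i = (a[i] * w - b[i]) * a[i]   # gradient of the i-th loss term
        w -= step * grad_i                  # stochastic gradient step
    return w

if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [2.0, 4.1, 5.9]                     # roughly b_i = 2 * a_i
    print(sgd(a, b))                        # hovers near the minimizer w ≈ 2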

Summary

Computational graph: each step in computing \(F(x)\) from the weights
The derivative of each step, combined by the chain rule, gives the gradient of \(F\)
Reverse mode: work backwards from the output to the inputs (see the sketch after this list)
The key step in optimizing the weights is backpropagation plus stochastic gradient descent
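
For concreteness, here is a minimal hand-written reverse-mode sketch, assuming a toy one-neuron function \(F = w_2\,\mathrm{ReLU}(w_1 x + b_1) + b_2\); the function and variable names are illustrative, not from the lecture. The forward pass records each node of the computational graph, and the backward pass applies one local derivative per step, starting from the output.

```python
# Minimal sketch (illustrative): reverse-mode differentiation of
# F(x) = w2 * ReLU(w1*x + b1) + b2, done by hand, one step per graph node.

def forward_and_backward(x, w1, b1, w2, b2):
    # Forward pass: compute and record each intermediate node.
    z = w1 * x + b1            # linear step
    a = z if z > 0 else 0.0    # ReLU nonlinearity
    F = w2 * a + b2            # output

    # Reverse mode: start from dF/dF = 1 and apply the chain rule
    # backwards through the graph, one local derivative per step.
    dF = 1.0
    dw2 = dF * a                         # dF/dw2 = a
    db2 = dF                             # dF/db2 = 1
    da = dF * w2                         # dF/da  = w2
    dz = da * (1.0 if z > 0 else 0.0)    # ReLU derivative
    dw1 = dz * x                         # dF/dw1 = x * ReLU'(z) * w2
    db1 = dz                             # dF/db1 = ReLU'(z) * w2
    dx = dz * w1                         # gradient with respect to the input

    return F, {"w1": dw1, "b1": db1, "w2": dw2, "b2": db2, "x": dx}

if __name__ == "__main__":
    value, grads = forward_and_backward(x=1.5, w1=2.0, b1=-1.0, w2=0.5, b2=0.3)
    print(value, grads)
```

The gradients returned by the backward pass are exactly what the SGD step above consumes; one forward pass plus one reverse pass yields all partial derivatives at roughly the cost of a few forward evaluations.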

Related section in textbook: VII.3

Instructor: Prof. Gilbert Strang

Course Features

Lecture videos
Assignments: problem sets (no solutions)
Audio: podcast