
Lecture 22: Gradient Descent: Downhill to a Minimum

Description

Gradient descent is the most common optimization algorithm in deep learning and machine learning. It uses only the first derivative when updating parameters, taking repeated steps downhill until it reaches a local minimum.
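
As a rough illustration of that update rule (a minimal sketch in Python, not code from the lecture; the example function, step size `s`, and iteration count are arbitrary choices):

```python
def gradient_descent(grad, x0, s=0.1, steps=100):
    """Repeat the basic update x <- x - s * F'(x), stepping downhill."""
    x = x0
    for _ in range(steps):
        x = x - s * grad(x)   # uses only first-derivative information
    return x

# Example: F(x) = (x - 3)^2 has derivative 2(x - 3); the iterates approach 3.
print(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))
```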

Summary

Gradient descent: Downhill from \(x\) to new \(X = x - s (\partial F / \partial x)\)
Excellent example: \(F(x,y) = \frac{1}{2} (x^2 + by^2)\)
If \(b\) is small, we take a zig-zag path toward \((0, 0)\).
Each step multiplies by \((b - 1)/(b + 1)\) (see the numerical sketch after this summary).
Remarkable function: logarithm of determinant of \(X\) (its gradient is recalled below)
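
A short numerical sketch (in Python with NumPy, an assumed illustration rather than code from the lecture) of steepest descent with exact line search on \(F(x,y) = \frac{1}{2}(x^2 + by^2)\), starting from the classic point \((b, 1)\); the printout checks that each step multiplies the \(x\)-coordinate by exactly \((b - 1)/(b + 1)\), while the signs alternate, producing the zig-zag path:

```python
import numpy as np

def zigzag_descent(b=0.1, steps=6):
    """Exact line-search steepest descent on F(x, y) = (x^2 + b*y^2) / 2."""
    A = np.diag([1.0, b])            # F(p) = 0.5 * p @ A @ p
    p = np.array([b, 1.0])           # classic starting point (b, 1)
    factor = (b - 1.0) / (b + 1.0)   # predicted per-step multiplier
    for k in range(steps):
        g = A @ p                    # gradient of F at p
        s = (g @ g) / (g @ A @ g)    # exact line-search step size
        p_next = p - s * g
        print(f"step {k}: x ratio = {p_next[0] / p[0]:+.6f}, "
              f"(b-1)/(b+1) = {factor:+.6f}")
        p = p_next

zigzag_descent(b=0.01)   # small b: the ratio is near -1, so convergence is slow
```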
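For the remarkable log-determinant function in the last summary item, the standard matrix-calculus identity (stated here as background, not quoted from the lecture) is

\[
\nabla_X \log\det X = \bigl(X^{-1}\bigr)^{\mathsf T},
\]

so for symmetric positive definite \(X\) the gradient is simply \(X^{-1}\).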

Related section in textbook: VI.4

Instructor: Prof. Gilbert Strang

Course Features

Lecture videos
Assignments: problem sets (no solutions)
Audio podcast