Lecture 22: Gradient Descent: Downhill to a Minimum
Description
Gradient descent is the most common optimization algorithm in deep learning and machine learning. It uses only the first derivative (the gradient) when updating parameters, taking repeated steps downhill until it reaches a local minimum.
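As a minimal sketch of that update rule (an illustration added here, not code from the lecture), the loop below applies \(x_{\text{new}} = x - s\,F'(x)\) with a fixed step size \(s\) to the one-variable function \(F(x) = (x-3)^2\), whose minimum is at \(x = 3\). The function, starting point, and step size are arbitrary choices for the example.

```python
def grad_F(x):
    return 2.0 * (x - 3.0)   # derivative of F(x) = (x - 3)^2

x = 0.0        # starting point (arbitrary)
s = 0.1        # fixed step size (learning rate)

for k in range(50):
    x = x - s * grad_F(x)    # step downhill along the negative gradient

print(x)       # approaches 3.0, the minimizer of F
```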
Summary
Gradient descent: downhill from \(x\) to the new point \(x_{\text{new}} = x - s\,\nabla F(x)\)
Excellent example: \(F(x,y) = \frac{1}{2} (x^2 + by^2)\)
If \(b\) is small, we take a zig-zag path toward \((0, 0)\).
Each step multiplies the error by \((b - 1)/(b + 1)\) (checked numerically in the sketch after this summary).
Remarkable function: logarithm of determinant of \(X\)
Related section in textbook: VI.4
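The zig-zag path and the \((b - 1)/(b + 1)\) factor can be verified numerically. The sketch below (an illustration added here, not code from the course) runs gradient descent with exact line search on \(F(x,y) = \tfrac{1}{2}(x^2 + by^2)\), written as \(F(v) = \tfrac{1}{2} v^{\mathsf T} A v\) with \(A = \mathrm{diag}(1, b)\), from the starting point \((b, 1)\), a common choice for this example. The coordinates flip sign at every step (the zig-zag) and shrink in magnitude by \(|b - 1|/(b + 1)\).

```python
import numpy as np

# Illustrative sketch: gradient descent with exact line search on
# F(v) = 0.5 * v^T A v, A = diag(1, b), starting from v = (b, 1).
b = 0.1                            # a small b makes the zig-zag slow
A = np.diag([1.0, b])
v = np.array([b, 1.0])             # starting point (b, 1)
predicted = abs(b - 1) / (b + 1)   # predicted per-step shrink factor

for k in range(6):
    g = A @ v                          # gradient of F at v
    s = (g @ g) / (g @ (A @ g))        # exact line-search step size
    v_next = v - s * g
    observed = abs(v_next[1] / v[1])   # shrink factor of the y-coordinate
    print(f"step {k}: v = {v_next}, shrink = {observed:.4f} "
          f"(predicted {predicted:.4f})")
    v = v_next
```

A note on the last summary item (added context, not text from the course page): by Jacobi's formula, \(\partial(\log\det X)/\partial x_{ij} = (X^{-1})_{ji}\), so for symmetric positive definite \(X\) the gradient of \(\log\det X\) is simply \(X^{-1}\), and the function is concave on that set.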
Instructor: Prof. Gilbert Strang
As Taught In: Spring 2018
Level: Undergraduate
Course Features
AV lectures - Video
Assignments - problem sets (no solutions)
AV special element audio - Podcast