
Gradient Descent Learning Rates

Figure 1: Gradient descent with different learning rates



Gradient Descent is defined mathematically as:

$$ x_{n+1} = x_n - \alpha \frac{df(x_n)}{dx} $$

where $\alpha$ is the learning rate, a hyperparameter that controls the step size. Choosing a good learning rate is important: a very low value may take a long time to converge, while a very high value may overshoot the minimum and fail to converge at all.

Figure 1 shows the impact of different learning rates on gradient descent.

The program below demonstrates different learning rates for gradient descent using the TensorFlow library.

Implementation of Gradient Descent Learning Rates
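The original listing is not reproduced here; the following is a minimal plain-Python sketch of the same update rule (the post itself used TensorFlow). It minimizes the toy function $f(x) = x^2$ from the hypothetical starting point $x_0 = 5$, and the learning-rate values chosen below are illustrative assumptions, not taken from the post.

```python
def gradient_descent(grad_f, lr, x0=5.0, steps=20):
    """Iterate x_{n+1} = x_n - lr * f'(x_n) for a fixed number of steps."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)
    return x

# f(x) = x^2 has gradient f'(x) = 2x and its minimum at x = 0.
grad = lambda x: 2 * x

for lr in (0.01, 0.1, 0.9, 1.1):
    print(f"lr={lr}: x after 20 steps = {gradient_descent(grad, lr):.4f}")
```

Running the sketch illustrates the behavior described above: lr=0.01 crawls toward the minimum, lr=0.1 converges quickly, lr=0.9 oscillates but still shrinks toward zero, and lr=1.1 diverges with each step.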




Copyright © 2020-2021 Gajanan Bhat. All rights reserved.