Figure 1: Gradient Descent and RMSProp algorithms with and without momentum
- We saw an example of Gradient Descent with momentum optimization. Another algorithm that supports momentum is RMSProp (Root Mean Square Propagation).
- In this example we will use both algorithms, with and without momentum, to find the minimum of a non-convex function.
- We can see that the algorithms with momentum are able to find the global minimum, whereas the ones without momentum reach only a local minimum.
- We can also note that Gradient Descent with momentum converges faster.
- The program below shows the usage of SGD and RMSProp with and without momentum.
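The original program is not reproduced here, so the following is a minimal sketch of the idea. It assumes PyTorch, a hypothetical 1-D non-convex test function `f(x) = x**4 - 4*x**2 + x`, and illustrative hyperparameters (learning rate 0.01, momentum 0.9, start at x = 2.0); whether a given run actually escapes the local minimum depends on those choices, so treat them as a starting point for experimentation.

```python
import torch

# Hypothetical 1-D non-convex test function: a shallow local minimum near x ~ +1.4
# and a deeper global minimum near x ~ -1.45 (the original program's function is not shown).
def f(x):
    return x**4 - 4 * x**2 + x

def run(opt_name, lr, momentum, x0=2.0, steps=500):
    # A single scalar parameter, updated in place by the chosen optimizer.
    x = torch.tensor(x0, requires_grad=True)
    if opt_name == "SGD":
        opt = torch.optim.SGD([x], lr=lr, momentum=momentum)
    else:
        opt = torch.optim.RMSprop([x], lr=lr, momentum=momentum)
    for _ in range(steps):
        opt.zero_grad()
        loss = f(x)
        loss.backward()
        opt.step()
    return x.item(), f(x).item()

# Illustrative hyperparameters only; the outcome is sensitive to the learning rate,
# momentum coefficient, and starting point.
for name in ("SGD", "RMSprop"):
    for momentum in (0.0, 0.9):
        x_final, f_final = run(name, lr=0.01, momentum=momentum)
        print(f"{name:8s} momentum={momentum}: x = {x_final:+.4f}, f(x) = {f_final:+.4f}")
```

Printing the final position and function value for all four configurations makes it easy to compare which runs settle in the local minimum and which reach the deeper one.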