Random notes. Regression-based techniques often involve finding a maximum (e.g., the maximum likelihood) or a minimum (e.g., least squares or mean squared error). Gradient descent is an iterative optimization algorithm used to find the minimum of a function (or gradient ascent to find the maximum).
The update rule for each parameter \(\theta_j\) is:
\[\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)\]
\(\alpha\) is the learning rate (a smaller step size takes more iterations to converge).
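A minimal sketch of this update for least-squares linear regression, where \(J(\theta) = \frac{1}{2m}\sum_i (x^{(i)\top}\theta - y^{(i)})^2\) and the gradient works out to \(\frac{1}{m}X^\top(X\theta - y)\). The function name, default `alpha`, and iteration count are illustrative choices, not part of the notes above:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, n_iters=1000):
    """Minimize J(theta) = (1/2m) * ||X @ theta - y||^2 by gradient descent.

    X       : (m, n) design matrix (include a column of ones for an intercept)
    y       : (m,) target vector
    alpha   : learning rate (step size)
    n_iters : number of update iterations
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        residuals = X @ theta - y       # prediction error for every sample
        grad = X.T @ residuals / m      # dJ/dtheta_j for all j at once
        theta = theta - alpha * grad    # simultaneous update of every theta_j
    return theta

# Toy usage: fit y = 1 + 2x. A larger alpha converges in fewer iterations,
# but too large a value can overshoot the minimum and diverge.
x = np.linspace(0, 1, 50)
X = np.column_stack([np.ones_like(x), x])   # bias column + feature
y = 1.0 + 2.0 * x
print(gradient_descent(X, y, alpha=0.5, n_iters=5000))  # ~ [1., 2.]
```

Note that all components of \(\theta\) are updated simultaneously from the same residuals, matching the partial-derivative rule above applied to every \(\theta_j\) in one vectorized step.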