
Gradient descent

{{see also|Machine learning terms}}
==Introduction==
[[Gradient descent]] is a popular [[optimization algorithm]] in [[machine learning]]. Its goal is to minimize the [[loss]] of the [[model]] during [[training]]. To accomplish this, gradient descent adjusts the [[weights]] and [[biases]] of the model during each [[training]] [[iteration]].


==How Gradient Descent Works==
Gradient descent works by iteratively altering the [[parameters]] of a model in the direction of steepest descent of the [[cost function]], which measures how well the model is performing. The goal of gradient descent is to find the parameters that minimize this cost function.
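In symbols (using an illustrative notation that the article itself does not fix, with <math>\theta</math> for the parameters, <math>J</math> for the cost function and <math>\eta</math> for the learning rate), a single update step can be written as:

:<math>\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} J(\theta_t)</math>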


The algorithm begins with an initial set of parameters and iteratively updates them until it reaches a minimum of the cost function. At each iteration, the gradient of the cost function is computed with respect to those parameters; this gradient is a vector pointing in the direction of steepest increase of the cost function. To minimize the cost function, the parameters are therefore updated in the opposite direction.
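
As a concrete illustration, the following minimal sketch applies this procedure to a linear model trained with a mean-squared-error cost. The synthetic data, the learning rate of 0.1, the iteration count and the function name are assumptions made for the example, not anything prescribed above.

<syntaxhighlight lang="python">
import numpy as np

def gradient_descent(X, y, learning_rate=0.1, iterations=500):
    """Minimize the mean-squared error of a linear model y = X @ w + b."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)  # initial weights
    b = 0.0                   # initial bias

    for _ in range(iterations):
        predictions = X @ w + b
        error = predictions - y

        # Gradient of the MSE cost with respect to the weights and bias
        grad_w = (2.0 / n_samples) * (X.T @ error)
        grad_b = (2.0 / n_samples) * np.sum(error)

        # Step in the direction opposite to the gradient
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    return w, b

# Illustrative data generated from y = 3x + 1 with a little noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3 * X[:, 0] + 1 + rng.normal(scale=0.1, size=200)

w, b = gradient_descent(X, y)
print(w, b)  # the recovered weight and bias should approach 3.0 and 1.0
</syntaxhighlight>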