Iteration: Difference between revisions

m (Text replacement - " stability " to " stability ")
Line 18:
#[[Stochastic gradient descent]] (SGD): when each iteration uses only one [[example]] of the [[training data]]. After processing that single example, the model updates its weights and biases. Each update is fast, but a gradient estimated from one example is noisy, so SGD can be [[unstable]].
#[[Mini-batch gradient descent]]: when each iteration uses a randomly chosen subset of the training data (a mini-batch) to balance speed of [[convergence]] with [[stability]] in the optimization process.
#[[Batch gradient descent]]: when each iteration uses all of the training data. This form of gradient descent offers [[stability]] but may be computationally expensive for large datasets.
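
The three variants listed above differ only in how many examples feed each iteration, which the following minimal Python sketch tries to make concrete. Everything in it is an illustrative assumption rather than part of the article: the function name <code>gradient_descent</code>, the toy data set, the learning rate, and the choice to sample without replacement.

```python
import random


def gradient_descent(xs, ys, batch_size, learning_rate=0.05,
                     iterations=2000, seed=0):
    """Fit y = w*x + b by gradient descent on mean squared error.

    batch_size == 1        -> stochastic gradient descent
    1 < batch_size < n     -> mini-batch gradient descent
    batch_size == n        -> batch gradient descent
    """
    rng = random.Random(seed)
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(iterations):
        # One iteration: pick the examples for this update, then update once.
        # (Sampling without replacement is an assumption of this sketch.)
        idx = rng.sample(range(n), batch_size)
        errors = [(w * xs[i] + b) - ys[i] for i in idx]
        grad_w = sum(2 * e * xs[i] for e, i in zip(errors, idx)) / batch_size
        grad_b = sum(2 * e for e in errors) / batch_size
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical noise-free toy data generated from y = 3x + 1.
xs = [i / 10 for i in range(20)]
ys = [3 * x + 1 for x in xs]

sgd = gradient_descent(xs, ys, batch_size=1)          # 1 example per iteration
mini = gradient_descent(xs, ys, batch_size=5)         # random subset
full = gradient_descent(xs, ys, batch_size=len(xs))   # all training data
```

On this noise-free toy problem all three settings should recover a slope near 3 and an intercept near 1. The difference shows up in cost per iteration (1, 5, or 20 examples here) and, on noisy data, in how much each update fluctuates.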


==Explain Like I'm 5 (ELI5)==