==The Need for Clipping==
Machine learning algorithms like [[stochastic gradient descent]] (SGD) are commonly employed to update the weights of a neural network during training. SGD works by computing the [[gradient]] of the [[loss function]] with respect to the network's weights and adjusting them in the direction of the negative gradient in order to minimize the [[loss]]. However, if this gradient is very large, it can cause the weights to change abruptly, leading to unstable behavior.
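A minimal sketch of a single SGD update step illustrates the problem (the learning rate, weights, and gradient values here are illustrative, not from any particular model):

<syntaxhighlight lang="python">
import numpy as np

learning_rate = 0.1
weights = np.array([0.5, -0.3])

# A well-behaved gradient produces a small, controlled update.
small_grad = np.array([0.2, -0.1])
stable_weights = weights - learning_rate * small_grad
print(stable_weights)    # [ 0.48 -0.29]  -- small change

# A very large gradient moves the weights enormously in one step.
large_grad = np.array([500.0, -800.0])
unstable_weights = weights - learning_rate * large_grad
print(unstable_weights)  # [-49.5  79.7]  -- weights blow up
</syntaxhighlight>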
This problem is especially prevalent in [[deep neural network]]s, which can have millions of weights, making them difficult to keep under control during training. Clipping offers a straightforward solution: it restricts the range of the values and prevents them from growing too large.
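A minimal sketch of two common clipping strategies, assuming an illustrative threshold of 1.0 (the values below are hypothetical):

<syntaxhighlight lang="python">
import numpy as np

threshold = 1.0
large_grad = np.array([500.0, -800.0])

# Element-wise (value) clipping: bound each component to [-threshold, threshold].
value_clipped = np.clip(large_grad, -threshold, threshold)
print(value_clipped)  # [ 1. -1.]

# Norm clipping: rescale the whole vector so its L2 norm does not
# exceed the threshold, which preserves the gradient's direction.
norm = np.linalg.norm(large_grad)
norm_clipped = large_grad * (threshold / norm) if norm > threshold else large_grad
print(norm_clipped)   # [ 0.53 -0.848] approximately
</syntaxhighlight>

Norm clipping is often preferred for gradients because it keeps the update pointing the same way while shrinking its magnitude, whereas element-wise clipping can distort the direction.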