{{see also|Machine learning terms}}

==Introduction==
Mini-batch training is a machine learning technique for training models on large datasets efficiently. The dataset is divided into smaller batches, which allows for faster training as well as improved convergence of the model toward its optimal solution.

==Theoretical Background==
Traditional machine learning relies on batch gradient descent, which trains the model on all of the data in a single iteration. Unfortunately, when the dataset grows large, computing every update over the entire dataset becomes slow and memory-intensive, which motivates splitting the data into mini-batches.
==Advantages==
Mini-batch training has several advantages over batch training, which involves running the model on all of the data at once. Some of these advantages include:
===Faster convergence===
Mini-batch training typically converges faster than batch training because the model parameters are updated more frequently: every mini-batch produces an update, rather than one update per pass over the data. At the same time, gradients calculated on a mini-batch are more representative of the entire dataset than gradients calculated from individual data points, so the updates are less noisy than in purely single-example training.
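The loop below is a minimal sketch of mini-batch gradient descent for a simple linear-regression model, using only NumPy; the data, learning rate, batch size, and epoch count are illustrative assumptions rather than values from any particular system.
<syntaxhighlight lang="python">
import numpy as np

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=1000)

w = np.zeros(5)                      # model parameters
lr, batch_size, epochs = 0.1, 32, 5  # illustrative hyperparameters

for epoch in range(epochs):
    perm = rng.permutation(len(X))   # reshuffle the data each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Gradient of mean squared error, computed on the mini-batch only.
        grad = 2.0 * Xb.T @ (Xb @ w - yb) / len(idx)
        # One parameter update per mini-batch: 32 updates per epoch here,
        # versus a single update per epoch for full-batch gradient descent.
        w -= lr * grad
</syntaxhighlight>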
===Less memory usage===
Mini-batch training uses less memory than full-batch training, as only a portion of the dataset is loaded into memory at once. This enables larger models to be trained using limited computational resources.
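As an illustration of this point, the sketch below streams mini-batches from a memory-mapped array on disk, so only one batch is materialised in RAM at a time. The file name features.dat, the array shape, and the overall setup are hypothetical.
<syntaxhighlight lang="python">
import numpy as np

def iter_minibatches(path, n_rows, n_cols, batch_size=64, dtype=np.float32):
    """Yield successive mini-batches from a large array stored on disk."""
    data = np.memmap(path, dtype=dtype, mode="r", shape=(n_rows, n_cols))
    for start in range(0, n_rows, batch_size):
        # Copying the slice materialises only batch_size rows in memory.
        yield np.array(data[start:start + batch_size])

# Write a small demo file up front purely so the example runs end to end.
demo = np.memmap("features.dat", dtype=np.float32, mode="w+", shape=(1000, 8))
demo[:] = np.random.default_rng(0).normal(size=(1000, 8))
demo.flush()

for batch in iter_minibatches("features.dat", n_rows=1000, n_cols=8):
    pass  # a real training step would consume `batch` here
</syntaxhighlight>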
===Improved generalization===
Mini-batch training can improve generalization performance, as the model is exposed to a wide variety of example combinations during training, making it more likely to perform well on new, unseen data.
===Better optimization===
Mini-batch training can aid the optimization process, as the stochastic nature of gradients computed from mini-batches can help the model escape local minima and find more favorable solutions.
==Disadvantages==
Mini-batch training has its advantages, but also some drawbacks, such as:
===Hyperparameter tuning===
The mini-batch size is an important hyperparameter that must be tuned for good performance. This can be a laborious process requiring extensive experimentation.
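A minimal, self-contained sketch of what that experimentation can look like: a grid search over candidate batch sizes on a synthetic linear-regression problem, comparing validation error. The data, candidate sizes, learning rate, and epoch count are all illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=2000)
X_train, y_train = X[:1500], y[:1500]
X_val, y_val = X[1500:], y[1500:]

def train_and_evaluate(batch_size, lr=0.05, epochs=5):
    """Train with mini-batch gradient descent and return validation MSE."""
    w = np.zeros(X_train.shape[1])
    for _ in range(epochs):
        perm = rng.permutation(len(X_train))
        for start in range(0, len(X_train), batch_size):
            idx = perm[start:start + batch_size]
            grad = 2.0 * X_train[idx].T @ (X_train[idx] @ w - y_train[idx]) / len(idx)
            w -= lr * grad
    return np.mean((X_val @ w - y_val) ** 2)

for batch_size in (16, 32, 64, 128, 256):
    print(batch_size, train_and_evaluate(batch_size))
</syntaxhighlight>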
===Noise===
Mini-batch training can be noisy, because gradients computed from a mini-batch only approximate the gradient over the full dataset; this may lead to oscillations in the learning process and slower convergence.
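To make this noise concrete, the sketch below compares mini-batch gradient estimates against the exact full-dataset gradient on a synthetic linear-regression problem; the average deviation shrinks roughly with the square root of the batch size. All names and numbers are illustrative assumptions.
<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 5))
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=10_000)
w = np.zeros(5)                                 # evaluate gradients at this point

full_grad = 2.0 * X.T @ (X @ w - y) / len(X)    # exact full-batch gradient

for batch_size in (1, 8, 64, 512):
    errors = []
    for _ in range(200):                        # sample many mini-batch gradients
        idx = rng.choice(len(X), size=batch_size, replace=False)
        g = 2.0 * X[idx].T @ (X[idx] @ w - y[idx]) / batch_size
        errors.append(np.linalg.norm(g - full_grad))
    # Smaller batches give noisier estimates of the true gradient.
    print(batch_size, round(float(np.mean(errors)), 3))
</syntaxhighlight>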
===Hardware requirements===
Mini-batch training requires more CPU/GPU memory than training on a single data point, making it difficult to train large models with limited hardware resources.