Hyperparameter: Difference between revisions


Revision as of 12:25, 24 February 2023

45 bytes added, 24 February 2023

no edit summary


@@ Line 4: / Line 4: @@
 ==Definition==
-Hyperparameters are parameters set before training a machine learning model that influence its behavior and performance. Unlike regular parameters, which are learned from data during training, hyperparameters must be set by an outside party and may significantly impact the final result of the model.
+Hyperparameters are parameters set before training a [[machine learning model]] that influence its behavior and performance. Unlike regular parameters ([[weights]] and [[biases]]), which are learned from data during [[training]], hyperparameters must be set by an outside party and may significantly impact the final result of the model.
 ==Examples==
 Some common hyperparameters in machine learning include:
-* Learning rate: This hyperparameter controls the step size used to update parameters in a model during training. A high learning rate may cause the model to converge quickly, but may also overshoot its optimal solution and produce suboptimal performance. Conversely, a low learning rate could cause slow convergence or lead to suboptimal solutions being found.
+*[[Learning rate]]: This hyperparameter controls the step size used to update parameters in a model during training. A high learning rate may cause the model to converge quickly, but may also overshoot its optimal solution and produce suboptimal performance. Conversely, a low learning rate could cause slow convergence or lead to suboptimal solutions being found.
-* Number of Hidden Layers: This hyperparameter determines the number of layers in a neural network model. A deep network with many hidden layers can capture complex features and patterns in data, but may also be susceptible to overfitting. On the other hand, a shallow network with few hidden layers may be easier to train but may not capture all pertinent information present in the dataset.
+*Number of [[Hidden Layer]]s: This hyperparameter determines the number of layers in a neural network model. A deep network with many hidden layers can capture complex features and patterns in data, but may also be susceptible to overfitting. On the other hand, a shallow network with few hidden layers may be easier to train but may not capture all pertinent information present in the dataset.
-* Regularization Strength: This hyperparameter determines the strength of a penalty term used to prevent overfitting in a model. A high regularization strength can help avoid this problem, but may also lead to underfitting the training data. On the other hand, low regularization strengths may provide good fit with training data but may not generalize well to new data sources.
+*[[Regularization]] Strength: This hyperparameter determines the strength of a penalty term used to prevent overfitting in a model. A high regularization strength can help avoid this problem, but may also lead to underfitting the training data. On the other hand, low regularization strengths may provide good fit with training data but may not generalize well to new data sources.
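The effect of two of the hyperparameters above can be illustrated with a minimal sketch. It is a toy setup chosen only for illustration, not taken from the article: the one-dimensional objective (w - 3)^2 + penalty * w^2, the function name gradient_descent, and all numeric values are assumptions. The learning rate and the penalty strength are fixed before the loop runs, while the weight w is the parameter that is learned.

```python
def gradient_descent(learning_rate, penalty=0.0, steps=50, w=0.0):
    """Minimise (w - 3)^2 + penalty * w^2 with plain gradient descent.

    learning_rate and penalty are hyperparameters (fixed before training);
    w is the learned parameter. The objective is a hypothetical toy example.
    """
    for _ in range(steps):
        # Gradient of the data term plus gradient of the L2 penalty term.
        grad = 2.0 * (w - 3.0) + 2.0 * penalty * w
        w -= learning_rate * grad
    return w


if __name__ == "__main__":
    # Without a penalty the minimum is at w = 3.
    # A small step converges slowly, a moderate step converges quickly,
    # and a step that is too large overshoots further on every update.
    for lr in (0.01, 0.1, 1.1):
        print(f"learning_rate={lr}: w={gradient_descent(lr):.4f}")

    # A stronger penalty pulls the learned weight away from 3 and toward 0.
    for penalty in (0.0, 1.0, 4.0):
        print(f"penalty={penalty}: w={gradient_descent(0.1, penalty):.4f}")
```

In this sketch, too large a learning rate makes the iterate move away from the minimum instead of toward it, and increasing the penalty shrinks the learned weight, which mirrors the overshooting and underfitting trade-offs described in the list.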
 ==Optimization==