Generalization curve: Difference between revisions

The generalization curve can be used to identify the model complexity that best balances the bias-variance tradeoff. A model with low complexity has high bias but low variance, while one with high complexity has low bias but high variance. The optimal level of complexity is the point on the generalization curve where [[validation error]] is lowest.
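
Picking the point of lowest validation error can be sketched in a few lines of Python. This is only an illustration: the helper name `best_complexity`, the use of polynomial degree as the complexity measure, and the error values are all assumptions made up for the example, not measurements from a real model.

```python
from typing import Sequence


def best_complexity(complexities: Sequence[int], val_errors: Sequence[float]) -> int:
    """Return the complexity whose validation error is lowest."""
    if len(complexities) != len(val_errors) or not complexities:
        raise ValueError("need two non-empty sequences of equal length")
    # Index of the minimum of the validation-error curve.
    best_index = min(range(len(val_errors)), key=val_errors.__getitem__)
    return complexities[best_index]


# Hypothetical curve: polynomial degree vs. validation error (made-up numbers).
degrees = [1, 2, 3, 4, 5, 6]
val_errors = [0.42, 0.31, 0.24, 0.26, 0.33, 0.47]

chosen = best_complexity(degrees, val_errors)
print(f"Lowest validation error at degree {chosen}")
```

On noisy curves a single minimum can be misleading, so in practice the validation error is often averaged over several folds or random seeds before taking the minimum.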


The generalization curve can also be used to detect overfitting, which occurs when a model exhibits low [[training error]] but high validation error. Overfitting may be caused by a model that is too complex or by a dataset that is too small. By recognizing the point at which the model starts to overfit, you can switch to a simpler model or collect more data.
The generalization curve can be used to detect [[overfitting]] and [[underfitting]]. Overfitting occurs when the model is too complex: it fits the training data very closely but fails to generalize to new data, so it performs well on the training set but poorly on the validation set. Conversely, underfitting occurs when the model has too little complexity to fit even the training data well, leading to poor performance on both the training set and the validation set.
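
The distinction between the two failure modes can be expressed as a rough rule of thumb in Python. The function name `diagnose_fit` and both thresholds (`high_error`, `max_gap`) are assumptions chosen for illustration; sensible values depend entirely on the task and the error metric.

```python
def diagnose_fit(train_error: float, val_error: float,
                 high_error: float = 0.30, max_gap: float = 0.10) -> str:
    """Classify one point on a generalization curve.

    high_error and max_gap are hypothetical thresholds, not standard values.
    """
    # Underfitting: the model cannot fit even the training data well.
    if train_error >= high_error:
        return "underfitting"
    # Overfitting: low training error, but validation error is much higher.
    if val_error - train_error > max_gap:
        return "overfitting"
    return "good fit"


print(diagnose_fit(0.40, 0.45))  # high error on both sets
print(diagnose_fit(0.02, 0.35))  # large gap between the two errors
print(diagnose_fit(0.10, 0.15))  # low error and small gap
```

Checking the training error first means that a model with high error on both sets is reported as underfitting even if there is also a gap between the two errors.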


==Explain Like I'm 5 (ELI5)==