Bias (ethics/fairness)

Biases can arise during the creation and deployment of machine learning models.


1. Data Bias: This occurs when the [[training data]] used to develop a model is not representative of the population it will be applied to. For instance, a model trained mostly on images of lighter-skinned people may perform poorly when applied to images of people with darker skin tones.
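A simple first check for this kind of data bias is to measure how each group is represented in the training set. The sketch below uses only the Python standard library; the group labels and counts are hypothetical.

```python
from collections import Counter

def group_shares(samples):
    """Return the fraction of samples belonging to each group label."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}

# Toy dataset where group "A" dominates the training data.
training_groups = ["A"] * 90 + ["B"] * 10
print(group_shares(training_groups))  # {'A': 0.9, 'B': 0.1}
```

A large imbalance here does not prove the model will be biased, but it flags groups on which the model has seen little data.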


2. Algorithm Bias: This occurs when the algorithm used to train a model skews it towards certain outcomes or predictions. For instance, if a model is trained to predict loan defaults on data containing a disproportionately high number of defaults among certain racial groups, it may predict defaults for members of those groups more often than for comparable individuals from other groups.
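One way to detect this kind of skew is to compare the rate of positive (e.g. "will default") predictions across groups. The following is a minimal sketch with hypothetical predictions and group labels.

```python
def positive_rate(predictions, groups, target_group):
    """Fraction of target_group members receiving a positive (1) prediction."""
    member_preds = [p for p, g in zip(predictions, groups) if g == target_group]
    return sum(member_preds) / len(member_preds)

# Toy output: 1 = "will default". The model flags group B far more often.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
preds  = [ 0,   0,   0,   1,   1,   1,   1,   0 ]
print(positive_rate(preds, groups, "A"))  # 0.25
print(positive_rate(preds, groups, "B"))  # 0.75
```

A large gap between groups is a signal worth investigating, though equal rates alone do not guarantee the model is fair.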
There are various techniques available to reduce bias in machine learning:


1. Data Preprocessing: One way to reduce data bias is to preprocess the training data so it more accurately represents the population the model will be applied to. This may involve techniques such as [[oversampling]] or [[undersampling]] particular groups, or generating synthetic examples in order to create a more diverse training set.
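Random oversampling, the simplest of these techniques, duplicates examples from the under-represented class until the classes are the same size. The sketch below assumes a dataset of (features, label) pairs with hypothetical labels; production resamplers are more sophisticated.

```python
import random

def oversample(dataset, minority_label, seed=0):
    """Randomly duplicate minority-class examples until classes are balanced.

    dataset is a list of (features, label) pairs.
    """
    rng = random.Random(seed)
    minority = [ex for ex in dataset if ex[1] == minority_label]
    majority = [ex for ex in dataset if ex[1] != minority_label]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra

# Toy loan data: 5 defaults vs. 95 repaid loans.
data = [(i, "default") for i in range(5)] + [(i, "repaid") for i in range(95)]
balanced = oversample(data, "default")
print(len(balanced))  # 190 examples, 95 of each class
```

The trade-off is that duplicated examples add no new information, so heavy oversampling can encourage overfitting to the few minority examples.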


2. Fairness Constraints: Another solution is to incorporate fairness constraints into the model training process, for example by requiring that predictions are not unduly influenced by certain sensitive attributes (such as race or gender).
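One common quantity that such constraints target is the demographic parity gap: the difference in positive-prediction rates between groups. A minimal sketch of measuring it, with hypothetical predictions, a hypothetical 0.1 tolerance, and no real training loop:

```python
def demographic_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rates between any two groups."""
    rates = {}
    for g in set(groups):
        member_preds = [p for p, gr in zip(predictions, groups) if gr == g]
        rates[g] = sum(member_preds) / len(member_preds)
    return max(rates.values()) - min(rates.values())

# Toy model output and group labels.
preds  = [1, 0, 1, 0, 1, 0, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups)
print(round(gap, 2))  # 0.25

# Used as a constraint during model selection, e.g. reject models with gap > 0.1.
print(gap <= 0.1)  # False: this model would be rejected
```

In practice such constraints are enforced during optimization (e.g. as penalty terms or constrained objectives) rather than only checked afterwards, but the measured quantity is the same.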