Z-score normalization: Difference between revisions

Z-score normalization (view source)

Revision as of 21:37, 22 February 2023

30 bytes added , 22 February 2023

no edit summary

Interface administrators, Administrators (Semantic MediaWiki), Curators (Semantic MediaWiki), Editors (Semantic MediaWiki), Suppressors, Administrators

7,785

edits

@@ Line 1: / Line 1: @@
 {{see also|Machine learning terms}}
 ==Introduction==
-Z-score normalization is a type of data scaling that transforms data values to have a mean of zero and standard deviation of one. This transformation occurs by subtracting the mean from each value and dividing by its standard deviation. The results are known as Z-scores, which indicate how far away from the mean each data point is.
+[[Z-score normalization]] is a type of [[data scaling]] that transforms [[data]] values to have a [[mean]] of zero and [[standard deviation]] of one. This transformation occurs by subtracting the mean from each value and dividing by its standard deviation. The results are known as [[Z-score]]s, which indicate how far away from the mean each data point is.
-Data normalization in machine learning is a critical preprocessing step that helps boost the performance of many algorithms. Normalization involves scaling data to a specified range or distribution to reduce the impact of differences in scale or units of features.
+Data [[normalization]] in [[machine learning]] is a critical preprocessing step that helps boost the performance of many [[algorithm]]s. Normalization involves scaling data to a specified range or distribution to reduce the impact of differences in scale or units of [[feature]]s.
 ==Example==
@@ Line 23: / Line 23: @@
 ==Why is Z-score normalization used?==
-Z-score normalization is a technique commonly used in machine learning to address the issue of feature scaling. When features in a dataset have different scales or units, it can cause issues for certain machine learning algorithms that rely on distance-based calculations such as k-nearest neighbors (KNN) or support vector machines (SVM), which require equal weighting across all features in the analysis. With Z-score normalization, however, we can standardize these dimensions so that each contributes equally to our analysis.
+Z-score normalization is a technique commonly used in machine learning to address the issue of [[feature scaling]]. When features in a dataset have different scales or units, it can cause issues for certain machine learning algorithms that rely on distance-based calculations such as [[k-nearest neighbors]] (KNN) or [[support vector machine]]s (SVM), which require equal weighting across all features in the analysis. With Z-score normalization, however, we can standardize these dimensions so that each contributes equally to our analysis.
 ==How is Z-score normalization performed?==
 Z-score normalization is a straightforward formula that can be applied to each feature within an array. It consists of:
-$z = (x - mu) / sigma$
+Z = (x - µ) / σ
-Where: ($z$ is the Z-score for a particular data value); ($x$ is its original data value); and ($mu$ stands for mean of all features in that feature; while $sigma$ stands for standard deviation of those data values).
+*Z is the Z-score for a particular data value
+*x is its original data value
+*µ stands for mean of all data values in that feature
+*σ stands for standard deviation of those data values for the feature
 To apply Z-score normalization to a dataset, we must perform the following steps:
-. Calculate the mean and standard deviation for each feature in the dataset.
+#Calculate the mean and standard deviation for each feature in the [[dataset]].
-. For each data value within a feature, subtract its mean value and divide by its standard deviation.
+#For each data value within a feature, subtract its mean value and divide by its standard deviation.
-. These values correspond to Z-scores for each data point.
+#These values correspond to Z-scores for each data point.
 ==Example==