==Introduction==
In [[machine learning]], categorical data represents qualitative or nominal [[features]] rather than numerical or [[continuous features]]. It often describes attributes or characteristics of objects or events that cannot be measured numerically. Categorical data plays an essential role in many machine learning tasks such as [[classification]], [[clustering]] and [[regression]].
Categorical data is sometimes known as [[discrete feature]]s.
==Types of Categorical Data==
==Representation of Categorical Data==
Categorical data is commonly represented with [[one-hot encoding]], which transforms each category value into a binary vector of 0s and 1s. Each vector has the same length as the number of categories, with a 1 at the position corresponding to that category and 0s everywhere else. For instance, for a categorical variable representing car colors (red, blue, and green), one-hot encoding could read red = [1, 0, 0], blue = [0, 1, 0], and green = [0, 0, 1].
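The car color example can be sketched in a few lines of plain Python. This is only a minimal illustration, not a reference implementation: the function name <code>one_hot_encode</code> and the fixed category order (red, blue, green) are assumptions made for this example, and in practice a library encoder would usually be used instead.

```python
def one_hot_encode(values, categories):
    """Return one binary vector per value, given a fixed category order.

    Hypothetical helper for illustration; `categories` defines which
    position in the vector belongs to which category.
    """
    # Map each category to its position in the output vector.
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        # Start with all zeros, then set the single matching position to 1.
        vector = [0] * len(categories)
        vector[index[value]] = 1
        encoded.append(vector)
    return encoded


# Category order assumed for this example.
colors = ["red", "blue", "green"]
encoded = one_hot_encode(["red", "blue", "green"], colors)
```

Here <code>encoded</code> holds one vector per input value, each with the same length as the number of categories. A value that is not in <code>categories</code> raises a <code>KeyError</code> in this sketch; how unseen categories should be handled is a design choice left open here.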
==Applications of Categorical Data==