{{see also|Machine learning terms}}
==Introduction==
In [[machine learning]], categorical data represents qualitative or nominal [[feature]]s rather than numerical or [[continuous feature]]s. It typically describes attributes or characteristics of objects or events that cannot be measured on a numeric scale. Categorical data plays an essential role in many machine learning tasks such as [[classification]], [[clustering]] and [[regression]].
Categorical features are sometimes called [[discrete feature]]s.
==Types of Categorical Data==
==Representation of Categorical Data==
Categorical data is commonly represented with [[one-hot encoding]], which transforms each category value into a binary vector of 0s and 1s. Each vector has the same length as the number of categories, with a 1 at the position corresponding to that category and 0s everywhere else. For instance, for a categorical variable representing car colors (red, blue, and green), one-hot encoding could read red = [1, 0, 0], blue = [0, 1, 0], and green = [0, 0, 1].
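The car-color example can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper name <code>one_hot_encode</code> is hypothetical, and ordering the categories by first appearance is an assumption made here so that the output matches the vectors given above.

```python
def one_hot_encode(values):
    """Return (categories, vectors) for a list of category labels.

    Hypothetical helper for illustration only. Categories are ordered by
    first appearance, which is an assumption of this sketch.
    """
    # dict.fromkeys keeps insertion order and drops repeated labels.
    categories = list(dict.fromkeys(values))
    position = {category: i for i, category in enumerate(categories)}

    vectors = []
    for value in values:
        # Start from all zeros, then set the single slot for this category.
        vector = [0] * len(categories)
        vector[position[value]] = 1
        vectors.append(vector)
    return categories, vectors


colors = ["red", "blue", "green", "blue"]
categories, vectors = one_hot_encode(colors)
print(categories)  # ['red', 'blue', 'green']
print(vectors)     # [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
```

Every vector has as many entries as there are distinct categories, so a variable with many distinct values produces correspondingly long vectors.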
==Applications of Categorical Data==
==Explain Like I'm 5 (ELI5)==
Categorical data is like different kinds of candy. Nominal candy is like the different colors of M&M's, which have no order; ordinal candy comes in sizes that run from small to large. Computer programs use categorical data to work with things that cannot be measured with numbers, for example to decide what something is, to group similar items together, or to estimate how much something costs based on other similar things.
[[Category:Terms]] [[Category:Machine learning terms]] [[Category:not updated]]