Unsupervised machine learning: Difference between revisions

 
{{see also|Machine learning terms}}
==Introduction==
Machine learning is a subset of artificial intelligence (AI) that allows computer programs to learn from data without being explicitly programmed. Many machine learning models are trained using labeled data, that is, data that has been classified or annotated according to certain criteria. However, not all data is labeled, and labeling it manually is sometimes impractical; in such cases, unsupervised machine learning can be used instead to uncover patterns and relationships within the data.
[[Unsupervised machine learning]] or '''unsupervised training''' is a type of [[machine learning]] in which the [[model]] is [[trained]] using [[unlabeled data]]. Unsupervised learning aims to recognize patterns or relationships in the data without prior knowledge of its [[label]]s or categories. It is especially useful when there is no preexisting knowledge about the data and manual labeling would be too time-consuming, costly, or impossible.


==What is Unsupervised Machine Learning?==
Unsupervised learning involves giving a model a set of data points and asking it to discover structure or relationships among them. Without any prior knowledge, the model must find patterns on its own. Furthermore, there is no feedback regarding the [[accuracy]] of its predictions, since there are no labels with which to compare them.


The counterpart of unsupervised machine learning is [[supervised machine learning]], in which the model is trained using labeled data.


==Types of Unsupervised Machine Learning==
Unsupervised machine learning typically relies on two primary families of techniques: clustering and dimensionality reduction.


===Clustering===
Clustering is an unsupervised learning technique used to group similar data points together. The objective of clustering is to discover natural groupings within the data. Clustering can be beneficial for tasks such as customer segmentation, anomaly detection and image segmentation.


Clustering algorithms include k-means, hierarchical clustering, and density-based clustering. K-means is one of the most popular: it partitions the data into k clusters, each represented by the mean (centroid) of the points assigned to it. Hierarchical clustering builds a treelike structure of clusters, with a root node representing all data points and leaf nodes representing individual points. Density-based clustering identifies high-density regions in the data and treats each as a cluster.
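
The k-means procedure mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name <code>kmeans</code>, the <code>iterations</code> and <code>seed</code> parameters, and the sample points are all assumptions made for this example. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points, stopping when the assignments no longer change.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Minimal k-means sketch: alternate assignment and centroid update."""
    rng = random.Random(seed)
    # Assumption: initialise centroids from k randomly chosen data points.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave a centroid unchanged if its cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied at any point; the grouping comes only from the distances between points. Which group receives label 0 and which receives label 1 depends on the random initialisation.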


===Dimensionality Reduction===
Dimensionality reduction is an unsupervised learning technique used to reduce the number of features in data. The objective is to simplify the data while preserving as much of the relevant information as possible. Dimensionality reduction can be beneficial for tasks such as data visualization, noise reduction, and feature extraction.
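
The text above does not name a specific method, so the following sketch assumes principal component analysis (PCA) as one common example of dimensionality reduction. It reduces two features to one by projecting the data onto the direction of greatest variance. To stay self-contained it finds that direction with power iteration on the covariance matrix instead of a full eigendecomposition; the function name <code>first_principal_component</code> and the sample rows are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Sketch of PCA reduced to one component, using power iteration."""
    n, d = len(rows), len(rows[0])
    # Centre each feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # normalise, which converges towards the direction of greatest variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project every centred row onto that direction: d features become 1.
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected


# Hypothetical sample data: the second feature is roughly twice the first.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, projected = first_principal_component(rows)
print(direction)
print(projected)
```

Because the two features are almost perfectly correlated in this sample, a single projected value per row retains nearly all of the variation in the original two columns.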