Novelty detection is a branch of machine learning concerned with identifying test data that differ in some meaningful way from the data available during training. Unlike standard supervised learning tasks where both normal and abnormal classes are well-represented, novelty detection typically operates in a setting where only data from the "normal" class is available at training time. When a new observation arrives, the system must determine whether it belongs to the same distribution as the training data or represents something previously unseen. The problem is sometimes referred to as one-class classification, since the learning algorithm builds a model describing only the normal class and flags anything that falls outside that model's boundary as novel.
Novelty detection plays a central role in fields where abnormal events are rare, expensive to label, or inherently unpredictable. Examples include detecting previously unknown cyber attacks in network traffic, identifying manufacturing defects that have never been seen on a production line, and spotting early signs of disease from patient monitoring data. Because the system trains on clean, normal data and then evaluates new observations against that learned baseline, novelty detection is classified as a form of semi-supervised anomaly detection.
Imagine you have a toy box full of red balls. You play with those red balls every day, so you know exactly what a red ball looks like. One day, someone drops a blue cube into your toy box. You would notice right away that the blue cube does not look like any of your red balls. Novelty detection works the same way for computers: the computer first learns what "normal" looks like by studying lots of normal examples, and then whenever something new and different shows up, it raises a flag and says, "Hey, this does not match what I have seen before!"
The terms novelty detection, anomaly detection, and outlier detection are often used interchangeably in casual discussion, but they carry distinct meanings in the machine learning literature. Understanding the differences is important for choosing the right algorithm and evaluation strategy.
| Aspect | Novelty detection | Outlier detection | Anomaly detection |
|---|---|---|---|
| Training data | Clean, uncontaminated (no anomalies) | May contain outliers that need to be identified | Varies; may or may not contain anomalies |
| Task timing | Detects anomalies in new, unseen data after training | Identifies anomalies within the existing training dataset | General umbrella term covering both settings |
| Learning paradigm | Semi-supervised (one-class) | Unsupervised learning | Supervised, semi-supervised, or unsupervised |
| Anomaly clustering | Novel points can form dense clusters, as long as they fall in a low-density region of the training distribution | Outliers cannot form dense clusters; they must lie in low-density regions globally | Depends on the specific method |
| Typical use case | Monitoring a deployed system for new types of failures | Cleaning a dataset by removing erroneous or extreme values | Broad term for any task that separates normal from abnormal |
In the scikit-learn documentation, novelty detection is described as semi-supervised anomaly detection, while outlier detection is described as unsupervised anomaly detection. The practical difference comes down to whether the training set is assumed to be free of anomalies (novelty detection) or whether anomalies may already be present in the training set (outlier detection).
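The difference is easy to see in code. Below is a minimal sketch using scikit-learn's LocalOutlierFactor in both modes; the synthetic data and parameter values are purely illustrative.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(42)
X_train = rng.normal(0.0, 1.0, size=(200, 2))   # assumed clean "normal" data
X_new = np.array([[0.1, -0.2], [4.0, 4.0]])     # one normal-looking point, one far away

# Novelty detection (semi-supervised): fit on clean data, score unseen points.
nov = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)
print(nov.predict(X_new))  # 1 = inlier, -1 = novelty

# Outlier detection (unsupervised): labels refer to the training set itself.
out = LocalOutlierFactor(n_neighbors=20)
labels = out.fit_predict(X_train)  # -1 marks outliers within X_train
```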
Novelty detection is closely related to two other problems in modern machine learning: open-set recognition and out-of-distribution detection.
In novelty detection (also called one-class classification), the model is trained on data from a single normal class and must decide whether a new sample belongs to that class or not. Open-set recognition extends this to multiple known classes: the model is trained on K classes and must reject samples that do not belong to any of those K classes while still correctly classifying samples from the known classes. Out-of-distribution (OOD) detection addresses the same goal as open-set recognition but is typically studied in the context of deep neural networks and uses confidence scores or other signals from a pre-trained classifier to flag inputs that differ from the training distribution.
Novelty detection can be viewed as an extreme case of open-set recognition where K equals 1. All three problems share the goal of deciding whether a sample comes from the distribution seen during training, but they differ in the number of known classes and the types of models and evaluation protocols used.
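As a concrete illustration of the OOD setting, one widely used baseline scores each input by the maximum softmax probability of a pre-trained classifier and flags low-confidence inputs as out-of-distribution. The sketch below assumes logits from some existing K-class model; the threshold value is an arbitrary placeholder.

```python
import numpy as np

def max_softmax_score(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per sample; low values suggest OOD inputs."""
    z = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

# Hypothetical logits from a pre-trained 3-class classifier.
logits = np.array([[8.0, 0.5, 0.2],    # confident -> likely in-distribution
                   [1.1, 1.0, 0.9]])   # flat -> possibly OOD
is_ood = max_softmax_score(logits) < 0.7  # placeholder threshold
```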
Novelty detection methods can be grouped into several broad categories based on how they model the normal class. The taxonomy below follows the structure described by Pimentel et al. (2014) in their review of novelty detection.
| Category | Core idea | Representative methods |
|---|---|---|
| Probabilistic | Estimate the probability density of normal data; flag low-probability samples as novel | Gaussian mixture models, kernel density estimation, Bayesian approaches |
| Distance-based | Measure distances or similarities to training data; large distances indicate novelty | k-nearest neighbors, Local Outlier Factor |
| Domain-based | Learn a boundary or region in feature space that encloses the normal data | One-class SVM, Support Vector Data Description (SVDD) |
| Reconstruction-based | Learn to reconstruct normal data; high reconstruction error on a test sample indicates novelty | Autoencoders, variational autoencoders, sparse coding |
| Information-theoretic | Use information-theoretic measures such as entropy or Kolmogorov complexity to detect distributional changes | Entropy-based detectors, minimum description length |
Each category has its own strengths and weaknesses. Probabilistic methods provide principled uncertainty estimates but can struggle in high-dimensional spaces. Distance-based methods are intuitive and nonparametric but become computationally expensive with large datasets. Domain-based methods are efficient at test time but require careful kernel selection. Reconstruction-based methods scale well with deep learning but can sometimes reconstruct anomalies too accurately if the model is overly flexible.
Probabilistic approaches to novelty detection estimate the probability density function (PDF) of the training data and then classify a new observation as novel if its estimated density falls below a threshold.
A Gaussian mixture model (GMM) represents the training distribution as a weighted sum of multiple Gaussian components. Each component is defined by a mean vector and a covariance matrix. The parameters are typically estimated using the expectation-maximization (EM) algorithm. At test time, the likelihood of a new sample under the GMM is computed. If the likelihood is below a predefined threshold, the sample is flagged as novel.
GMMs are flexible enough to model multimodal distributions, making them useful when the normal data form several distinct clusters. However, the user must choose the number of components, and the method can become unreliable in high-dimensional spaces due to the difficulty of estimating covariance matrices accurately.
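A minimal sketch of GMM-based novelty detection with scikit-learn might look like the following; the number of components, the synthetic data, and the 1% threshold quantile are all illustrative choices.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
X_train = np.vstack([rng.normal(0, 1, (150, 2)),
                     rng.normal(5, 1, (150, 2))])  # two "normal" clusters

gmm = GaussianMixture(n_components=2, covariance_type="full").fit(X_train)

# Log-likelihood of each training point; a low quantile serves as the threshold.
train_ll = gmm.score_samples(X_train)
threshold = np.quantile(train_ll, 0.01)

X_new = np.array([[0.2, 0.1], [10.0, -8.0]])
is_novel = gmm.score_samples(X_new) < threshold  # True for the distant point
```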
Kernel density estimation (KDE) is a nonparametric method that places a kernel function (often a Gaussian) at each training point and sums them to produce a smooth estimate of the density. For a test point x, the estimated density is:
f(x) = (1 / (n h)) * sum_{i=1}^{n} K((x - x_i) / h)
where K is the kernel function, h is the bandwidth (smoothing parameter), n is the number of training points, and x_i are the training samples. Points with density estimates below a threshold are classified as novel.
KDE makes no assumptions about the shape of the underlying distribution, which is an advantage over parametric methods like GMMs. The main drawback is that KDE does not scale well to high-dimensional data because the number of samples needed to produce reliable density estimates grows exponentially with the number of dimensions.
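For low-dimensional data, KDE-based novelty detection takes only a few lines with scikit-learn's KernelDensity; the bandwidth and threshold quantile below are illustrative and would normally be tuned (e.g., by cross-validation).

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X_train = rng.normal(0, 1, (300, 2))

# Gaussian kernel; the bandwidth h is the key smoothing choice.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X_train)

log_dens = kde.score_samples(X_train)    # log-density at each training point
threshold = np.quantile(log_dens, 0.05)  # flag the lowest-density 5%

X_new = np.array([[0.0, 0.3], [6.0, 6.0]])
is_novel = kde.score_samples(X_new) < threshold
```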
The Elliptic Envelope method assumes that the normal data follow a single multivariate Gaussian distribution. It fits a robust covariance estimate to the data and uses Mahalanobis distance to measure how far a new observation lies from the center of the distribution. Points with large Mahalanobis distances are classified as outliers or novelties. This method is implemented in scikit-learn as EllipticEnvelope and works well when the Gaussian assumption holds, but it is unreliable for data with non-Gaussian structure.
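A short sketch of the scikit-learn estimator, with synthetic Gaussian data and an illustrative contamination setting:

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.RandomState(0)
X_train = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 1.0]], size=300)

# contamination is the assumed fraction of points outside the envelope.
env = EllipticEnvelope(contamination=0.01).fit(X_train)

X_new = np.array([[0.5, -0.2], [6.0, 6.0]])
print(env.predict(X_new))      # 1 = inlier, -1 = outlier/novelty
print(env.mahalanobis(X_new))  # squared Mahalanobis distances to the fitted center
```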
Distance-based methods define novelty in terms of how far a test point is from its nearest neighbors in the training set. The assumption is that normal data points tend to be close to other normal points, while novel points tend to be far from any training example.
The simplest distance-based approach computes the distance from a test point to its k-th nearest neighbor in the training set. If that distance exceeds a threshold, the point is flagged as novel. Variants include using the average distance to the k nearest neighbors rather than the distance to the k-th neighbor alone.
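A minimal sketch of this k-th-neighbor rule with scikit-learn's NearestNeighbors; the value of k, the data, and the 99% calibration quantile are illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.RandomState(0)
X_train = rng.normal(0, 1, (300, 2))

k = 5
nn = NearestNeighbors(n_neighbors=k).fit(X_train)

def knn_novelty_score(X):
    """Distance to the k-th nearest training point; larger = more novel."""
    dists, _ = nn.kneighbors(X, n_neighbors=k)
    return dists[:, -1]

# Calibrate on training data, asking for k+1 neighbors to skip the self-match.
train_dists, _ = nn.kneighbors(X_train, n_neighbors=k + 1)
threshold = np.quantile(train_dists[:, -1], 0.99)

X_new = np.array([[0.1, 0.0], [5.0, 5.0]])
is_novel = knn_novelty_score(X_new) > threshold
```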
The Local Outlier Factor (LOF), proposed by Breunig et al. (2000), refines the basic k-nearest-neighbor approach by comparing the local density of a point to the local densities of its neighbors. The LOF score for a point p is the ratio of the average local density of p's k nearest neighbors to the local density of p itself. A LOF score close to 1 indicates that the point has a density similar to its neighbors (normal), while a score significantly greater than 1 indicates that the point lies in a region of lower density than its neighbors (potentially novel).
LOF is effective when the data contain clusters of varying densities, because it evaluates each point relative to its local neighborhood rather than against a global threshold. In scikit-learn, the LocalOutlierFactor class supports both outlier detection and novelty detection. When the novelty parameter is set to True, the model can be used to score new, unseen data points. When novelty is False (the default), the model can only be applied to the training data itself for outlier detection.
The one-class support vector machine (One-Class SVM), introduced by Schölkopf et al. (2001), learns a decision boundary that separates the training data from the origin in a high-dimensional feature space induced by a kernel function. The algorithm finds a hyperplane with maximum margin between the data points and the origin. The fraction of training points allowed to fall on the wrong side of the hyperplane is controlled by a parameter called nu, which can be interpreted as an upper bound on the fraction of outliers and a lower bound on the fraction of support vectors.
At test time, a new observation is projected into the kernel feature space, and the sign of its distance from the hyperplane determines whether it is classified as normal (positive) or novel (negative). The most common kernel choice is the radial basis function (RBF) kernel, which allows the decision boundary to take complex, nonlinear shapes in the original input space.
One-Class SVM has strong theoretical foundations rooted in statistical learning theory. However, training complexity scales roughly quadratically with the number of samples, which can make it impractical for very large datasets. For such cases, scikit-learn provides SGDOneClassSVM, a linear approximation trained with stochastic gradient descent that scales linearly with the number of samples.
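A minimal scikit-learn sketch; the nu and gamma values below are illustrative and would normally be tuned for the task.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = rng.normal(0, 1, (300, 2))

# nu bounds the fraction of training points treated as outliers;
# gamma controls how flexible the RBF boundary can be.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)

X_new = np.array([[0.2, -0.1], [4.0, 4.0]])
print(ocsvm.predict(X_new))            # 1 = normal, -1 = novel
print(ocsvm.decision_function(X_new))  # signed distance to the learned boundary
```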
Support Vector Data Description (SVDD), proposed by Tax and Duin (2004), takes a complementary approach to One-Class SVM. Instead of separating the data from the origin with a hyperplane, SVDD finds the smallest hypersphere that encloses most of the training data. Points that fall outside the hypersphere at test time are classified as novel. Like One-Class SVM, SVDD can be kernelized to produce flexible, nonlinear boundaries. The two methods are mathematically equivalent when using an RBF kernel, since the RBF kernel maps all data points onto the surface of a unit hypersphere in the feature space.
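For reference, SVDD's primal problem can be written as follows (the standard formulation from Tax and Duin), where a is the sphere center, R its radius, phi the kernel feature map, xi_i slack variables, and C trades off sphere volume against training errors:

```latex
\min_{R,\,a,\,\xi}\; R^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
\lVert \phi(x_i) - a \rVert^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0 .
```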
Reconstruction-based approaches train a model to compress and then reconstruct normal data. The idea is that a model trained only on normal data will learn representations tailored to that data and will produce high reconstruction error when presented with novel inputs.
An autoencoder is a neural network trained to map its input to a lower-dimensional latent representation (the bottleneck) and then reconstruct the original input from that representation. The training objective is to minimize the reconstruction error, typically measured as mean squared error or binary cross-entropy, over the normal training data.
For novelty detection, the reconstruction error serves as the novelty score. Because the autoencoder has been trained exclusively on normal data, it learns to reconstruct normal patterns accurately. When a novel input is presented, the autoencoder produces a poor reconstruction, resulting in a high reconstruction error that exceeds a predefined threshold.
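A minimal PyTorch sketch of this recipe appears below; the architecture, dimensions, epoch count, and 99% threshold quantile are illustrative placeholders rather than recommended settings.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=30, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 16), nn.ReLU(),
                                     nn.Linear(16, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 16), nn.ReLU(),
                                     nn.Linear(16, in_dim))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
X_train = torch.randn(512, 30)  # stand-in for real normal training data

for _ in range(100):  # train on normal data only
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X_train), X_train)
    loss.backward()
    opt.step()

# Per-sample reconstruction error as the novelty score; calibrate on train data.
with torch.no_grad():
    err = ((model(X_train) - X_train) ** 2).mean(dim=1)
    threshold = torch.quantile(err, 0.99)

def is_novel(x):
    with torch.no_grad():
        return ((model(x) - x) ** 2).mean(dim=1) > threshold
```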
Several architectural variants have been explored for novelty detection:
A variational autoencoder (VAE) extends the standard autoencoder by imposing a probabilistic structure on the latent space. Instead of encoding each input as a single point, the encoder outputs the parameters (mean and variance) of a Gaussian distribution. During training, samples are drawn from this distribution and decoded back to the input space. The training objective combines reconstruction error with a Kullback-Leibler divergence term that regularizes the latent distribution toward a standard normal prior.
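In symbols, training maximizes the evidence lower bound (ELBO) for each input x, where q_phi is the encoder, p_theta the decoder, and p(z) the standard normal prior:

```latex
\mathcal{L}(x) \;=\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \mathrm{KL}\!\left(q_\phi(z \mid x) \,\Vert\, p(z)\right).
```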
For novelty detection, VAEs can use either the reconstruction error or the evidence lower bound (ELBO) as a novelty score. Because the latent space is regularized, VAEs tend to produce smoother latent representations than standard autoencoders, which can improve separation between normal and novel samples. Research has shown that VAEs can outperform ordinary autoencoders for novelty detection in certain settings because the probabilistic encoding provides a richer signal for identifying distributional shifts.
A generative adversarial network (GAN) consists of a generator that produces synthetic data and a discriminator that attempts to distinguish real data from generated data. For novelty detection, a GAN is trained on normal data so that the generator learns to produce realistic normal samples. At test time, novelty can be assessed in several ways: by computing the reconstruction error of a test sample through the generator (as in BiGAN or AnoGAN architectures), by examining the discriminator's confidence score, or by measuring the distance between a test sample and the closest generated sample in the latent space.
GAN-based novelty detection has shown strong results in image domains, but training GANs can be unstable and sensitive to hyperparameter choices.
The Isolation Forest algorithm, proposed by Liu, Ting, and Zhou (2008), takes a fundamentally different approach to novelty and anomaly detection. Rather than modeling what normal data looks like, it explicitly isolates anomalies by exploiting their key properties: anomalies are few in number and differ significantly from normal points.
The algorithm works as follows: it draws a random subsample of the training data and builds an ensemble of binary isolation trees, each constructed by recursively choosing a random feature and a random split value between that feature's minimum and maximum, until every point is isolated in its own leaf or a depth limit is reached. Because anomalies are few and different, they tend to be separated from the rest of the data after only a few splits, so their average path length from root to leaf across the ensemble is short; normal points, which lie in dense regions, require many more splits to isolate.
Path lengths are normalized by c(n) = 2H(n-1) - 2(n-1)/n, the average path length of an unsuccessful search in a binary search tree built from n samples, where H(k) is the harmonic number. The anomaly score s for a data point x is then defined as s(x, n) = 2^(-E(h(x)) / c(n)), where E(h(x)) is the average path length of x over all trees. A score close to 1 indicates an anomaly, a score well below 0.5 indicates a clearly normal point, and scores around 0.5 are inconclusive.
| Property | Value |
|---|---|
| Training time complexity | O(t * psi * log(psi)), where t is the number of trees and psi is the subsample size |
| Prediction time complexity | O(t * log(psi)) per sample |
| Memory requirement | Low; each tree stores only split features and split values |
| Hyperparameters | Number of trees (t), subsample size (psi), contamination ratio |
| High-dimensional performance | Good; random feature selection mitigates, though does not eliminate, the curse of dimensionality |
Isolation Forest is implemented in scikit-learn as IsolationForest. While originally designed for outlier detection (where the training set may contain anomalies), it can also be used for novelty detection by training on clean data and then applying the model to new observations.
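A minimal novelty-style usage with scikit-learn; max_samples=256 mirrors the subsample size recommended in the original paper, and the remaining values are illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_train = rng.normal(0, 1, (500, 2))  # assumed clean for the novelty setting

forest = IsolationForest(n_estimators=100, max_samples=256,
                         random_state=0).fit(X_train)

X_new = np.array([[0.1, -0.3], [5.0, 5.0]])
print(forest.predict(X_new))        # 1 = normal, -1 = anomalous/novel
print(forest.score_samples(X_new))  # lower (more negative) = more anomalous
```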
Beyond autoencoders, VAEs, and GANs, several other deep learning architectures have been applied to novelty detection.
Deep SVDD (Ruff et al., 2018) combines the idea of SVDD with deep neural networks. A neural network is trained to map normal data to a compact region around a fixed center point in the output space. The training objective minimizes the mean distance between the network's outputs and the center. At test time, points that map far from the center are classified as novel. This approach allows the boundary to be learned in a data-driven feature space rather than in a fixed kernel feature space.
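The core training loop is compact. The PyTorch sketch below is a simplified illustration, not the reference implementation: Ruff et al. additionally remove bias terms (among other safeguards) to prevent the network from collapsing to a trivial constant mapping.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(30, 16), nn.ReLU(), nn.Linear(16, 8))
X_train = torch.randn(512, 30)  # stand-in for real normal training data

with torch.no_grad():
    c = net(X_train).mean(dim=0)  # fix the center as the mean initial output

opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-5)
for _ in range(100):
    opt.zero_grad()
    loss = ((net(X_train) - c) ** 2).sum(dim=1).mean()  # pull outputs toward c
    loss.backward()
    opt.step()

def novelty_score(x):
    with torch.no_grad():
        return ((net(x) - c) ** 2).sum(dim=1)  # distance from the center
```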
Self-supervised novelty detection creates pretext tasks (such as predicting image rotations, solving jigsaw puzzles, or contrastive learning) from unlabeled normal data. A model trained on these tasks learns representations that capture the structure of normal data. At test time, poor performance on the pretext task indicates that the input does not match the patterns seen during training. Self-supervised approaches have become increasingly popular because they leverage the power of modern representation learning without requiring labeled anomalies.
Recent work has applied transformer architectures to novelty detection, particularly for sequential and time-series data. Transformers can model long-range dependencies in normal sequences and detect novel patterns that violate those dependencies. Attention mechanisms in transformers also provide interpretability by highlighting which parts of the input contributed most to a novelty decision.
Evaluating novelty detection systems requires metrics that account for the typically imbalanced nature of the problem (novel samples are rare). The most commonly used metrics include the following.
| Metric | Description | When to use |
|---|---|---|
| AUROC | Area under the ROC curve; measures discrimination ability across all thresholds | General-purpose evaluation; as a ranking metric, it is insensitive to class imbalance |
| AUPRC | Area under the precision-recall curve; focuses on performance for the positive (novel) class | When the positive class is very rare and false positives are costly |
| F1 score | Harmonic mean of precision and recall; requires a fixed threshold | When a single operating point must be chosen |
| Precision at k | Fraction of the top-k scored samples that are truly novel | When only the top-ranked samples will be inspected |
| False positive rate at fixed true positive rate | FPR when TPR is held at a specific level (e.g., 95%) | When a minimum detection rate is required |
AUROC is the most widely reported metric in novelty detection research because it summarizes performance across all possible thresholds. However, AUPRC can be more informative when the novel class is extremely rare, because AUROC can appear high even when the model produces many false positives in absolute terms.
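Both ranking metrics are one-liners with scikit-learn; the labels and scores below are made-up values for illustration (1 = novel, higher score = more novel).

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.15, 0.3, 0.25, 0.1, 0.4, 0.2, 0.9, 0.35])

print("AUROC:", roc_auc_score(y_true, scores))
print("AUPRC:", average_precision_score(y_true, scores))  # average precision, a common AUPRC estimate
```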
Novelty detection has been applied across a wide range of domains. The following table summarizes some of the most common application areas.
| Domain | Application | Typical data | Common methods |
|---|---|---|---|
| Cybersecurity | Detecting novel intrusion attempts, zero-day attacks, and new malware variants | Network traffic logs, system call traces | One-Class SVM, Isolation Forest, autoencoders |
| Manufacturing | Identifying defects or failures not seen during quality control training | Sensor readings, images of products | Convolutional autoencoders, One-Class SVM |
| Medical monitoring | Detecting abnormal patient vitals or rare disease patterns | Electrocardiograms, medical images, patient records | GMMs, LOF, deep autoencoders |
| Fraud detection | Flagging new types of fraudulent transactions | Transaction records, user behavior logs | Isolation Forest, autoencoders, LOF |
| Autonomous vehicles | Recognizing unknown objects or scenarios on the road | Camera images, lidar point clouds | Deep one-class classification, self-supervised methods |
| Natural language processing | Detecting novel topics, events, or out-of-domain text | Text documents, social media posts | Transformer-based methods, probabilistic models |
| Robotics | Identifying unfamiliar objects or environments | Sensor data, camera feeds | Reconstruction-based methods, distance-based methods |
| Scientific discovery | Flagging unusual astronomical events or particle physics signals | Telescope observations, detector readouts | KDE, Isolation Forest, deep learning |
Several popular machine learning libraries provide implementations of novelty detection algorithms.
| Library | Algorithms available | Language |
|---|---|---|
| scikit-learn | One-Class SVM, SGD One-Class SVM, Isolation Forest, LOF, Elliptic Envelope | Python |
| PyTorch | Custom autoencoders, VAEs, GANs, Deep SVDD (via libraries like PyOD) | Python |
| TensorFlow / Keras | Custom autoencoders, VAEs, GANs | Python |
| PyOD | Over 40 algorithms including ABOD, COPOD, ECOD, SOD, and deep learning methods | Python |
| MATLAB | Isolation Forest, One-Class SVM, LOF, robust covariance | MATLAB |
PyOD (Python Outlier Detection) is a particularly comprehensive library that provides a unified API for dozens of novelty and outlier detection algorithms, including both classical statistical methods and deep learning approaches.
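As a taste of that unified API, here is a sketch using PyOD's kNN detector; the data are synthetic and the model choice is arbitrary. Note that PyOD labels outliers with 1 and inliers with 0, unlike scikit-learn's 1/-1 convention.

```python
import numpy as np
from pyod.models.knn import KNN  # requires: pip install pyod

rng = np.random.RandomState(0)
X_train = rng.normal(0, 1, (300, 2))
X_new = np.array([[0.1, 0.2], [5.0, 5.0]])

clf = KNN(n_neighbors=5)
clf.fit(X_train)

print(clf.decision_function(X_new))  # higher = more anomalous
print(clf.predict(X_new))            # 0 = inlier, 1 = outlier
```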
Despite significant progress, novelty detection remains a challenging problem for several reasons.
Defining normality. In many real-world settings, the boundary between normal and novel is not sharp. What counts as normal can change over time (concept drift), and different stakeholders may disagree on where to draw the line.
High-dimensional data. Many novelty detection methods rely on distance or density computations that become unreliable in high-dimensional spaces due to the curse of dimensionality. Deep learning methods partially address this by learning lower-dimensional representations, but they introduce their own challenges around architecture selection and training stability.
Threshold selection. Most novelty detection algorithms produce a continuous score rather than a binary decision. Choosing the threshold that separates normal from novel is a practical challenge, especially when labeled novel samples are unavailable for calibration.
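A common heuristic, when the training data are trusted to be clean, is to set the threshold at a high quantile of the training scores so that a chosen fraction of normal data (here roughly 1%) would be falsely flagged. A minimal sketch with made-up scores:

```python
import numpy as np

# Hypothetical novelty scores on (assumed clean) training data; higher = more novel.
scores_train = np.random.RandomState(0).gamma(2.0, 1.0, size=1000)

threshold = np.quantile(scores_train, 0.99)  # ~1% training false-positive rate

def flag_novel(scores_new):
    return scores_new > threshold
```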
Contaminated training data. Novelty detection assumes clean training data, but in practice some anomalies may be present in the training set. Methods that are robust to small amounts of contamination are an active area of research.
Evaluation without labeled data. In deployment settings, ground-truth labels for novel events may not be available, making it difficult to evaluate and monitor model performance over time.
Interpretability. When a system flags an observation as novel, practitioners often need to understand why. Providing explanations for novelty scores is an active research topic, with approaches ranging from feature importance analysis to attention visualization in deep models.
The roots of novelty detection can be traced to classical statistical methods for outlier rejection, which date back to the 19th century. Early work focused on identifying extreme values in univariate distributions using techniques such as Grubbs' test and Dixon's Q test.
The modern formulation of novelty detection as a machine learning problem began to take shape in the late 1990s and early 2000s. Schölkopf et al. introduced the support vector method for novelty detection at NIPS (now NeurIPS) in 1999, and the full journal version describing the One-Class SVM appeared in Neural Computation in 2001. Around the same time, Tax and Duin developed Support Vector Data Description (SVDD) as a complementary approach. Breunig et al. introduced the Local Outlier Factor algorithm at ACM SIGMOD in 2000, providing a density-based alternative to distance-based methods.
Markou and Singh published a comprehensive two-part review of novelty detection in Signal Processing in 2003, covering both statistical approaches (Part 1) and neural network-based approaches (Part 2). These reviews helped establish novelty detection as a recognized subfield of machine learning.
Liu, Ting, and Zhou introduced the Isolation Forest algorithm at IEEE ICDM in 2008, offering a tree-based approach with linear time complexity and low memory requirements. The extended journal version appeared in ACM Transactions on Knowledge Discovery from Data in 2012.
Pimentel et al. published an updated review in Signal Processing in 2014, organizing the field into five categories (probabilistic, distance-based, domain-based, reconstruction-based, and information-theoretic) and surveying applications across multiple domains.
The rise of deep learning from 2015 onward brought new reconstruction-based and representation-learning approaches. Deep SVDD (Ruff et al., 2018), self-supervised novelty detection methods, and transformer-based approaches have since expanded the toolkit available to practitioners.