Search results

Earth mover's distance (EMD)
The '''Earth Mover's Distance''' (EMD), also known as the '''Wasserstein distance''' or '''Mallows distance''', is a measure of dissi ...duct of the mass being moved and the distance it is moved. Mathematically, the EMD between two probability distributions P and Q, defined over a common me

3 KB (518 words) - 01:15, 20 March 2023
Summary
...g a large dataset or model into a simplified representation, which retains the most essential information. This can be done through various methods, such ...while discarding less informative features. Reducing the dimensionality of the input data can help prevent overfitting, speed up training, and simplify vi

4 KB (504 words) - 22:27, 21 March 2023
Wasserstein loss
...(EMD), is a metric used in the field of machine learning, particularly in the training of Generative Adversarial Networks (GANs). Introduced by [[Martin ...training procedure involves minimizing a specific loss function, known as the Jensen-Shannon (JS) divergence. However, this loss function can lead to sev

4 KB (557 words) - 22:25, 21 March 2023
PR AUC (area under the PR curve)
...PR AUC is particularly useful when dealing with imbalanced datasets, where the proportion of positive and negative samples is unequal. ...ecision measures the accuracy of positive predictions, and recall measures the ability to identify all relevant instances.

3 KB (446 words) - 01:07, 21 March 2023
Mini-batch
...the [[loss]] on the mini-batch of examples is also far more efficient than the entire dataset. ...ent]] to train the model on all data in one iteration. Unfortunately, when the dataset grows large, this approach becomes computationally expensive and ma

5 KB (773 words) - 20:54, 17 March 2023
Artificial intelligence terms
...new text generated (output) by the [[LLM]] after you [[inference]] it with the [[prompt]] '''[[The Pile]]'''

2 KB (300 words) - 21:33, 11 January 2024
To-Do List
...]]''' - steer AGI towards the goals of humans. Prevent AGI from destroying the world - https://www.reddit.com/r/ControlProblem/wiki/faq ...lip/ - encode images and text into representations that can be compared in the same space. basis for many [[Text-to-Image Models]] like [[Stable Diffusion

4 KB (550 words) - 09:53, 14 May 2023
Linear
...he relationship between the input features ([[independent variables]]) and the output ([[dependent variable]]) can be represented by a straight line, or m ...hyperplane) that describes the relationship between the input features and the output variable.

3 KB (530 words) - 13:18, 18 March 2023
Agglomerative clustering
...this agglomerative cluster starts as a singleton cluster and at each step the two closest clusters are merged until an ending criterion is met. ...a point is assigned its own cluster; then the algorithm iteratively merges the two closest clusters until an ending criterion is met.

7 KB (1,108 words) - 20:48, 17 March 2023
Federated learning
...aw data itself, with a central server. In this article, we will delve into the concepts, techniques, and applications of federated learning. ...with a central server. This approach ensures that the raw data remains on the clients' devices, thus preserving data privacy and security.

3 KB (491 words) - 01:17, 20 March 2023
Sparse feature
...raph data. Utilizing sparse features effectively can significantly improve the efficiency and performance of machine learning algorithms by reducing memor ...in the COO list contains a tuple with the index of the non-zero value and the value itself. This format is memory-efficient and allows for easy manipulat

4 KB (550 words) - 13:28, 18 March 2023
Mini-batch stochastic gradient descent
...ic gradient descent]] (SGD), which itself is a stochastic approximation of the [[gradient descent]] optimization algorithm. The mini-batch stochastic gradient descent algorithm can be summarized in the following steps:

4 KB (537 words) - 11:43, 20 March 2023
Semi-supervised learning
...ten requires domain expertise. Semi-supervised learning takes advantage of the available unlabeled data to improve model performance without requiring a s ...a points. The model is then retrained on the combined labeled dataset, and the process is repeated iteratively until a desired level of performance is ach

4 KB (603 words) - 22:26, 21 March 2023
