==Introduction==
[[Vector embeddings]] are lists of numbers that represent complex data such as [[text]], [[images]], or [[audio]] in a numerical form that [[machine learning algorithms]] can process. These embeddings translate [[semantic similarity]] between objects into proximity within a [[vector space]], which makes them suitable for tasks such as [[clustering]], [[recommendation]], and [[classification]]. [[Clustering algorithms]] group similar points together, [[recommendation systems]] find objects similar to a given one, and [[classification tasks]] determine the label of an object based on the labels of its most similar counterparts.
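The idea that similarity becomes proximity can be illustrated with a minimal sketch. The three-dimensional vectors and labels below are invented for illustration (real embeddings come from a trained model and have far more dimensions), and cosine similarity is only one common choice of proximity measure.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings: name -> (vector, label).
EMBEDDINGS = {
    "cat": ([0.9, 0.8, 0.1], "animal"),
    "dog": ([0.8, 0.9, 0.2], "animal"),
    "car": ([0.1, 0.2, 0.9], "vehicle"),
    "truck": ([0.2, 0.1, 0.8], "vehicle"),
}

def most_similar(query, k=1):
    """Recommendation-style lookup: the k stored items closest to the query."""
    ranked = sorted(
        EMBEDDINGS.items(),
        key=lambda item: cosine_similarity(query, item[1][0]),
        reverse=True,
    )
    return [(name, label) for name, (_, label) in ranked[:k]]

def classify(query, k=3):
    """Nearest-neighbour classification: majority label among the k closest items."""
    labels = [label for _, label in most_similar(query, k)]
    return Counter(labels).most_common(1)[0][0]
```

Calling `most_similar(query)` corresponds to the recommendation case, and `classify(query)` to labelling an object by its most similar counterparts.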
==Creating Vector Embeddings==
===Feature Engineering===
One method for creating vector embeddings is to engineer the vector values using [[domain knowledge]], a process known as [[feature engineering]]. In medical imaging, for instance, domain expertise is used to quantify features such as shape, color, and regions within an image so that the resulting vector captures the image's semantics. However, feature engineering depends on domain experts and is often too costly to scale.
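As a rough sketch of what hand-engineered features might look like, the function below turns a small grayscale image (a list of pixel rows) into a four-number vector. The chosen features, the function name `engineered_features`, and the brightness threshold are all illustrative assumptions, not a recipe taken from any particular medical-imaging pipeline.

```python
import statistics

def engineered_features(image, threshold=128):
    """Map a grayscale image (list of rows, values 0-255) to a feature vector.

    Features (all hypothetical choices for illustration):
      0: mean intensity          (overall brightness)
      1: intensity std deviation (contrast)
      2: fraction of bright pixels (size of the bright region)
      3: width/height of the bright region's bounding box (shape)
    """
    pixels = [p for row in image for p in row]
    mean_intensity = statistics.fmean(pixels)
    contrast = statistics.pstdev(pixels)

    # Coordinates of pixels at or above the brightness threshold.
    bright = [
        (r, c)
        for r, row in enumerate(image)
        for c, p in enumerate(row)
        if p >= threshold
    ]
    area_fraction = len(bright) / len(pixels)

    if bright:
        rows = [r for r, _ in bright]
        cols = [c for _, c in bright]
        height = max(rows) - min(rows) + 1
        width = max(cols) - min(cols) + 1
        aspect_ratio = width / height
    else:
        aspect_ratio = 0.0  # no bright region found

    return [mean_intensity, contrast, area_fraction, aspect_ratio]

# A 4x4 image with a bright 2x2 square in the middle.
image = [
    [0, 0, 0, 0],
    [0, 200, 200, 0],
    [0, 200, 200, 0],
    [0, 0, 0, 0],
]
features = engineered_features(image)
```

Every feature here had to be chosen and coded by hand, which is the scaling cost the paragraph above refers to.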
===Deep Neural Networks===
Rather than engineering vector embeddings by hand, [[models]] are frequently trained to translate objects into vectors. [[Deep neural network]]s are commonly used for this purpose. The resulting embeddings are typically [[high-dimensional]] (up to two thousand dimensions) and [[dense]] (most or all values are non-zero). Text data can be transformed into vector embeddings using models such as [[Word2Vec]], [[GLoVE|GloVe]], and [[BERT]]. Images can be embedded using [[convolutional neural network]]s ([[CNN]]s) like [[VGG]] and [[Inception]], while audio recordings can be converted into vectors by applying [[image embedding transformation]]s to their visual representations, such as [[spectrogram]]s.
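The difference between a sparse representation and a dense learned embedding can be sketched as follows. The four-dimensional word vectors are invented stand-ins for what a trained model such as Word2Vec or GloVe would provide, and averaging word vectors is just one simple way to embed a piece of text (it is not how BERT produces embeddings).

```python
# Hypothetical word vectors; a real model would supply hundreds of dimensions.
WORD_VECTORS = {
    "vector": [0.2, -0.1, 0.7, 0.4],
    "embeddings": [0.3, 0.5, -0.2, 0.1],
    "are": [0.0, 0.1, 0.1, -0.3],
    "useful": [-0.4, 0.6, 0.2, 0.2],
}
VOCAB = sorted(WORD_VECTORS)

def one_hot(word):
    """Sparse baseline: one dimension per vocabulary word, a single non-zero entry."""
    return [1.0 if w == word else 0.0 for w in VOCAB]

def embed_text(text):
    """Dense text embedding: the mean of the vectors of all known words."""
    vectors = [WORD_VECTORS[t] for t in text.lower().split() if t in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in text")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

sentence_vector = embed_text("Vector embeddings are useful")
```

The one-hot vector grows with the vocabulary and is almost entirely zeros, whereas the embedding keeps a fixed, small number of dimensions with mostly non-zero values.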
==Example: Image Embedding with a Convolutional Neural Network==