==Understanding Vector Embeddings==
In text data, words with similar meanings, such as "cat" and "kitty", should be represented in a way that captures their semantic similarity. Vector representations achieve this by transforming data objects into fixed-length arrays of real numbers, typically hundreds to thousands of elements long. These arrays are generated by machine learning models through a process called vectorization.
For instance, the words "cat" and "kitty" may be vectorized as follows:
<code>
cat = [1.5, -0.4, 7.2, 19.6, 3.1, ..., 20.2]
kitty = [1.5, -0.4, 7.2, 19.5, 3.2, ..., 20.8]
</code>
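
One way to quantify how close two such vectors are is cosine similarity. The text does not name a specific measure, so this choice is an assumption; it is one common option among several. The minimal sketch below uses only the six components visible in the example (the elided middle components are omitted, so the numbers are purely illustrative), and the <code>unrelated</code> vector is a hypothetical value invented for contrast.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Only the components shown in the example; the elided ones are left out.
cat = [1.5, -0.4, 7.2, 19.6, 3.1, 20.2]
kitty = [1.5, -0.4, 7.2, 19.5, 3.2, 20.8]

# Hypothetical vector for an unrelated word, made up for comparison.
unrelated = [-12.0, 8.3, 0.5, -3.3, 15.9, -7.1]

similarity_close = cosine_similarity(cat, kitty)
similarity_far = cosine_similarity(cat, unrelated)

print(f"cat vs kitty:     {similarity_close:.4f}")
print(f"cat vs unrelated: {similarity_far:.4f}")
```

A score near 1 indicates that the vectors point in nearly the same direction, which is what similar words should produce, while a low or negative score indicates dissimilar vectors.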