Vector embeddings: Difference between revisions

This result suggests that the difference between "king" and "man" represents some sort of "royalty", which is analogously applicable to "queen" minus "woman". Various concepts, such as "woman", "girl", "boy", etc., can be vectorized into arrays of numbers, where each position in the array is referred to as a dimension. These arrays can be visualized and correlated to familiar words, giving insight into their meaning.


===Example of 5 Words===
Objects (data): words such as kitty, puppy, orange, blueberry, structure, motorbike


Search term: fruit
A basic set of vector embeddings (limited to 5 dimensions) for the objects and the search term might appear as follows:
{| class="wikitable"
! Word !! Vector embedding
|-
| kitty || [1.5, -0.4, 7.2, 19.6, 20.2]
|-
| puppy || [1.7, -0.3, 6.9, 19.1, 21.1]
|-
| orange || [-5.2, 3.1, 0.2, 8.1, 3.5]
|-
| blueberry || [-4.9, 3.6, 0.9, 7.8, 3.6]
|-
| structure || [60.1, -60.3, 10, -12.3, 9.2]
|-
| motorbike || [81.6, -72.1, 16, -20.2, 102]
|-
| Q: fruit || [-5.1, 2.9, 0.8, 7.9, 3.1]
|}
Upon examining each of the 5 components of the vectors, it's evident that kitty and puppy are much closer than puppy and orange (we don't even have to determine the distances). Similarly, fruit is significantly closer to orange and blueberry compared to the other words, making them the top results for the "fruit" search.
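The comparison can also be made explicit by computing distances. The sketch below is only an illustration: it assumes Euclidean distance as the measure of closeness (the text does not name a metric, and cosine similarity is another common choice), and the helper names euclidean_distance and rank_by_distance are hypothetical. The vectors are the toy values listed above.

```python
import math

# Toy 5-dimensional embeddings copied from the example above.
embeddings = {
    "kitty": [1.5, -0.4, 7.2, 19.6, 20.2],
    "puppy": [1.7, -0.3, 6.9, 19.1, 21.1],
    "orange": [-5.2, 3.1, 0.2, 8.1, 3.5],
    "blueberry": [-4.9, 3.6, 0.9, 7.8, 3.6],
    "structure": [60.1, -60.3, 10, -12.3, 9.2],
    "motorbike": [81.6, -72.1, 16, -20.2, 102],
}
query = [-5.1, 2.9, 0.8, 7.9, 3.1]  # embedding of the search term "fruit"


def euclidean_distance(a, b):
    """Straight-line distance between two vectors of equal length.

    Assumption: Euclidean distance is used here purely for illustration.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query_vec, items):
    """Return the words in items, ordered from nearest to farthest from query_vec."""
    return sorted(items, key=lambda word: euclidean_distance(query_vec, items[word]))


ranking = rank_by_distance(query, embeddings)
for word in ranking:
    print(f"{word:10s} {euclidean_distance(query, embeddings[word]):8.2f}")
```

Under this assumption, orange and blueberry are each less than 1 unit away from fruit, while every other word is more than 20 units away, which matches the observation made by inspecting the components.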
The more interesting question is where these numbers come from, and this is where recent advances in deep learning have made a significant impact.


===More Than Just Words===