Leveraging Word2vec for More than Text

Recently, I read a few articles that illustrate a trend: companies are using word2vec-like models to learn embeddings, which are compact vector representations of objects.

This post, the first in a series of three, will highlight how Best Buy, Capital One, Facebook, and Grubhub are using machine learning to learn embeddings in order to better understand their customers and to provide those customers personalized recommendations.

By understanding customers better and delivering relevant content through personalized recommendations, companies can improve the customer experience and, ultimately, build customer loyalty.

Learning Embeddings

Word2vec is an algorithm that uses the sequential nature of text to learn word embeddings. The embeddings are learned by leveraging the Distributional Hypothesis, which states that words that occur in the same contexts (the words around a particular word) tend to have similar meanings.
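To make "context" concrete, here is a minimal sketch of how the skip-gram variant of word2vec frames its training data: each word is paired with the words that fall inside a small window around it. The function name `context_pairs` and the `window` parameter are illustrative choices for this post, not part of any particular library.

```python
def context_pairs(tokens, window=2):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the cat sat on the mat".split()
pairs = context_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A model trained on pairs like these is pushed to give similar vectors to words that keep showing up with the same neighbors, which is the Distributional Hypothesis in practice.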

For a given word, the algorithm learns to predict its context from the co-occurrence statistics of words and their neighbors in a sequence (a sentence). In doing so, it learns embeddings for words and places them in a vector space where words that are close to each other are semantically similar. Figure 1 below is an example from Grubhub that shows menu items mapped to a three-dimensional embedding space.


Figure 1: Three-dimensional menu embedding example (Source: Grubhub)


Leveraging Word2vec for Other Sequential Data Types

Text is just one example of sequential data that can be used to train embeddings. As the examples from Best Buy, Capital One, Facebook, and Grubhub show, you can create embeddings that capture semantic relationships by training a word2vec-like model on sequences of user actions, such as click sessions.

In the case of click sessions, the click sequence of a consumer is analogous to a sentence and the products in the click sequence are analogous to words.
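As a rough illustration of that analogy, the sketch below turns a raw click log into "sentences" of product IDs. The log layout (session ID, timestamp, product ID), the sample rows, and the function name `sessions_to_sentences` are all assumptions made up for this example, not a description of any company's pipeline.

```python
from collections import defaultdict

# Hypothetical click log: (session_id, timestamp, product_id).
clicks = [
    ("s1", 3, "hdmi_cable"),
    ("s1", 1, "tv_55in"),
    ("s2", 1, "laptop"),
    ("s1", 2, "soundbar"),
    ("s2", 2, "laptop_sleeve"),
]


def sessions_to_sentences(click_log):
    """Group clicks by session and order each session by timestamp."""
    by_session = defaultdict(list)
    for session_id, timestamp, product_id in click_log:
        by_session[session_id].append((timestamp, product_id))
    return {
        session_id: [product for _, product in sorted(events)]
        for session_id, events in by_session.items()
    }


sentences = sessions_to_sentences(clicks)
for session_id, products in sentences.items():
    print(session_id, products)
```

The values of `sentences` form a list of token lists, which is the shape of training corpus that word2vec implementations such as gensim's `Word2Vec` typically accept.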

Best Buy

Best Buy trains product embeddings using a word2vec-like model in order to provide personalized recommendations for their customers. Their model treats products like words and a customer’s sequence of products in a session like a sentence.

In a blog post, Best Buy explains how they leverage this sequence of user activity to gain a semantic understanding of content — “as a user browses and interacts with different content, the abstract qualities of a piece of content can be inferred from what content the user interacts with before and after. This allows us to apply word vector models to learn embeddings for the products based on the assumption that shoppers often buy related items in sequence.”

Capital One

In a post, Capital One described how they can apply the idea of graph embeddings to financial services. They take random “walks” on graphs to generate sequences of nodes that can be viewed similarly to sentences. This allows them to train a model similar to word2vec in order to generate the graph embeddings. The embeddings they learn represent accounts and merchants.

They state that “accounts will be embedded near each other if and only if they tend to shop at the same kinds of merchants. And merchants will be embedded near each other if and only if they tend to receive customers who have similar shopping habits. By analyzing graphs of credit card transactions, we can use representation learning to understand how entities are similar based on how they interact.”
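The random walks described above can be sketched in a few lines. This is only an illustration under stated assumptions: the tiny account-to-merchant graph is invented, and the walk picks the next node uniformly at random (in the style of DeepWalk), since the post does not say which walk strategy Capital One uses.

```python
import random

# Hypothetical bipartite transaction graph: each edge is (account, merchant).
edges = [
    ("acct_1", "coffee_shop"),
    ("acct_1", "bookstore"),
    ("acct_2", "coffee_shop"),
    ("acct_2", "grocery"),
    ("acct_3", "grocery"),
]


def build_adjacency(edge_list):
    """Undirected adjacency lists, so walks alternate account and merchant."""
    adjacency = {}
    for account, merchant in edge_list:
        adjacency.setdefault(account, []).append(merchant)
        adjacency.setdefault(merchant, []).append(account)
    return adjacency


def random_walk(adjacency, start, length, rng):
    """Walk `length` nodes, choosing each next node uniformly at random."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adjacency[walk[-1]]))
    return walk


adjacency = build_adjacency(edges)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
walks = [
    random_walk(adjacency, node, 5, rng)
    for node in adjacency
    for _ in range(2)  # two walks per starting node
]
for walk in walks[:3]:
    print(" ".join(walk))
```

Each walk plays the role of a sentence and each node the role of a word, so accounts that reach the same merchants tend to appear in similar contexts.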


Grubhub

When users search and browse on Grubhub, they are directly and indirectly providing feedback on how items are related. Through this feedback, Grubhub can gain a semantic understanding of query intent to learn a latent food graph. In their case, they embed search queries instead of words.

Grubhub went into more detail about how this works: “if you search for the delicious French dish ‘Magret de Canard’ and convert on the restaurant Le Prive, and if someone else searches for the cuisine ‘French’ and clicks on Le Prive, then at scale there is strong collaborative feedback that Le Prive offers French cuisine and Magret de Canard is French cuisine.”

Grubhub uses their learned embeddings and food graph as part of their recommendation system.
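The collaborative feedback in the Le Prive example can be sketched as a simple grouping step: queries that convert on the same restaurant become linked. This shows only the raw signal, not the embedding model Grubhub trains. The log format, the function names, and the restaurants "Chez Nous" and "Thai Garden" are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical search log: (query, restaurant the diner converted on).
search_log = [
    ("magret de canard", "Le Prive"),
    ("french", "Le Prive"),
    ("french", "Chez Nous"),
    ("pad thai", "Thai Garden"),
]


def queries_by_restaurant(log):
    """Collect the distinct queries that led to each restaurant."""
    grouped = defaultdict(set)
    for query, restaurant in log:
        grouped[restaurant].add(query)
    return grouped


def related_query_pairs(log):
    """Pairs of queries that converted on the same restaurant."""
    pairs = set()
    for queries in queries_by_restaurant(log).values():
        for a, b in combinations(sorted(queries), 2):
            pairs.add((a, b))
    return pairs


pairs = related_query_pairs(search_log)
print(pairs)
```

With this toy log, the only linked pair is "french" and "magret de canard", because both led to Le Prive. At scale, many such links give a model enough signal to place related queries near each other.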


Facebook

Facebook developed what they call ig2vec in order to recommend relevant content to users on Instagram.

With ig2vec, they learn account embeddings in order to determine which accounts are topically similar to each other. They treat Instagram accounts “that a user interacts with — e.g., a person likes media from an account — as a sequence of words in a sentence.” This allows them to train a word2vec-like model that helps them find accounts that are similar to ones that a person previously showed interest in. This in turn helps them recommend content that is more likely to be relevant to a particular Instagram user.
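Once account embeddings exist, finding similar accounts comes down to comparing vectors. The sketch below ranks accounts by cosine similarity to one a person already interacted with. The account names and three-dimensional vectors are invented for illustration, and cosine similarity is an assumed metric, not one the post attributes to ig2vec.

```python
import math

# Hypothetical 3-dimensional account embeddings; real ones come from training.
account_vectors = {
    "@pasta_daily": [0.9, 0.1, 0.0],
    "@bread_lab": [0.8, 0.2, 0.1],
    "@trail_runs": [0.0, 0.9, 0.3],
    "@peak_hikes": [0.1, 0.8, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, vectors, top_k=1):
    """Names of the `top_k` accounts closest to `seed`, excluding itself."""
    scores = [
        (cosine(vectors[seed], vec), name)
        for name, vec in vectors.items()
        if name != seed
    ]
    return [name for _, name in sorted(scores, reverse=True)[:top_k]]


suggestion = most_similar("@pasta_daily", account_vectors)
print(suggestion)
```

Here the food account's nearest neighbor is the other food account, while the two hiking accounts sit close to each other in a different part of the space.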

Similarity Search

This blog post presented a few examples of how companies are leveraging word2vec-like models to learn semantic embeddings.

In part 2 of this blog series, we will describe how you can use semantic embeddings as part of a similarity search pipeline. Similarity search is a key component of recommendation engines, which help companies to deliver personalized content to their customers and in turn gain customer loyalty.