Meta, the parent company of Facebook, Instagram, WhatsApp, Threads, and more, operates one of the world’s largest recommendation systems.
In two recently published papers, researchers show how generative models can be used to better understand and respond to user intent.
Framing recommendation as a generative problem opens up approaches that are richer and more efficient than traditional methods, with important uses in any application that needs to retrieve documents, products, or other types of objects.
Dense vs. generative retrieval
The standard approach to building recommendation systems is to compute, store, and retrieve dense representations (embeddings) of items. For example, to recommend items to users, an application trains a model that can compute embeddings for users’ requests as well as embeddings for a large store of items.
During inference, the recommendation system tries to understand the user’s intent by finding one or more items whose embeddings are similar to the user’s. This approach requires storing the embedding of every item, and every recommendation operation compares the user’s embedding against the entire item store, so storage and compute requirements grow as the number of items increases.
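To make that tradeoff concrete, here is a minimal sketch of dense retrieval, assuming items have already been embedded. The function and variable names are illustrative, not Meta’s code; the point is that every query scans the full embedding store.

```python
# A minimal sketch of dense retrieval over a precomputed item embedding store.
import numpy as np

def recommend_dense(user_embedding: np.ndarray,
                    item_embeddings: np.ndarray,  # shape: (num_items, dim)
                    k: int = 10) -> np.ndarray:
    """Return the indices of the k items most similar to the user."""
    # Cosine similarity between the user and every stored item embedding.
    user = user_embedding / np.linalg.norm(user_embedding)
    items = item_embeddings / np.linalg.norm(item_embeddings, axis=1, keepdims=True)
    scores = items @ user
    # This comparison touches every item, so cost grows with catalog size.
    return np.argsort(-scores)[:k]
```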
Generative retrieval is a newer approach. Rather than searching an embedding database, it tries to understand the user’s intent and make recommendations by predicting the next item in the sequence of the user’s known interactions.
Here’s how it works:
The key to making generative retrieval work is computing a “semantic ID” (SID) that encodes contextual information about each item. Generative retrieval systems such as TIGER operate in two phases. First, an encoder model is trained to create a unique embedding for each item based on its description and properties. These embedding values become the SIDs and are stored with the items.
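Real systems learn their quantizers (TIGER, for instance, uses a residual-quantization scheme). The sketch below only illustrates the general idea of collapsing an embedding into a short tuple of discrete codes, with random codebooks standing in for learned ones.

```python
# A minimal sketch of turning an item embedding into a semantic ID (SID)
# via residual quantization. Codebooks here are random stand-ins; a real
# system learns them from the item corpus.
import numpy as np

def embedding_to_sid(embedding: np.ndarray,
                     codebooks: list[np.ndarray]) -> tuple[int, ...]:
    """Quantize an embedding into a short tuple of discrete codes (the SID)."""
    residual = embedding.copy()
    sid = []
    for codebook in codebooks:  # each codebook has shape (num_codes, dim)
        # Pick the closest code, then quantize whatever is left over.
        idx = int(np.argmin(np.linalg.norm(codebook - residual, axis=1)))
        sid.append(idx)
        residual = residual - codebook[idx]
    return tuple(sid)

# Example: 3 codebook levels of 256 codes each yields SIDs like (17, 203, 54).
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 64)) for _ in range(3)]
item_embedding = rng.normal(size=64)  # would come from the trained encoder
print(embedding_to_sid(item_embedding, codebooks))
```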

In the second phase, a transformer model is trained to predict the next SID in an input sequence. The list of input SIDs represents the user’s past interactions with items, and the model’s prediction is the SID of the item to recommend. Generative retrieval removes the need to store and search the embedding of every individual item, so its inference and storage costs stay constant as the item list grows. It also improves the system’s ability to capture deeper semantic relationships in the data, and provides other benefits of generative models, such as adjusting the temperature to tune the diversity of recommendations.
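As a rough illustration of this second phase, the sketch below samples the next SID token from a stand-in autoregressive model, with temperature scaling controlling diversity. `model` here is hypothetical: any callable that maps a token sequence to logits over the SID vocabulary.

```python
# A minimal sketch of next-SID prediction with temperature sampling.
# `model` is a stand-in for a trained autoregressive transformer.
import numpy as np

def sample_next_sid(model, interaction_sids: list[int],
                    temperature: float = 1.0) -> int:
    """Predict the next item's SID token; higher temperature = more diverse."""
    logits = model(interaction_sids)            # shape: (vocab_size,)
    logits = logits / max(temperature, 1e-6)    # temperature scaling
    probs = np.exp(logits - logits.max())       # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```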
Advanced generative retrieval
Despite its low storage and inference costs, generative retrieval has limitations. For example, models tend to overfit to the items they saw during training, which means they have trouble handling items added to the catalog afterward. In recommendation systems, this is often referred to as the “cold start problem,” which concerns users and items that are new and have no interaction history.
To address these shortcomings, Meta developed a hybrid recommendation system called LIGER, which combines the computational and storage efficiency of generative retrieval with the robust embedding quality and ranking capabilities of dense retrieval.
During training, LIGER uses both similarity scores and next-token objectives to improve the model’s recommendations. During inference, it selects a set of candidates through its generative mechanism, supplements them with cold-start items, and then ranks the combined candidates using their embeddings.
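A minimal sketch of that inference loop, with every helper object as an illustrative stand-in rather than LIGER’s actual API, might look like this:

```python
# A hedged sketch of hybrid inference: generate candidates, mix in
# cold-start items, then rank everything with dense embeddings.
import numpy as np

def recommend_hybrid(generative_model, user_sids, user_embedding,
                     item_embeddings, cold_start_ids, k=10, n_generated=50):
    # Step 1: the generative model proposes candidates from the SID history.
    candidates = set(generative_model.generate(user_sids, n=n_generated))
    # Step 2: add cold-start items the generator never saw during training.
    candidates |= set(cold_start_ids)
    # Step 3: rank all candidates by embedding similarity to the user.
    cand = sorted(candidates)
    scores = item_embeddings[cand] @ user_embedding
    order = np.argsort(-scores)[:k]
    return [cand[i] for i in order]
```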

“The fusion of dense and generative retrieval techniques has tremendous potential to advance recommendation systems,” the researchers wrote, adding that as the models evolve, “they will become increasingly practical in real-world applications, enabling more personalized and responsive user experiences.”
In a separate paper, the researchers introduce a multimodal generative retrieval technique called Multimodal Preference Discerner (Mender), which enables generative models to pick up on users’ implicit preferences from their interactions with different items. Mender builds on the SID-based generative retrieval method and adds components that enrich recommendations with user preferences.
Mender uses a large language model (LLM) to translate user interactions into concrete preferences. For example, if a user has praised or complained about a particular item in a review, the model summarizes that into a preference for that product category.
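As a hedged illustration of that step, the snippet below distills a review into a preference sentence, where `call_llm` is a placeholder for whatever LLM client is available and the prompt wording is invented for the example:

```python
# Illustrative only: turn one raw interaction (a review) into a
# natural-language preference via an LLM. `call_llm` is a placeholder.
PROMPT = (
    "Summarize the user's preference for this product category based on "
    "the review below. Reply in one short sentence.\n\nReview: {review}"
)

def extract_preference(call_llm, review_text: str) -> str:
    """Distill a single review into an explicit user preference."""
    return call_llm(PROMPT.format(review=review_text))

# e.g., "I hate how this blender rattles" might become
# "Prefers quiet, sturdy kitchen appliances."
```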

The main recommender model is trained to condition on both the sequence of user interactions and the user’s preferences when predicting the next semantic ID in the input sequence. This gives the recommender the ability to generalize, perform in-context learning, and adapt to user preferences without being explicitly trained on them.
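One simple way to realize this conditioning, sketched below under the assumption that preferences are encoded as tokens and prepended to the SID history (the paper’s architecture differs in detail), is:

```python
# A hedged sketch: condition next-SID prediction on preference tokens
# plus the interaction history. `model` is a stand-in autoregressive model.
import numpy as np

def recommend_with_preferences(model, preference_tokens: list[int],
                               interaction_sids: list[int],
                               sep_token: int = 0,
                               temperature: float = 1.0) -> int:
    # Prepend the preference conditioning to the SID sequence.
    tokens = preference_tokens + [sep_token] + interaction_sids
    logits = model(tokens) / max(temperature, 1e-6)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```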
“Our contribution paves the way for a new class of generative retrieval models that unlocks the ability to leverage organic data to drive recommendations via textual user preferences,” the researchers write.

Impact on enterprise applications
The efficiency gains of generative retrieval systems could have important implications for enterprise applications. These advances translate into practical benefits such as lower infrastructure costs and faster inference. The technology’s ability to keep storage and inference costs constant regardless of catalog size makes it especially valuable for growing businesses.
The benefits extend across industries, from e-commerce to enterprise search. Generative retrieval is still in its early stages, and we can expect applications and frameworks to emerge as the technology matures.