Replies: 1 comment
PR: #8249
Feature request
Adding query embedding caching
Currently, CacheBackedEmbeddings does not cache query embeddings; a code comment from two years ago notes it "might make sense to hold off to see the most common patterns."
langchainjs/langchain/src/embeddings/cache_backed.ts, line 81 (at commit b7a9cac)
I suspect there is no one-size-fits-all solution, but perhaps we can address some common use cases with an implementation similar to the existing caching of document embeddings.
Motivation
In applications that perform one-shot text classification, some inputs tend to repeat. For example, in a customer service router, a customer states what they need so the application can route them to the correct department, microservice, etc., and a portion of those customer requests are exact matches of each other. Giving developers the option to use such a cache, where it fits their use case, lets them reduce model calls and the associated costs.
Proposal (If applicable)
The proposal is to reuse the approach of the document embedding caching but make query caching optional, so it would be a non-breaking addition that helps anyone who can benefit from it (see the sketch below). As with document caching, we would not provide anything for purging old entries; it would be left to developers to add a TTL or a cleanup process to their cache storage.
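To make the shape of the change concrete, here is a minimal sketch of the opt-in behavior, under stated assumptions rather than the actual implementation: `EmbeddingsLike`, `KeyValueStore`, and the `QueryCachedEmbeddings` wrapper are hypothetical stand-ins for LangChain's own types, and the real change would live inside CacheBackedEmbeddings itself. The key property is that `embedQuery` only consults the cache when a store is supplied, so existing callers are unaffected.

```ts
import { createHash } from "node:crypto";

// Hypothetical stand-ins for LangChain's Embeddings and store types.
interface EmbeddingsLike {
  embedQuery(text: string): Promise<number[]>;
  embedDocuments(texts: string[]): Promise<number[][]>;
}

interface KeyValueStore {
  get(key: string): Promise<number[] | undefined>;
  set(key: string, value: number[]): Promise<void>;
}

class QueryCachedEmbeddings implements EmbeddingsLike {
  constructor(
    private readonly underlying: EmbeddingsLike,
    // Optional: omitting the store preserves today's uncached behavior.
    private readonly queryCache?: KeyValueStore,
  ) {}

  private keyFor(text: string): string {
    // Hash the raw query so arbitrary-length inputs map to fixed-size keys.
    return createHash("sha1").update(text).digest("hex");
  }

  async embedQuery(text: string): Promise<number[]> {
    if (!this.queryCache) return this.underlying.embedQuery(text);
    const key = this.keyFor(text);
    const cached = await this.queryCache.get(key);
    if (cached !== undefined) return cached;
    const vector = await this.underlying.embedQuery(text);
    // No TTL or purging here; eviction is left to the backing store.
    await this.queryCache.set(key, vector);
    return vector;
  }

  async embedDocuments(texts: string[]): Promise<number[][]> {
    // Document caching already exists upstream; pass through unchanged.
    return this.underlying.embedDocuments(texts);
  }
}
```

With this shape, exact-match repeats like the customer service requests described above hit the cache on their second occurrence and skip the model call entirely, while one-off queries pay only the cost of a lookup and a write.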
I already have query caching implemented in CommonJS and deployed, via a patch-package patch, in a production application that I have up and running. I can rewrite it in TypeScript, add tests and documentation, and submit a PR for it in the next few weeks.