OpenAIEmbeddings does not allow setting encoding_format, causing incompatibility with LM Studio (returns float[], not base64) #8221

Open
GregorBiswanger opened this issue May 21, 2025 · 1 comment
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments


GregorBiswanger commented May 21, 2025

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

WORKS with LM Studio if encoding_format is set manually (value doesn't matter)

const openAiClient = new OpenAIClient(createLMStudioConfig());

async function getEmbeddings(text) {
  const response = await openAiClient.embeddings.create({
    model: 'none',
    input: text,
    encoding_format: 'base64', // any value works; LM Studio ignores it, but setting it changes the SDK's behavior
  });

  return response.data[0].embedding;
}
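As a defensive workaround on the client side, both response shapes can be accepted. The helper below is my own sketch (the name normalizeEmbedding is not from the issue); it passes float arrays through unchanged and, when a string arrives, decodes it on the assumption that it is base64-packed little-endian float32 data, which is how the OpenAI API encodes base64 embeddings:

```javascript
// Sketch: normalize an embedding that may arrive as a plain float[] (LM Studio)
// or as a base64-packed float32 string (OpenAI with encoding_format: 'base64').
function normalizeEmbedding(embedding) {
  if (Array.isArray(embedding)) {
    // Already a plain float array; return as-is.
    return embedding;
  }
  // Assume a base64 string of little-endian float32 values.
  const bytes = Buffer.from(embedding, 'base64');
  const floats = [];
  for (let i = 0; i < bytes.length; i += 4) {
    floats.push(bytes.readFloatLE(i));
  }
  return floats;
}
```

With such a helper, getEmbeddings could normalize whatever the backend returns before handing the vector to the vector store.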

DOES NOT WORK with OpenAIEmbeddings – no way to set encoding_format

  const embeddings = new OpenAIEmbeddings({ ...createLMStudioConfig(), model: '' });

  const vectorStore = await QdrantVectorStore.fromExistingCollection(embeddings, {
    url: 'http://localhost:6333',
    collectionName: 'eu-ai',
  });

Error Message and Stack Trace (if applicable)

[Running] node "c:\projects\rag-sample\index.mjs"
✅ Server running at http://localhost:11434
file:///c:/projects/rag-sample/node_modules/@qdrant/openapi-typescript-fetch/dist/esm/fetcher.js:169
throw new fun.Error(err);
^

ApiError: Bad Request
at Object.fun [as searchPoints] (file:///c:/projects/rag-sample/node_modules/@qdrant/openapi-typescript-fetch/dist/esm/fetcher.js:169:23)
at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
at async QdrantClient.search (file:///c:/projects/rag-sample/node_modules/@qdrant/js-client-rest/dist/esm/qdrant-client.js:165:26)
at async QdrantVectorStore.similaritySearchVectorWithScore (file:///c:/projects/rag-sample/node_modules/@langchain/qdrant/dist/vectorstores.js:165:25)
at async QdrantVectorStore.similaritySearch (file:///c:/projects/rag-sample/node_modules/@langchain/core/dist/vectorstores.js:260:25)
at async file:///c:/projects/rag-sample/index.mjs:55:29 {
  headers: Headers {},
  url: 'http://localhost:6333/collections/eu-ai/points/search',
  status: 400,
  statusText: 'Bad Request',
  data: {
    status: {
      error: 'Wrong input: Vector dimension error: expected dim: 768, got 192'
    },
    time: 0.001095872
  }
}

Node.js v22.14.0

[Done] exited with code=1 in 6.972 seconds

Description

I'm using OpenAIEmbeddings from LangChain.js together with LM Studio as the embedding backend. LM Studio exposes an OpenAI-compatible API, but it always returns raw float[] embeddings, regardless of the encoding_format parameter.

Here’s the problem:

  • The LangChainJS OpenAIEmbeddings class does not expose any option to set encoding_format.
  • The underlying openai SDK (v4.x) automatically defaults to encoding_format: "base64" when this is not set.
  • As a result, LangChain expects base64-encoded embeddings, but LM Studio returns float[], leading to exceptions. (The dimension error in the trace, 768 expected vs. 192 received, is off by exactly a factor of 4, the byte width of a float32, which fits a float/base64 mismatch.)
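A simplified model of the interaction the bullets above describe (this is an illustration of the reported behavior, not the actual openai SDK source; all function names here are invented):

```javascript
// Illustrative model: the SDK injects encoding_format when the caller omits it,
// and then expects a base64 response only if it injected the parameter itself.
function buildRequest(params) {
  const userSetFormat = 'encoding_format' in params;
  const request = userSetFormat
    ? params
    : { ...params, encoding_format: 'base64' }; // SDK default when unset
  return { request, sdkExpectsBase64: !userSetFormat };
}

// LM Studio (as reported in this issue): always returns a float array,
// whatever encoding was requested.
function lmStudioRespond(_request) {
  return [0.1, 0.2, 0.3];
}

function embed(params) {
  const { request, sdkExpectsBase64 } = buildRequest(params);
  const embedding = lmStudioRespond(request);
  if (sdkExpectsBase64 && typeof embedding !== 'string') {
    // This mismatch is the failure mode described above.
    throw new Error('SDK expected base64 but got float[]');
  }
  return embedding;
}
```

Under this model, embed({ input: 'hi' }) fails while embed({ input: 'hi', encoding_format: 'float' }) succeeds, matching the observation that any explicitly set value works.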

The only way to make this work is to set encoding_format explicitly in the SDK call. Interestingly, it works with either value ("float" or "base64"), because LM Studio ignores the parameter anyway and always returns float arrays.

However, since OpenAIEmbeddings does not allow setting encoding_format, I had to extend the class with a custom wrapper just to inject this parameter.
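One possible shape for such a wrapper is sketched below. It is not the author's actual code: it simply implements the two methods LangChain's Embeddings interface requires (embedDocuments and embedQuery) by calling an OpenAI-compatible client directly with encoding_format forced. The class name LMStudioEmbeddings is invented, and the client is injected so the example stays self-contained:

```javascript
// Sketch of a wrapper that forces encoding_format on every embeddings call.
// It satisfies the embedDocuments/embedQuery shape that LangChain vector
// stores expect, so it can stand in for OpenAIEmbeddings with LM Studio.
class LMStudioEmbeddings {
  constructor(client, model = 'none') {
    this.client = client; // an OpenAI-compatible client, e.g. from the openai SDK
    this.model = model;
  }

  async embedDocuments(texts) {
    const response = await this.client.embeddings.create({
      model: this.model,
      input: texts,
      encoding_format: 'float', // set explicitly so the SDK does not expect base64
    });
    return response.data.map((d) => d.embedding);
  }

  async embedQuery(text) {
    const [embedding] = await this.embedDocuments([text]);
    return embedding;
  }
}
```

An instance of such a class can then be passed to QdrantVectorStore.fromExistingCollection in place of OpenAIEmbeddings, but ideally OpenAIEmbeddings itself would accept encoding_format as an option.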

System Info

@langchain/community: 0.3.43
@langchain/core: 0.3.56
@langchain/openai: 0.5.10
@langchain/qdrant: 0.1.2
Node: v22.14.0
Platform: Windows 11
Embedding backend: LM-Studio (localhost)

dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label May 21, 2025

erbg commented May 22, 2025

same here.
