.Net: [MEVD] Allow a raw embedding property to reference a source data property to get generated from #11736
Labels: msft.ext.vectordata (Related to Microsoft.Extensions.VectorData), .NET (Issue or Pull requests regarding .NET code)
In #10492, we're adding the ability to map arbitrary properties to a data store vector property, via an IEmbeddingGenerator:
This hides the raw embedding, which was a primary goal of the design (the user shouldn't need to deal with or be aware of ReadOnlyMemory<float>). However, it notably means that users cannot fetch the embedding back from the database. We generally agree that this is a niche scenario (it's quite rare for the embedding to actually be useful, possibly for further custom filtering in .NET), and vector databases indeed don't return vectors by default when searching, but we still want those niche scenarios to be supported.
Today, users can do this by handling embedding generation themselves:
This means that they need to use IEmbeddingGenerator themselves outside of MEVD, and call SearchEmbeddingAsync instead of SearchAsync to pass in the raw embedding.
We could make this better by allowing a raw embedding property to reference a source data property:
When the proposed SourceProperty parameter is set, DescriptionEmbedding is treated like an embedding-generated property, just like today: any default IEmbeddingGenerator is picked up and used when upserting new records. When returning records, both properties are populated from the database as usual, and the user can access the embeddings.
Note that SourceProperty refers to a .NET property, and not to a database property. This means that it can refer to a non-persisted property that e.g. concatenates multiple other .NET properties.
AFAICT there's nothing blocking us from adding this in the future (no breaking change).
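To make the proposed semantics a bit more concrete, here is a rough, self-contained sketch of how a SourceProperty reference could be resolved at upsert time. Everything in it is an assumption for illustration only: SketchVectorAttribute, SketchUpsert, PopulateEmbeddings, the Product shape and the EmbeddingSource property are hypothetical stand-ins, not the MEVD attribute or connector code, and the generator is a plain delegate rather than a real IEmbeddingGenerator.

```csharp
#nullable enable
using System;
using System.Reflection;

// Hypothetical stand-in for the vector property attribute; NOT the real MEVD attribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class SketchVectorAttribute : Attribute
{
    public SketchVectorAttribute(int dimensions) => Dimensions = dimensions;

    public int Dimensions { get; }

    // Name of the .NET property whose value gets embedded (the parameter proposed above).
    public string? SourceProperty { get; set; }
}

// Hypothetical record type used only for this sketch.
public sealed class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Description { get; set; } = "";

    // Non-persisted .NET property that concatenates other properties.
    public string EmbeddingSource => $"{Name}: {Description}";

    // Raw embedding property that points back at its source property.
    [SketchVector(3, SourceProperty = nameof(EmbeddingSource))]
    public ReadOnlyMemory<float> DescriptionEmbedding { get; set; }
}

public static class SketchUpsert
{
    // Fills every vector property that names a source property, using the supplied
    // generator delegate (a stand-in for whatever embedding generator is configured).
    public static void PopulateEmbeddings(object record, Func<string, ReadOnlyMemory<float>> generate)
    {
        Type type = record.GetType();
        foreach (PropertyInfo vector in type.GetProperties())
        {
            var attribute = vector.GetCustomAttribute<SketchVectorAttribute>();

            // No source property: the user supplies the raw embedding themselves.
            if (attribute?.SourceProperty is null) continue;

            PropertyInfo source = type.GetProperty(attribute.SourceProperty)
                ?? throw new InvalidOperationException(
                    $"Source property '{attribute.SourceProperty}' not found on {type.Name}.");

            string text = source.GetValue(record)?.ToString() ?? "";
            vector.SetValue(record, generate(text));
        }
    }
}
```

Calling SketchUpsert.PopulateEmbeddings(product, generator) before writing the record leaves product.DescriptionEmbedding populated from EmbeddingSource, while the embedding property itself stays an ordinary readable property. A real implementation would presumably resolve the source property once when the model is built rather than reflecting on every upsert.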