How to add logic that returns semantic similarity to the FAISS retriever during ensemble hybrid retrieval #5315
Comments
Hello @ArronAI007! I'm Dosu, here to help you solve bugs, answer questions, and guide you in becoming a contributor. While you wait for a human maintainer, feel free to ask me anything!

To add logic that returns semantic similarity to the FAISS retriever in ensemble hybrid retrieval, you can use:

from functools import lru_cache

@lru_cache(maxsize=10)
def create_bm25_retriever(vectorstore: VectorStore):
    docs = list(vectorstore.docstore._dict.values())
    import jieba
    bm25_retriever = BM25Retriever.from_documents(
        docs,
        preprocess_func=jieba.lcut_for_search,
    )
    return bm25_retriever

@staticmethod
def from_vectorstore(
    vectorstore: VectorStore,
    top_k: int,
    score_threshold: int | float,
):
    faiss_retriever = vectorstore.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"score_threshold": score_threshold, "k": top_k},
    )
    bm25_retriever = create_bm25_retriever(vectorstore)
    bm25_retriever.k = top_k
    ensemble_retriever = EnsembleRetriever(
        retrievers=[bm25_retriever, faiss_retriever], weights=[0.5, 0.5]
    )
    return EnsembleRetrieverService(retriever=ensemble_retriever, top_k=top_k)

In this implementation, …
How can I output the actual similarity scores for the results retrieved for a query?
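One way to surface the scores, sketched here under assumptions rather than taken from the project: query the vector store directly instead of going through a retriever, since retrievers hand back bare documents. The sketch assumes the store exposes LangChain's `similarity_search_with_relevance_scores`, which returns `(Document, score)` tuples and accepts `score_threshold` as a keyword argument. The helper name `search_with_scores` and the `"score"` metadata key are hypothetical.

```python
from typing import Any, List, Tuple


def search_with_scores(
    vectorstore: Any,
    query: str,
    top_k: int,
    score_threshold: float,
) -> List[Tuple[Any, float]]:
    """Return (document, relevance score) pairs instead of bare documents."""
    # Assumption: `vectorstore` is a LangChain VectorStore (for example FAISS)
    # whose `similarity_search_with_relevance_scores` yields (Document, score)
    # tuples and filters by the `score_threshold` keyword argument.
    pairs = vectorstore.similarity_search_with_relevance_scores(
        query, k=top_k, score_threshold=score_threshold
    )
    for doc, score in pairs:
        # Also store the score in metadata so it survives code paths that
        # only pass documents along. The "score" key is a hypothetical name.
        doc.metadata["score"] = float(score)
    return list(pairs)
```

Copying the score into `metadata` is a design choice for pipelines that later drop the tuple form. Callers that can consume tuples may ignore it.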
Thank you very much, but you only restated the source code and did not answer the question. May I ask how, in
> def from_vectorstore(
def do_search(
    self,
    query: str,
    top_k: int,
    score_threshold: float = Settings.kb_settings.SCORE_THRESHOLD,
) -> List[Tuple[Document, float]]:
    with self.load_vector_store().acquire() as vs:
        retriever = get_Retriever("ensemble").from_vectorstore(
            vs,
            top_k=top_k,
            score_threshold=score_threshold,
        )
        docs = retriever.get_relevant_documents(query)
    return docs
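Note that `do_search` above is annotated as returning `List[Tuple[Document, float]]`, while `get_relevant_documents` returns plain documents. To my understanding, LangChain's `EnsembleRetriever` merges results with weighted Reciprocal Rank Fusion, which uses only each document's rank, so the FAISS similarity value is not carried through. The following is a minimal, library-free sketch of that fusion which keeps the vector score alongside the fused score. The function name, the use of plain hashable keys in place of `Document` objects, and the constant `c = 60` are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def weighted_rrf_with_scores(
    bm25_ranked: Sequence[Hashable],
    vector_ranked: Sequence[Tuple[Hashable, float]],
    weights: Tuple[float, float] = (0.5, 0.5),
    c: int = 60,
) -> List[Tuple[Hashable, float, Optional[float]]]:
    """Fuse two rankings; return (key, fused score, vector similarity or None)."""
    fused: Dict[Hashable, float] = defaultdict(float)
    similarity: Dict[Hashable, float] = {}

    # BM25 results contribute by rank only (ranks start at 1).
    for rank, key in enumerate(bm25_ranked, start=1):
        fused[key] += weights[0] / (rank + c)

    # Vector results contribute by rank too, but we remember their similarity.
    for rank, (key, score) in enumerate(vector_ranked, start=1):
        fused[key] += weights[1] / (rank + c)
        similarity[key] = score

    # Highest fused score first; documents found only by BM25 get None.
    ordered = sorted(fused, key=lambda k: fused[k], reverse=True)
    return [(key, fused[key], similarity.get(key)) for key in ordered]


if __name__ == "__main__":
    results = weighted_rrf_with_scores(
        ["d1", "d2"], [("d2", 0.9), ("d3", 0.7)]
    )
    for key, fused_score, vector_score in results:
        print(key, round(fused_score, 6), vector_score)
```

Documents that only the BM25 side returns have no semantic similarity to report, which is why the third element is optional. A caller could compute it separately or filter such entries, depending on what the score is used for.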