generated from oracle-devrel/repo-template
update to spring 3.3.1, add upload/download file functions, and add v .1 of python rag chatbot #24
Merged
9 commits
dac02c2 Adding the initial commit for the full stack rag chatbot (ptptiwari)
ca3824b More changes for read me file (ptptiwari)
3c573a4 More changes for read me file (ptptiwari)
91dd16e More changes for read me file (ptptiwari)
fa70cfe Merge pull request #1 from ptptiwari/main (paulparkinson)
ed72c70 update to spring 3.3.1 and add upload/download file functions (paulparkinson)
bf31a2e Merge remote-tracking branch 'origin/main' (paulparkinson)
d9f7e57 python rag chatbot with Oracle Database 23ai v .1 (paulparkinson)
29c840a python rag chatbot with Oracle Database 23ai v .1 (paulparkinson)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# Integrating Oracle Database 23ai RAG and OCI Generative AI with LangChain | ||
|
||
[**Oracle Database 23ai**](https://www.oracle.com/database/free-1/) | ||
|
||
[**OCI GenAI**](https://www.oracle.com/artificial-intelligence/generative-ai/large-language-models/) | ||
|
||
[**LangChain**](https://www.langchain.com/) | ||
|
||
|
||
## TODO instructions | ||
|
||
- set up ~/.oci/config | ||
- set yourcompartmentid | ||
- podman run -d --name 23ai -p 1521:1521 -e ORACLE_PWD=Welcome12345 -v oracle-volume:/Users/pparkins/oradata container-registry.oracle.com/database/free:latest | ||
- create and configure the vector tablespace and user | ||
- add the Oracle Database connection info used by init_rag_streamlit.py / init_rag_streamlit_exp.py | ||
- run run_oracle_bot.sh / run_oracle_bot_exp.sh | ||
|
||
|
||
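The first TODO item depends on an OCI configuration file. As a minimal sketch, assuming the usual INI layout with a `DEFAULT` profile and API-key authentication (keys `user`, `fingerprint`, `tenancy`, `region`, `key_file`), the check below reports which keys a profile is missing. The helper name `missing_oci_keys` is hypothetical and not part of this PR.

```python
import configparser

# Keys a profile normally needs for API-key authentication (assumption).
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_oci_keys(config_text: str, profile: str = "DEFAULT") -> list[str]:
    """Return the required keys that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if profile == "DEFAULT":
        # configparser exposes the DEFAULT section through defaults()
        section = parser.defaults()
    elif parser.has_section(profile):
        section = parser[profile]
    else:
        # Unknown profile: everything is missing
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in section]
```

To check a real file, pass its contents, for example `missing_oci_keys(pathlib.Path.home().joinpath(".oci", "config").read_text())`; an empty list means all of the assumed keys are present.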
## Documentation | ||
This integration is based on the LangChain custom LLM example provided [here](https://python.langchain.com/docs/modules/model_io/models/llms/custom_llm). | ||
|
||
## Features | ||
* How to build a complete, end-to-end RAG solution using LangChain and the Oracle GenAI Service | ||
* How to load multiple PDFs | ||
* How to split PDF pages into smaller chunks | ||
* How to do semantic search using embeddings | ||
* How to use Cohere embeddings | ||
* How to use HF embeddings | ||
* How to set up a retriever using embeddings | ||
* How to add the Cohere reranker to the chain | ||
* How to integrate the OCI GenAI Service with LangChain | ||
* How to define the LangChain chain | ||
* How to use the Oracle vector DB capabilities | ||
* How to use the in-memory database capability | ||
|
||
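Splitting pages into smaller chunks can be illustrated without any library. This is a minimal character-based sketch, not the splitter the project actually uses (the PR does not show it). The function name `split_with_overlap` is hypothetical, and the defaults simply mirror the chunk size and overlap values set in this PR's RAG configuration.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

A larger overlap lowers the chance that a relevant passage is cut in half at a boundary, at the cost of storing and embedding more duplicated text.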
## Oracle BOT | ||
Using the script [run_oracle_bot_exp.sh](run_oracle_bot_exp.sh), you can launch a simple chatbot that showcases the Oracle GenAI service. The demo is based on PDF documents from the Oracle Database documentation. | ||
|
||
You need to put the following files in the local directory: | ||
* Trobleshooting.pdf | ||
* globally-distributed-autonomous-database.pdf | ||
* Oracle True cache.pdf | ||
* oracle-database-23c.pdf | ||
* oracle-globally-distributed-database-guide.pdf | ||
* sharding-adg-addshard-cookbook-3610618.pdf | ||
|
||
You can add more PDFs by editing [config_rag.py](config_rag.py). | ||
|
||
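Instead of adding one `BOOK` constant per file, the list of PDFs could be derived from a directory. This is a minimal sketch, assuming the PDFs live in a `pdfFiles` folder as in the configuration added by this PR; the helper `list_pdfs` is hypothetical and not part of the project.

```python
from pathlib import Path


def list_pdfs(directory: str = "pdfFiles") -> list[str]:
    """Return the paths of the PDF files found directly in `directory`, sorted."""
    folder = Path(directory)
    if not folder.is_dir():
        return []  # a missing folder simply yields no books
    return sorted(
        str(path)
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


# Hypothetical replacement for the hand-written BOOK1..BOOK6 constants.
BOOK_LIST = list_pdfs()
```

With this approach, dropping a new PDF into the folder is enough; the trade-off is that every PDF in the folder gets indexed, so the explicit list remains the better choice when only a subset should be used.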
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
# configurations for the RAG | ||
|
||
# set to True to enable debugging info | ||
DEBUG = False | ||
|
||
# books (PDF files) to use for augmentation | ||
# BOOK1 = "APISpec.pdf" | ||
BOOK1 = "pdfFiles/sharding-adg-addshard-cookbook-3610618.pdf" | ||
BOOK2 = "pdfFiles/globally-distributed-autonomous-database.pdf" | ||
# BOOK4 = "OnBoardingGuide.pdf" | ||
# BOOK5 = "CreateWorkFlow.pdf" | ||
# BOOK6 = "Team Onboarding.pdf" | ||
# BOOK7 = "workflow.pdf" | ||
BOOK3 = "pdfFiles/oracle-database-23c.pdf" | ||
BOOK4 = "pdfFiles/oracle-globally-distributed-database-guide.pdf" | ||
BOOK5 = "pdfFiles/Oracle True cache.pdf" | ||
BOOK6 = "pdfFiles/Trobleshooting.pdf" | ||
# BOOK12 = "OsdCode.pdf" | ||
|
||
BOOK_LIST = [BOOK1, BOOK2, BOOK3, BOOK4, BOOK5, BOOK6] | ||
|
||
|
||
# to divide docs in chunks | ||
CHUNK_SIZE = 1000 | ||
CHUNK_OVERLAP = 50 | ||
|
||
|
||
# | ||
# Vector Store: "CHROME" (Chroma), "FAISS" or "ORACLEDB" | ||
# | ||
# VECTOR_STORE_NAME = "FAISS" | ||
# VECTOR_STORE_NAME = "ORACLEDB" | ||
VECTOR_STORE_NAME = "CHROME" | ||
|
||
|
||
# type of embedding model (the choice is parameterized) | ||
# LOCAL means a Hugging Face (HF) model | ||
EMBED_TYPE = "LOCAL" | ||
# see: https://huggingface.co/spaces/mteb/leaderboard | ||
# see also: https://github.com/FlagOpen/FlagEmbedding | ||
# base seems to work better than small | ||
# EMBED_HF_MODEL_NAME = "BAAI/bge-base-en-v1.5" | ||
# EMBED_HF_MODEL_NAME = "BAAI/bge-small-en-v1.5" | ||
EMBED_HF_MODEL_NAME = "BAAI/bge-large-en-v1.5" | ||
|
||
# COHERE means the embedding model served by the Cohere API | ||
# EMBED_TYPE = "COHERE" | ||
EMBED_COHERE_MODEL_NAME = "embed-english-v3.0" | ||
|
||
# number of docs to return from Retriever | ||
MAX_DOCS_RETRIEVED = 6 | ||
|
||
# to add Cohere reranker to the QA chain | ||
ADD_RERANKER = False | ||
|
||
# | ||
# LLM Config | ||
# | ||
# LLM_TYPE = "COHERE" | ||
LLM_TYPE = "OCI" | ||
|
||
# max tokens returned from LLM for single query | ||
MAX_TOKENS = 1000 | ||
# to avoid "creativity" | ||
TEMPERATURE = 0 | ||
|
||
# | ||
# OCI GenAI configs | ||
# | ||
TIMEOUT = 30 | ||
ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com" |
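The retriever setting in this configuration caps how many chunks are returned per query. The idea behind embedding-based retrieval can be sketched in plain Python: rank the stored vectors by cosine similarity to the query vector and keep the top k. This is an illustration only, since the PR configures a vector store for this job; `cosine_similarity`, `top_k`, and the toy vectors in the usage note are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], docs: dict[str, list[float]], k: int = 6) -> list[str]:
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query, docs[name]),
        reverse=True,
    )
    return ranked[:k]
```

For example, with `docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}`, the call `top_k([1.0, 0.1], docs, k=2)` ranks `"a"` first and `"c"` second. A real vector store replaces this exhaustive scan with an index, which is what makes retrieval practical on large collections.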
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
feedback_0 |
Just checking this password isn't real - it's OK as long as we don't show any IP address or associated endpoint 👍🏻