update to spring 3.3.1, add upload/download file functions, and add v .1 of python rag chatbot #24

Merged
merged 9 commits into from
Jul 14, 2024
3 changes: 2 additions & 1 deletion README.md
@@ -15,7 +15,8 @@ MISSING
MISSING

## Notes/Issues
MISSING
Spring Boot 3.0 requires Java 17 as a minimum version.


## URLs
https://apexapps.oracle.com/pls/apex/r/dbpm/livelabs/view-workshop?wid=3874
55 changes: 8 additions & 47 deletions pom.xml
@@ -1,11 +1,11 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>3.1.4</version>
<version>3.3.1</version>
<relativePath/>
</parent>
<groupId>oracleai</groupId>
@@ -15,26 +15,13 @@
<description>Oracle AI Demos</description>

<properties>
<spring-cloud.version>2021.0.5</spring-cloud.version>
<oracle.jdbc.version>21.7.0.0</oracle.jdbc.version>
<spring.boot.version>3.1.2</spring.boot.version>
<snakeyaml.version>1.31</snakeyaml.version>
<spring.vault.version>3.1.1</spring.vault.version>
<oci.sdk.version>3.29.0</oci.sdk.version>
<jib-maven-plugin.version>3.3.1</jib-maven-plugin.version>
<liquibase.version>4.17.2</liquibase.version>
<oci.sdk.version>3.44.2</oci.sdk.version>
</properties>

<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<exclusions>
<exclusion>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
@@ -45,11 +32,10 @@
<artifactId>json</artifactId>
<version>20231013</version>
</dependency>

<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-common</artifactId>
<version>${oci.sdk.version}</version>
<groupId>com.oracle.cloud.spring</groupId>
<artifactId>spring-cloud-oci-starter</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
@@ -59,7 +45,7 @@
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
<artifactId>oci-java-sdk-generativeaiinference</artifactId>
<version>3.32.1</version>
<version>${oci.sdk.version}</version>
</dependency>
<dependency>
<groupId>com.oracle.oci.sdk</groupId>
@@ -87,20 +73,6 @@
<artifactId>slf4j-simple</artifactId>
<version>2.0.6</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>${spring.boot.version}</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<version>${spring.boot.version}</version>
<scope>test</scope>
</dependency>



<dependency>
<groupId>javax.xml.bind</groupId>
<artifactId>jaxb-api</artifactId>
@@ -122,18 +94,7 @@
<artifactId>service</artifactId>
<version>0.12.0</version>
</dependency>
</dependencies>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-dependencies</artifactId>
<version>${spring-cloud.version}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
</dependencies>
<build>
<plugins>
<plugin>
52 changes: 52 additions & 0 deletions python-rag-chatbot/README.md
@@ -0,0 +1,52 @@
# Integrating Oracle Database 23ai RAG and OCI Generative AI with LangChain

[**Oracle Database 23ai**](https://www.oracle.com/database/free-1/)

[**OCI GenAI**](https://www.oracle.com/artificial-intelligence/generative-ai/large-language-models/)

[**LangChain**](https://www.langchain.com/)


## TODO instructions

- set up ~/.oci/config
- set yourcompartmentid
- podman run -d --name 23ai -p 1521:1521 -e ORACLE_PWD=Welcome12345 -v oracle-volume:/Users/pparkins/oradata container-registry.oracle.com/database/free:latest

Review comment (Member): Just checking this password isn't real. It's OK as long as we don't show any IP address or associated endpoint 👍🏻

- create/config vector tablespace and user
- add oracle database info for use in init_rag_streamlit.py / init_rag_streamlit_exp.py
- run run_oracle_bot.sh / run_oracle_bot_exp.sh
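
As a rough aid for the `~/.oci/config` step, the sketch below checks that a profile contains the keys an OCI SDK config file usually carries. It is only an illustration: the helper name `missing_oci_keys`, the `REQUIRED_KEYS` tuple, and the sample values are assumptions, not part of this project.

```python
import configparser

# Keys commonly present in an OCI SDK config profile (assumed for this check).
REQUIRED_KEYS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_oci_keys(config_text: str, profile: str = "DEFAULT") -> list[str]:
    """Return the required keys that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # DEFAULT is exposed through parser.defaults(); other profiles are sections.
    if profile == "DEFAULT":
        values = parser.defaults()
    elif parser.has_section(profile):
        values = parser[profile]
    else:
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in values]


# Placeholder values only; replace with the contents of your own ~/.oci/config.
sample = """
[DEFAULT]
user=ocid1.user.oc1..example
fingerprint=aa:bb:cc
tenancy=ocid1.tenancy.oc1..example
region=us-chicago-1
"""

print(missing_oci_keys(sample))
```

To check a real file, read it with `pathlib.Path("~/.oci/config").expanduser().read_text()` and pass the text to the helper.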


## Documentation
This integration is based on the custom LLM example from LangChain, available [here](https://python.langchain.com/docs/modules/model_io/models/llms/custom_llm).

## Features
* How to build a complete, end-to-end RAG solution using LangChain and the Oracle GenAI Service
* How to load multiple PDFs
* How to split PDF pages into smaller chunks
* How to do semantic search using embeddings
* How to use Cohere embeddings
* How to use HF embeddings
* How to set up a retriever using embeddings
* How to add the Cohere reranker to the chain
* How to integrate the OCI GenAI Service with LangChain
* How to define the LangChain
* How to use the Oracle vector DB capabilities
* How to use the in-memory database capability
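
To make the chunking step concrete, here is a minimal plain-Python sketch of splitting text into overlapping windows. It is not the project's splitter (that code is not part of this diff, and a LangChain text splitter is the more likely choice); the function name `split_into_chunks` is hypothetical, and the defaults simply mirror `CHUNK_SIZE = 1000` and `CHUNK_OVERLAP = 50` from `config_rag.py`.

```python
def split_into_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    # Each new window starts chunk_size - chunk_overlap characters after the last.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_into_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap means the last character of one chunk is repeated at the start of the next, so a sentence cut at a boundary still appears partly in both chunks.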

## Oracle BOT
Using the script [run_oracle_bot_exp.sh](run_oracle_bot_exp.sh) you can launch a simple chatbot that showcases the Oracle GenAI service. The demo is based on PDFs from the Oracle Database documentation.

You need to put the following files in the local directory:
* Trobleshooting.pdf
* globally-distributed-autonomous-database.pdf
* Oracle True cache.pdf
* oracle-database-23c.pdf
* oracle-globally-distributed-database-guide.pdf
* sharding-adg-addshard-cookbook-3610618.pdf

You can add more PDFs by editing [config_rag.py](config_rag.py).
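
As an illustration of that edit, the sketch below adds one more entry to `BOOK_LIST` and reports any listed file that is not on disk. `BOOK7`, its file name, and the `missing_books` helper are hypothetical and only serve the example; the list is shortened to keep it brief.

```python
from pathlib import Path

# Existing entries from config_rag.py (abbreviated to two for the sketch).
BOOK1 = "pdfFiles/sharding-adg-addshard-cookbook-3610618.pdf"
BOOK2 = "pdfFiles/globally-distributed-autonomous-database.pdf"
# Hypothetical new entry; the file name is an assumption for illustration.
BOOK7 = "pdfFiles/my-extra-guide.pdf"

BOOK_LIST = [BOOK1, BOOK2, BOOK7]


def missing_books(book_list, base_dir="."):
    """Return the entries of book_list that do not exist under base_dir."""
    return [book for book in book_list if not (Path(base_dir) / book).is_file()]


print(missing_books(BOOK_LIST))
```

Running a check like this before launching the bot can catch a misspelled path early.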




71 changes: 71 additions & 0 deletions python-rag-chatbot/config_rag.py
@@ -0,0 +1,71 @@
# configurations for the RAG

# to enable debugging info..
DEBUG = False

# books to use for augmentation
# BOOK1 = "APISpec.pdf"
BOOK1 = "pdfFiles/sharding-adg-addshard-cookbook-3610618.pdf"
BOOK2 = "pdfFiles/globally-distributed-autonomous-database.pdf"
# BOOK4 = "OnBoardingGuide.pdf"
# BOOK5 = "CreateWorkFlow.pdf"
# BOOK6 = "Team Onboarding.pdf"
# BOOK7 = "workflow.pdf"
BOOK3 = "pdfFiles/oracle-database-23c.pdf"
BOOK4 = "pdfFiles/oracle-globally-distributed-database-guide.pdf"
BOOK5 = "pdfFiles/Oracle True cache.pdf"
BOOK6 = "pdfFiles/Trobleshooting.pdf"
# BOOK12 = "OsdCode.pdf"

BOOK_LIST = [BOOK1, BOOK2, BOOK3, BOOK4, BOOK5, BOOK6]


# to divide docs into chunks
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 50


#
# Vector Store ("CHROME", "FAISS", or "ORACLEDB")
#
# VECTOR_STORE_NAME = "FAISS"
# VECTOR_STORE_NAME = "ORACLEDB"
VECTOR_STORE_NAME = "CHROME"


# type of Embedding Model. The choice has been parametrized
# Local means HF
EMBED_TYPE = "LOCAL"
# see: https://huggingface.co/spaces/mteb/leaderboard
# see also: https://github.com/FlagOpen/FlagEmbedding
# base seems to work better than small
# EMBED_HF_MODEL_NAME = "BAAI/bge-base-en-v1.5"
# EMBED_HF_MODEL_NAME = "BAAI/bge-small-en-v1.5"
EMBED_HF_MODEL_NAME = "BAAI/bge-large-en-v1.5"

# Cohere means the embed model from Cohere site API
# EMBED_TYPE = "COHERE"
EMBED_COHERE_MODEL_NAME = "embed-english-v3.0"

# number of docs to return from Retriever
MAX_DOCS_RETRIEVED = 6

# to add Cohere reranker to the QA chain
ADD_RERANKER = False

#
# LLM Config
#
# LLM_TYPE = "COHERE"
LLM_TYPE = "OCI"

# max tokens returned from LLM for single query
MAX_TOKENS = 1000
# to avoid "creativity"
TEMPERATURE = 0

#
# OCI GenAI configs
#
TIMEOUT = 30
ENDPOINT = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
1 change: 1 addition & 0 deletions python-rag-chatbot/copy.txt
@@ -0,0 +1 @@
feedback_0