Commit ccd4cf7

Merge pull request #199 from Madhuvod/local-rag-qwen
Added new Demo: Local RAG Agent with Qwen 3 and Gemma 3 Models
2 parents f9f0074 + cf7c4ee commit ccd4cf7

3 files changed: +666 -0 lines changed

@@ -0,0 +1,113 @@
# 🐋 Qwen 3 Local RAG Reasoning Agent

This application demonstrates how to build a Retrieval-Augmented Generation (RAG) system on top of Qwen 3 and Gemma 3 models running locally via Ollama. It combines document processing, vector search, and optional web search so that responses are grounded in retrieved context rather than in the model's memory alone.

## Features

- **🧠 Multiple Local LLM Options**:

  - Qwen3 (1.7b, 8b) - Alibaba's latest language models
  - Gemma3 (1b, 4b) - Google's efficient language models with multimodal capabilities
  - DeepSeek (1.5b) - Alternative model option
- **📚 Comprehensive RAG System**:

  - Upload and process PDF documents
  - Extract content from web URLs
  - Intelligent chunking and embedding
  - Similarity search with adjustable threshold
- **🌐 Web Search Integration**:

  - Fallback to web search when document knowledge is insufficient
  - Configurable domain filtering
  - Source attribution in responses
- **🔄 Flexible Operation Modes**:

  - Toggle between RAG and direct LLM interaction
  - Force web search when needed
  - Adjust similarity thresholds for document retrieval
- **💾 Vector Database Integration**:

  - Qdrant vector database for efficient similarity search
  - Persistent storage of document embeddings
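
To make "similarity search with adjustable threshold" concrete, here is a minimal pure-Python sketch. It is not the app's code: in the app, Qdrant performs the search. The function names, the `0.7` default threshold, and the `top_k` limit are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_with_threshold(
    query_vec: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    threshold: float = 0.7,  # assumed default, not taken from the app
    top_k: int = 5,          # assumed default, not taken from the app
) -> List[Tuple[float, str]]:
    """Return up to top_k (score, text) pairs scoring at least threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]
```

Raising the threshold trades recall for precision: fewer chunks pass, so the fallback to web search triggers more often.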
32+
33+
## How to Get Started
34+
35+
### Prerequisites
36+
37+
- [Ollama](https://ollama.ai/) installed locally
38+
- Python 3.8+
39+
- Qdrant account (free tier available) for vector storage
40+
- Exa API key (optional, for web search capability)
41+
42+
### Installation

1. Clone the GitHub repository:

```bash
git clone https://github.com/Shubhamsaboo/awesome-llm-apps.git
cd awesome-llm-apps/rag_tutorials/qwen_local_rag
```

2. Install the required dependencies:

```bash
pip install -r requirements.txt
```

3. Pull the required models using Ollama:

```bash
ollama pull qwen3:1.7b  # Or any other chat model you want to use
ollama pull snowflake-arctic-embed  # Embedding model used to embed document chunks
```

4. Get your API keys:

   - Qdrant API key and URL (for vector database)
   - Exa API key (optional, for web search)
5. Run the application:

```bash
streamlit run qwen_local_rag_agent.py
```
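
If you prefer to keep the keys from step 4 out of the UI, one option is to read them from environment variables. The sketch below is an assumption, not how the app is known to work: the variable names `QDRANT_URL`, `QDRANT_API_KEY`, and `EXA_API_KEY` and the `ApiKeys` class are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ApiKeys:
    """Hypothetical container for the credentials listed in step 4."""
    qdrant_url: str
    qdrant_api_key: str
    exa_api_key: Optional[str] = None  # optional: only needed for web search

    @property
    def web_search_available(self) -> bool:
        return bool(self.exa_api_key)


def load_api_keys(env: Mapping[str, str] = os.environ) -> ApiKeys:
    """Read the keys from env; the variable names are assumptions."""
    required = ("QDRANT_URL", "QDRANT_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return ApiKeys(
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        exa_api_key=env.get("EXA_API_KEY") or None,
    )
```

Failing fast on the two Qdrant settings while treating the Exa key as optional mirrors the prerequisites: vector storage is required, web search is not.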

## How It Works

1. **Document Processing**:

   - PDF files are loaded with PyPDFLoader
   - Web content is extracted with WebBaseLoader
   - Documents are split into chunks with RecursiveCharacterTextSplitter
2. **Vector Database**:

   - Document chunks are embedded using Ollama's embedding models
   - Embeddings are stored in a Qdrant vector database
   - Similarity search retrieves the chunks most relevant to the query
3. **Query Processing**:

   - Each query is routed to the most suitable information source
   - Retrieved documents are checked against the similarity threshold
   - The system falls back to web search if no relevant documents are found
4. **Response Generation**:

   - The local LLM (Qwen/Gemma) generates a response from the retrieved context
   - Sources are cited and displayed to the user
   - Web search results are clearly indicated when used
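
The chunking step in the list above can be illustrated with a deliberately simpler technique than RecursiveCharacterTextSplitter: a fixed-size sliding window with overlap. The real splitter also tries to break at natural separators such as paragraphs; this sketch does not. The function name and the `chunk_size` and `chunk_overlap` defaults are assumptions, not the app's settings.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

The overlap exists so that a sentence cut at a window boundary still appears whole in one of the two neighbouring chunks, which helps retrieval find it.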

## Configuration Options

- **Model Selection**: Choose between different Qwen, Gemma, and DeepSeek models
- **RAG Mode**: Toggle between RAG-enabled and direct LLM interaction
- **Search Tuning**: Adjust similarity threshold for document retrieval
- **Web Search**: Enable/disable web search fallback and configure domain filtering
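
One way to picture how these options interact is the routing sketch below. It is a guess at reasonable precedence, not the app's actual logic: the `Settings` class, its field names, the `0.7` default, and the order of the checks are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Sequence
from urllib.parse import urlparse


@dataclass
class Settings:
    """Hypothetical mirror of the options listed above."""
    model: str = "qwen3:1.7b"
    rag_enabled: bool = True
    similarity_threshold: float = 0.7  # assumed default
    web_search_enabled: bool = False
    force_web_search: bool = False
    allowed_domains: List[str] = field(default_factory=list)


def choose_source(settings: Settings, hit_scores: Sequence[float]) -> str:
    """Return 'llm', 'documents', or 'web' for a query (assumed precedence)."""
    if not settings.rag_enabled:
        return "llm"  # direct LLM interaction, no retrieval
    if settings.force_web_search and settings.web_search_enabled:
        return "web"
    if any(score >= settings.similarity_threshold for score in hit_scores):
        return "documents"
    # Nothing relevant in the documents: fall back to the web if allowed.
    return "web" if settings.web_search_enabled else "llm"


def domain_allowed(url: str, allowed_domains: Sequence[str]) -> bool:
    """True if no filter is set, or the URL's host is an allowed domain or one of its subdomains."""
    if not allowed_domains:
        return True
    host = (urlparse(url).hostname or "").lower()
    return any(host == d.lower() or host.endswith("." + d.lower()) for d in allowed_domains)
```

Matching on the parsed hostname rather than on a substring of the URL avoids accepting look-alike hosts such as `notexample.com` when `example.com` is allowed.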

## Use Cases

- **Document Q&A**: Ask questions about your uploaded documents
- **Research Assistant**: Combine document knowledge with web search
- **Local Privacy**: Run LLM inference and embedding locally through Ollama, so documents and prompts are not sent to a hosted LLM API (embeddings are stored in the Qdrant instance you configure)
- **Offline Operation**: Generate responses with limited or no internet access once the models are pulled; web search and a hosted Qdrant instance still need a connection

## Requirements

See `requirements.txt` for the complete list of dependencies.
