An intelligent chatbot powered by a local LLM that can role-play and answer questions based on PDF content.
- Python 3.10+
- Streamlit – Interface
- PyMuPDF (fitz) – PDF text extraction
- LangChain – Chunking, Retriever, PromptTemplate
- FAISS – Vector database
- Ollama (Qwen2.5:latest) – Embedding and LLM
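As an illustration of how the stack fits together: PyMuPDF extracts the raw text, LangChain splits it into overlapping chunks, Ollama embeds the chunks, and FAISS stores the vectors for retrieval. The chunking step can be sketched in plain Python (a simplified stand-in for LangChain's `RecursiveCharacterTextSplitter`; the size and overlap values are illustrative, not the project's actual settings):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split extracted PDF text into overlapping chunks for retrieval.

    Simplified stand-in for LangChain's RecursiveCharacterTextSplitter:
    it slices on raw character offsets and ignores sentence boundaries.
    """
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# Example: 10 characters, chunks of 4 with 1 character of overlap.
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))  # → ['abcd', 'defg', 'ghij']
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.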
```text
pdf-chatbot/
│
├── app.py              # Main application (Streamlit interface)
├── pdf_handler.py      # Extracts text from the PDF and chunks it
├── embedder.py         # Embedding + FAISS database operations
├── chatbot.py          # Response generation with Qwen2.5 (via Ollama)
├── prompts.py          # Role-based prompt templates
├── roles.json          # Defines the list of roles
├── vectordb/           # FAISS files are stored here (created when the app runs)
├── data/               # User-uploaded PDFs (created when the app runs)
├── requirements.txt    # Required libraries
└── README.md           # Project description
```
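`roles.json` holds the role definitions the chatbot can adopt. Its actual schema is not shown in this README; a plausible minimal layout (the field names here are illustrative assumptions, not the project's real keys) might look like:

```json
[
  {"name": "Teacher", "system_prompt": "Explain the document patiently, with examples."},
  {"name": "Lawyer", "system_prompt": "Answer formally, citing the relevant passages."}
]
```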
- Set up the Python environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # For Windows: venv\Scripts\activate
  pip install -r requirements.txt
  ```
- Ensure Ollama is running: Make sure the `qwen2.5:latest` model is available in Ollama. If it is not installed:

  ```bash
  ollama pull qwen2.5:latest
  ```

  The Ollama service must be running in the background (usually started with the `ollama serve` command or by the Ollama Desktop application). It is not necessary to run the model separately with `ollama run qwen2.5:latest` while the application is running. To list installed models:

  ```bash
  ollama list
  ```

  You should see `qwen2.5:latest` (or a similar Qwen2.5 tag) in the output.

- Start the application: From the project's main directory (`pdf-chatbot/`):

  ```bash
  streamlit run app.py
  ```
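The model check in the step above can also be done programmatically (for example, as a startup check before launching the UI): Ollama's HTTP API lists installed models at `GET /api/tags`. A small stdlib-only sketch, assuming Ollama's default local port 11434 (the helper names are illustrative, not functions from this project):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint

def installed_models(tags_payload: dict) -> list[str]:
    # GET /api/tags responds with {"models": [{"name": "qwen2.5:latest", ...}, ...]}
    return [m["name"] for m in tags_payload.get("models", [])]

def has_model(name: str) -> bool:
    """Return True if `name` appears in the local Ollama model list."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags") as resp:
        return name in installed_models(json.load(resp))

# The parsing half works on a captured payload, without a running server:
sample = {"models": [{"name": "qwen2.5:latest"}, {"name": "llama3:8b"}]}
print(installed_models(sample))  # → ['qwen2.5:latest', 'llama3:8b']
```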
| Aspect | Description |
|---|---|
| 🎯 Niche | An AI capable of role-playing and providing document-based consultancy. |
| 🧠 LLM Usage | A local model (Qwen2.5 via Ollama) offers privacy and offline operation. |
| 🛠️ RAG Arch. | A data-driven approach using LangChain + FAISS, unlike classic chatbots. |
| 💼 CV Contrib. | Combines LangChain, FAISS, Ollama, and PDF parsing in a real-world use case. |