Role-based PDF QA Chatbot (LangChain + FAISS + Qwen2.5 via Ollama)

An intelligent chatbot powered by a local LLM that can role-play and answer questions based on PDF content.

Technologies Used

  • Python 3.10+
  • Streamlit – user interface
  • PyMuPDF (fitz) – PDF text extraction
  • LangChain – chunking, retriever, PromptTemplate
  • FAISS – vector database
  • Ollama (qwen2.5:latest) – embeddings and LLM
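
A minimal sketch of how these pieces fit together – illustrative only, assuming the langchain-community integrations; the variable names, chunk sizes, and prompt text below are assumptions, not the project's actual code:

    # Minimal RAG sketch: chunk the PDF text, index it in FAISS,
    # retrieve relevant chunks, and answer with the local LLM.
    from langchain.prompts import PromptTemplate
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain_community.chat_models import ChatOllama
    from langchain_community.embeddings import OllamaEmbeddings
    from langchain_community.vectorstores import FAISS

    pdf_text = "..."  # full text extracted from the PDF (see pdf_handler.py)

    # 1. Chunk the document so each piece fits the context window.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_text(pdf_text)

    # 2. Embed the chunks with the local Ollama model and index them in FAISS.
    embeddings = OllamaEmbeddings(model="qwen2.5:latest")
    db = FAISS.from_texts(chunks, embeddings)
    retriever = db.as_retriever(search_kwargs={"k": 4})

    # 3. Fill a role-based prompt with retrieved context and generate an answer.
    prompt = PromptTemplate(
        input_variables=["role", "context", "question"],
        template=(
            "You are a {role}. Answer using only the context below.\n\n"
            "Context:\n{context}\n\nQuestion: {question}"
        ),
    )
    llm = ChatOllama(model="qwen2.5:latest")

    question = "What are the key findings of this report?"
    context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))
    answer = llm.invoke(prompt.format(role="financial analyst", context=context, question=question))
    print(answer.content)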

Project Folder Structure

pdf-chatbot/
β”‚
β”œβ”€β”€ app.py                  # Main application (Streamlit interface)
β”œβ”€β”€ pdf_handler.py          # Extracts text from PDF and chunks it
β”œβ”€β”€ embedder.py             # Embedding + FAISS database operations
β”œβ”€β”€ chatbot.py              # Response generation with Qwen2.5 (via Ollama)
β”œβ”€β”€ prompts.py              # Role-based prompt templates
β”œβ”€β”€ roles.json              # Defines the list of roles
β”œβ”€β”€ vectordb/               # FAISS files are stored here (created when the app runs)
β”œβ”€β”€ data/                   # User-uploaded PDFs (created when the app runs)
β”œβ”€β”€ requirements.txt        # Required libraries
└── README.md               # Project description
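
As a rough illustration of pdf_handler.py's job (a hedged sketch – the function name and parameters are assumptions, not the repository's actual code), extraction and chunking with PyMuPDF and LangChain could look like:

    # Sketch of the pdf_handler.py role: extract text with PyMuPDF, then chunk it.
    import fitz  # PyMuPDF
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    def extract_and_chunk(pdf_path: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
        """Return overlapping text chunks extracted from the given PDF."""
        with fitz.open(pdf_path) as doc:
            text = "\n".join(page.get_text() for page in doc)
        splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=overlap)
        return splitter.split_text(text)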

Installation Instructions

  1. Set up the Python environment:

    python -m venv venv
    source venv/bin/activate  # For Windows: venv\Scripts\activate
    pip install -r requirements.txt
  2. Ensure Ollama is running: make sure the qwen2.5:latest model is available locally. If it is not installed yet, pull it:

    ollama pull qwen2.5:latest

    The Ollama service must be running in the background (usually started with the ollama serve command, or by having the Ollama desktop application open). You do not need to run the model separately with ollama run qwen2.5:latest while the application is running.

    To list installed models:

    ollama list

    You should see the qwen2.5:latest model (or a similar Qwen2.5 tag) in this list. A quick way to verify the service from Python is sketched after these steps.

  3. Start the application: from the project's root directory (pdf-chatbot/), run:

    streamlit run app.py
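
Before launching Streamlit, you can confirm the Ollama service is reachable (the check mentioned in step 2). Ollama exposes a local HTTP API on http://localhost:11434 by default, and GET /api/tags lists the installed models; the snippet below is an illustrative sanity check, not part of the project:

    # Sanity check: is the Ollama service up, and is a Qwen2.5 model installed?
    import requests

    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print("Installed models:", models)
    assert any(name.startswith("qwen2.5") for name in models), "qwen2.5 is not installed"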

Project Niche and Added Value

  • 📌 Niche – an AI that can role-play and provide document-based consultancy.
  • 🧠 LLM usage – a local model (Qwen2.5 via Ollama), offering privacy and fully offline operation.
  • 🛠️ RAG architecture – a retrieval-augmented, data-driven design built on LangChain + FAISS, unlike classic chatbots.
  • 💼 CV contribution – combines LangChain, FAISS, Ollama, and PDF parsing in a real-world use case.
