This is an advanced web application that allows you to have an intelligent conversation with your PDF documents. Upload a PDF, and the application will process it, enabling you to ask questions, get summaries, and receive insightful answers based on the document's content. This project leverages the power of large language models (LLMs) and vector databases to create a seamless and interactive experience.
You can try out the live application here: https://pdf-bot-akshat.streamlit.app/
- Interactive Chat Interface: A clean and user-friendly interface built with Streamlit for uploading PDFs and engaging in a conversation.
- Conversational Memory: The chatbot remembers the context of the conversation, allowing for follow-up questions and a more natural dialogue.
- High-Quality Answers: Leverages state-of-the-art language models from Google's Generative AI suite to provide accurate and relevant answers.
- Fast and Efficient: Utilizes parallel processing (multi-threading) to quickly prepare your PDF for questioning, ensuring a smooth user experience (see the sketch after this list).
- Advanced Text Refinement: Each part of the document is refined by an AI model to improve clarity and context before being used for answering questions.
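As a rough illustration of the multi-threaded preparation step, chunk refinement can be fanned out over a thread pool so that many model calls run concurrently. This is only a minimal sketch, not the repository's actual code; the `refine_chunk` helper, the prompt wording, the worker count, and the model name are assumptions.

```python
# Sketch: refine text chunks in parallel with a thread pool.
# Assumes the langchain-google-genai package and a GOOGLE_API_KEY in the environment.
from concurrent.futures import ThreadPoolExecutor

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")  # model name is an assumption

def refine_chunk(chunk):
    # Hypothetical prompt: ask the model to clean up one chunk of text.
    prompt = f"Rewrite the following passage so it is clear and self-contained:\n\n{chunk}"
    return llm.invoke(prompt).content

def refine_all(chunks, workers=8):
    # executor.map preserves the original chunk order.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(refine_chunk, chunks))
```

Because each refinement call is network-bound rather than CPU-bound, threads are sufficient to overlap the waiting time; no multiprocessing is needed.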
The application follows a sophisticated process to enable a conversation with your PDF:
- PDF Loading: The uploaded PDF is loaded and its text content is extracted using `PyMuPDF`.
- Text Chunking: The extracted text is split into smaller, manageable chunks using a `RecursiveCharacterTextSplitter`. This is crucial for fitting the context into the language model's limits.
- Chunk Refinement: Each text chunk is individually refined by a language model (`gemini-pro`) to improve its clarity and coherence. This step is parallelized for maximum speed.
- Embedding and Indexing: The refined chunks are converted into vector embeddings using `GoogleGenerativeAIEmbeddings` and stored in a `FAISS` vector store for efficient similarity searches.
- Conversational Chain: A `ConversationalRetrievalChain` is created, which uses the vector store as a retriever and a powerful language model as the question-answering engine. This chain is what enables the conversational memory.
- User Interaction: The Streamlit interface captures the user's questions, passes them to the conversational chain along with the chat history, and displays the model's response. A sketch of how these pieces fit together follows this list.
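The steps above map closely onto LangChain primitives. The snippet below is a minimal sketch of how such a pipeline is typically wired together, using the class names listed above; the file path, chunk sizes, embedding model, and chat model are illustrative assumptions rather than the app's actual configuration.

```python
# Sketch: a conversational retrieval pipeline over a PDF with LangChain.
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# 1. Load the PDF and extract its text (PyMuPDF under the hood).
docs = PyMuPDFLoader("example.pdf").load()

# 2. Split the text into overlapping chunks; sizes are illustrative.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# (The refinement step described above would rewrite each chunk at this point.)

# 3. Embed the chunks and index them in a FAISS vector store.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector_store = FAISS.from_documents(chunks, embeddings)

# 4. Wire the retriever, the chat model, and conversational memory together.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatGoogleGenerativeAI(model="gemini-pro"),
    retriever=vector_store.as_retriever(),
    memory=memory,
)

# 5. Ask a question; the chain retrieves relevant chunks and answers in context.
print(chain.invoke({"question": "What is this document about?"})["answer"])
```

With the memory object attached, follow-up questions are automatically answered in light of earlier turns, which is what the conversational memory feature refers to.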
Follow these instructions to set up and run the project on your local machine.
- Python 3.7 or higher
- A Google API key. You can obtain one from the Google AI Studio.
- Clone the repository:

  ```bash
  git clone https://github.com/akshat2635/PDF-Bot.git
  cd PDF-Bot
  ```

- Install the dependencies: It is recommended to create a virtual environment first:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

  Then, install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Set up your API key: Create a file named `.env` in the root of your project and add your Google API key. This file is loaded at runtime to configure the application.

  ```
  # .env
  GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
  ```
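Reading the key at startup is conventionally done with the python-dotenv package; the snippet below is a minimal sketch of that pattern, and the exact loading code in `app.py` may differ.

```python
# Sketch: load the API key from .env at startup (assumes python-dotenv is installed).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory into os.environ
api_key = os.getenv("GOOGLE_API_KEY")
if not api_key:
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file.")
```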
To run the application, execute the following command in your terminal:
```bash
streamlit run app.py
```

This will start the Streamlit server, and you can access the application in your web browser, typically at `http://localhost:8501`.
- Backend: Python
- Web Framework: Streamlit
- LLM Orchestration: LangChain
- Language Models: Google Generative AI (Gemini 2.0 Flash)
- Vector Store: FAISS (Facebook AI Similarity Search)
- PDF Processing: PyMuPDF
Contributions are welcome! If you have ideas for new features, improvements, or bug fixes, please feel free to:
- Fork the repository.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Make your changes.
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a pull request.
This project is licensed under the MIT License. See the `LICENSE` file for details.