This is an advanced web application that allows you to have an intelligent conversation with your PDF documents. Upload a PDF, and the application will process it, enabling you to ask questions, get summaries, and receive insightful answers based on the document's content. This project leverages the power of large language models (LLMs) and vector databases to create a seamless and interactive experience.
You can try out the live application here: https://pdf-bot-akshat.streamlit.app/
- Interactive Chat Interface: A clean and user-friendly interface built with Streamlit for uploading PDFs and engaging in a conversation.
- Conversational Memory: The chatbot remembers the context of the conversation, allowing for follow-up questions and a more natural dialogue.
- High-Quality Answers: Leverages state-of-the-art language models from Google's Generative AI suite to provide accurate and relevant answers.
- Fast and Efficient: Utilizes parallel processing (multi-threading) to quickly prepare your PDF for questioning, ensuring a smooth user experience (see the sketch after this list).
- Advanced Text Refinement: Each part of the document is refined by an AI model to improve clarity and context before being used for answering questions.
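As a rough illustration of the multi-threaded preparation step, chunk refinement can be fanned out over a thread pool so that many model calls run concurrently. This is only a minimal sketch, not the repository's actual code; the `refine_chunk` helper, the prompt wording, the worker count, and the model name are assumptions.

```python
# Sketch: refine text chunks in parallel with a thread pool.
# Assumes the langchain-google-genai package and a GOOGLE_API_KEY in the environment.
from concurrent.futures import ThreadPoolExecutor

from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro")  # model name is an assumption

def refine_chunk(chunk):
    # Hypothetical prompt: ask the model to clean up one chunk of text.
    prompt = f"Rewrite the following passage so it is clear and self-contained:\n\n{chunk}"
    return llm.invoke(prompt).content

def refine_all(chunks, workers=8):
    # executor.map preserves the original chunk order.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(refine_chunk, chunks))
```

Because each refinement call is network-bound rather than CPU-bound, threads are sufficient to overlap the waiting time; no multiprocessing is needed.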
The application follows a sophisticated process to enable a conversation with your PDF:
- PDF Loading: The uploaded PDF is loaded and its text content is extracted using `PyMuPDF`.
- Text Chunking: The extracted text is split into smaller, manageable chunks using a `RecursiveCharacterTextSplitter`. This is crucial for fitting the context into the language model's limits.
- Chunk Refinement: Each text chunk is individually refined by a language model (`gemini-pro`) to improve its clarity and coherence. This step is parallelized for maximum speed.
- Embedding and Indexing: The refined chunks are converted into vector embeddings using `GoogleGenerativeAIEmbeddings` and stored in a `FAISS` vector store for efficient similarity searches.
- Conversational Chain: A `ConversationalRetrievalChain` is created, which uses the vector store as a retriever and a powerful language model as the question-answering engine. This chain is what enables the conversational memory.
- User Interaction: The Streamlit interface captures the user's questions, passes them to the conversational chain along with the chat history, and displays the model's response. A sketch of how these pieces fit together follows this list.
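The steps above map closely onto LangChain primitives. The snippet below is a minimal sketch of how such a pipeline is typically wired together, using the class names listed above; the file path, chunk sizes, embedding model, and chat model are illustrative assumptions rather than the app's actual configuration.

```python
# Sketch: a conversational retrieval pipeline over a PDF with LangChain.
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

# 1. Load the PDF and extract its text (PyMuPDF under the hood).
docs = PyMuPDFLoader("example.pdf").load()

# 2. Split the text into overlapping chunks; sizes are illustrative.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# (The refinement step described above would rewrite each chunk at this point.)

# 3. Embed the chunks and index them in a FAISS vector store.
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
vector_store = FAISS.from_documents(chunks, embeddings)

# 4. Wire the retriever, the chat model, and conversational memory together.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatGoogleGenerativeAI(model="gemini-pro"),
    retriever=vector_store.as_retriever(),
    memory=memory,
)

# 5. Ask a question; the chain retrieves relevant chunks and answers in context.
print(chain.invoke({"question": "What is this document about?"})["answer"])
```

With the memory object attached, follow-up questions are automatically answered in light of earlier turns, which is what the conversational memory feature refers to.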
Follow these instructions to set up and run the project on your local machine.
- Python 3.7 or higher
- A Google API key. You can obtain one from the Google AI Studio.
- Clone the repository:

  ```bash
  git clone https://github.com/akshat2635/PDF-Bot.git
  cd PDF-Bot
  ```

- Install the dependencies: It is recommended to create a virtual environment first:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

  Then, install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Set up your API key: Create a file named `.env` in the root of your project and add your Google API key. This file is loaded at runtime to configure the application.

  ```
  # .env
  GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"
  ```
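Reading the key at startup is conventionally done with the python-dotenv package; the snippet below is a minimal sketch of that pattern, and the exact loading code in `app.py` may differ.

```python
# Sketch: load the API key from .env at startup (assumes python-dotenv is installed).
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory into os.environ
api_key = os.getenv("GOOGLE_API_KEY")
if not api_key:
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file.")
```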
To run the application, execute the following command in your terminal:
```bash
streamlit run app.py
```

This will start the Streamlit server, and you can access the application in your web browser, typically at `http://localhost:8501`.
- Backend: Python
- Web Framework: Streamlit
- LLM Orchestration: LangChain
- Language Models: Google Generative AI (Gemini 2.0 Flash)
- Vector Store: FAISS (Facebook AI Similarity Search)
- PDF Processing: PyMuPDF
Contributions are welcome! If you have ideas for new features, improvements, or bug fixes, please feel free to:
- Fork the repository.
- Create a new branch (`git checkout -b feature/YourFeature`).
- Make your changes.
- Commit your changes (`git commit -m 'Add some feature'`).
- Push to the branch (`git push origin feature/YourFeature`).
- Open a pull request.
This project is licensed under the MIT License. See the `LICENSE` file for details.