This is a Streamlit application that lets you chat with the content of a PDF file. It combines Hugging Face embeddings, FAISS similarity search, and a LangChain retrieval QA pipeline, so you can upload a PDF, ask questions about its content, and get answers in real time.
You can follow this video tutorial to see how to set up and use the app!
- 📓 PDF Parsing: Extracts content from PDF files for processing.
- 🤗 Hugging Face Embeddings: Utilizes advanced NLP embeddings using the Instructor-based model.
- 🔍 Retrieval QA Pipeline: Implements question-answering over your PDF content.
- 🛠 Streamlit UI: A friendly, interactive user interface for uploading PDFs and asking questions.
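The features above fit together as a retrieval pipeline: the PDF text is split into chunks, each chunk is embedded, and the chunks most similar to the question are handed to the question-answering step. The sketch below illustrates only that retrieval idea in plain Python and is not the app's actual code. A toy bag-of-words vector stands in for the Instructor embeddings, a linear cosine-similarity scan stands in for FAISS, and the names `split_into_chunks`, `embed`, `cosine`, and `retrieve`, along with the chunk sizes, are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (linear scan)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a real pipeline the chunks returned by `retrieve` would be passed, together with the question, to a language model that writes the answer. A vector index such as FAISS replaces the linear scan so that search stays fast on large documents.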
git clone https://github.com/FDC0178/chat-with-pdf.git
cd chat-with-pdf
python -m venv venv
source venv/bin/activate # For macOS/Linux
venv\Scripts\activate # For Windows
pip install -r requirements.txt
Add your Hugging Face API token to your secrets.toml file:
[default]
HUGGINGFACEHUB_API_TOKEN = "your-huggingface-token"
Alternatively, set this in the Streamlit Secrets section (if hosted on Streamlit Cloud).
streamlit run app.py
- Upload your PDF file via the file uploader.
- Type your questions into the text input box.
- The app provides intelligent answers and extracts the relevant content from the PDF.
We'd like to express our gratitude to the following resources for enabling this project:
- 🤗 Hugging Face for pre-trained NLP models.
- 📓 LangChain for retrieval-based NLP pipelines.
- 🎨 Streamlit for its easy-to-use app development framework.
- 🗂 FAISS for efficient similarity search and clustering.
Here are some helpful links related to the project:
- Hugging Face Instructor-based Model Documentation
- LangChain Documentation
- Streamlit Documentation
- FAISS Documentation
We welcome contributions! Please fork the repo and open a pull request.
This project is licensed under the MIT License.