Meet Document Summarizer Llama7b—your new best friend for turning dense documents into digestible nuggets of wisdom. Born from a caffeine-fueled, 12-hour hackathon sprint by a dynamic team (Amit, Adith, Saai, and Mithileshan), this tool dives deep into PDFs, images, Excel sheets, and CSV files with the prowess of a linguistic llama on a mission.
Not only does it summarize your documents with style, but it also comes equipped with a conversational Q&A interface. And with its nifty web extension, you can even chat with websites—making the internet spill its secrets just for you!
-
Document Processing: Extracts text from PDFs (using
pypdf), images (viapytesseractand PIL), and tabular data from Excel or CSV files (withpandas). -
Conversational AI:
Summarizes documents and generates key questions using a robust language model (powered by the Groq API and models likegemma2-9b-it). -
Interactive Interface: Runs on Streamlit, ensuring a smooth and engaging user experience.
-
Automatic Charts: When uploading CSV or Excel files, the app displays quick visualizations for numeric columns.
-
Web Extension:
Interrogate websites directly to extract and summarize online content.
-
Clone repository: git clone https://github.com/AmitSubhash/Document_Summarizer_Llama7b.git cd Document_Summarizer_Llama7b
-
Install dependencies:
pip install -r requirements.txt
Launch the interface: streamlit run app.py Supported operations:
- Upload document (PDF/image/Excel/CSV)
- Get auto-generated summary
- Ask follow-up questions via chat
- View quick charts for numeric columns in CSV/Excel files
- Ensure Python 3.10+ is installed.
- Install the required packages:
pip install -r requirements.txt. - Start the application with
streamlit run app.py. - Upload a PDF, image, Excel or CSV file and interact with the chat interface for analysis.
Contribution Welcome
Open to feature requests, bug reports, and PRs through GitHub issues.
License
Open-source (free for educational/personal use)
Team
Amit Subhash, Adith, Saai, Mithileshan - Luddy Hackathon 2024