DocFlow addresses the challenge of auditing thousands of pages of regulatory documents. It combines GraphRAG with OpenAI models for hybrid context retrieval across knowledge graphs (Neo4j), vector embeddings, and SQL databases.
The frontend is a SaaS platform where organizations edit large numbers of documents and manage multi-level approvals within and across cross-functional teams, automated through custom workflows.
GraphRAG-only Kaggle Script: https://www.kaggle.com/code/techpertz/docflow
NOTE: No agentic or high-level framework (LangChain, LangGraph, etc.) is used in this project; everything is implemented directly to maximize learning.
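To make the idea of hybrid context retrieval concrete without any framework, here is a minimal sketch of one way to merge ranked results from several retrievers. It uses reciprocal rank fusion, which is an illustrative choice and not necessarily what DocFlow does; the function name, the source labels (`graph`, `vector`, `sql`), and the chunk IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: dict[str, list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists from several retrievers into one ranking.

    Assumption: each retriever (graph, vector, SQL) returns chunk IDs ordered
    best-first. This is an illustrative technique, not DocFlow's confirmed method.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings.values():
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            # Higher-ranked items contribute more; k dampens the effect of rank.
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical results from three retrievers.
fused = reciprocal_rank_fusion(
    {
        "graph": ["c1", "c2"],
        "vector": ["c2", "c3"],
        "sql": ["c2", "c1"],
    }
)
print(fused)
```

A chunk that appears in several result lists accumulates score from each, so context agreed on by multiple sources rises to the top before being passed to the LLM.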
- Frontend: Next.js application with modern UI/UX
- Backend: FastAPI service with Python
- Database: Neo4j graph database
- LLM: OpenAI
- RAG: Semantic chunking + similarity, BART summarization
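The semantic chunking and similarity step listed above can be sketched roughly as follows. This is a simplified illustration, not the project's actual code: `embed` is a toy bag-of-words stand-in for a real embedding model, and the `semantic_chunks` name and the `threshold` value are assumptions.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(sentences: list[str], threshold: float = 0.2) -> list[list[str]]:
    """Group adjacent sentences; start a new chunk when similarity drops below threshold."""
    chunks: list[list[str]] = []
    for sentence in sentences:
        if chunks and cosine(embed(chunks[-1][-1]), embed(sentence)) >= threshold:
            chunks[-1].append(sentence)  # Similar enough: extend the current chunk.
        else:
            chunks.append([sentence])  # Topic shift (or first sentence): new chunk.
    return chunks


sentences = [
    "The audit report covers data retention.",
    "Data retention rules apply to the audit report.",
    "Invoices must be approved by finance.",
]
for chunk in semantic_chunks(sentences):
    print(chunk)
```

With a real embedding model in place of `embed`, the same loop splits a long regulatory document at topic boundaries instead of at fixed character counts.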
# Start Neo4j using docker-compose
docker-compose up neo4j
Access Neo4j Browser at http://localhost:7474
- Username: neo4j
- Password: mypassword123
# Create and activate virtual environment
cd Backend
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download spaCy model
python -m spacy download en_core_web_lg
# Setup environment
cp .env.example .env
# Edit .env and add your OpenAI API key
# Start the server
uvicorn app.main:app --reload --port 8000
Backend API will be available at http://localhost:8000
Interactive documentation for the API endpoints will be available at http://localhost:8000/docs
# Install dependencies
cd Frontend
npm install
# Run initial setup (for SQLite database)
npm run setup
# Start development server
npm run dev
Frontend will be available at http://localhost:3000
- Python 3.9+
- Node.js 18+
- Docker (for Neo4j)
- OpenAI API key
For testing purposes, the following user accounts are pre-configured:
Gmail Organization:
Yahoo Organization:
No password is required.
.
├── Frontend/ # Next.js application
├── Backend/ # FastAPI application
├── docker-compose.yml # Docker composition (for Neo4j)
└── README.md # This file
OPENAI_API_KEY=your_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=mypassword123
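As a rough illustration of how the backend might consume these variables, the sketch below reads them from the process environment and fails early if any are missing. The `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the values from `.env` have already been exported into the environment; the actual backend may load them differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Variable names taken from the .env example above.
REQUIRED_VARS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the backend's connection settings."""
    openai_api_key: str
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        neo4j_uri=env["NEO4J_URI"],
        neo4j_user=env["NEO4J_USER"],
        neo4j_password=env["NEO4J_PASSWORD"],
    )


# Example with an explicit mapping instead of the real environment.
example = load_settings(
    {
        "OPENAI_API_KEY": "your_key_here",
        "NEO4J_URI": "bolt://localhost:7687",
        "NEO4J_USER": "neo4j",
        "NEO4J_PASSWORD": "mypassword123",
    }
)
print(example.neo4j_uri)
```

Checking for missing values at startup surfaces a forgotten API key immediately, before the first request reaches OpenAI or Neo4j.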
- Keep each component (Frontend, Backend, Neo4j) running in separate terminal windows
- Backend requires the virtual environment to be activated for each new terminal session
- Neo4j data persists in Docker volumes between restarts