Personal Documents AI Assistant
Production-ready RAG system with enterprise-grade architecture
A secure, retrieval-augmented generation system that provides instant, source-backed answers from internal documentation. Built with modern AI infrastructure and optimized for accuracy and performance.
See It In Action
Watch how DocAI processes documents and answers questions
Demo video coming soon
Core Capabilities
Enterprise-grade features built for accuracy, security, and performance
Vector Search
Semantic retrieval using cosine similarity on high-dimensional embeddings stored in PostgreSQL with pgvector extension.
- 1536-dimensional vectors
- Cosine similarity matching
- Top-K retrieval with thresholds
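The retrieval math above can be sketched in a few lines. This is a minimal illustration, not the project's actual API: the function names (`cosineSimilarity`, `topKChunks`) and the chunk shape are assumptions, and in production the ranking happens inside pgvector rather than in application code.

```typescript
interface ScoredChunk {
  content: string;
  similarity: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). For typical text embeddings
// the result lands near the [0, 1] range.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Top-K retrieval with a similarity floor (the docs describe top 5 and >= 0.30).
function topKChunks(
  query: number[],
  chunks: { content: string; embedding: number[] }[],
  k = 5,
  threshold = 0.3,
): ScoredChunk[] {
  return chunks
    .map((c) => ({ content: c.content, similarity: cosineSimilarity(query, c.embedding) }))
    .filter((c) => c.similarity >= threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, k);
}
```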
Source Attribution
Every answer includes verifiable citations with document titles, content previews, and similarity scores for complete transparency.
- Document traceability
- Confidence scoring (≥30%)
- Content preview snippets
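A citation line combining the three elements above might be rendered like this. The field names (`title`, `similarity`, `content`) are illustrative assumptions, not the project's schema:

```typescript
interface Citation {
  title: string;      // source document title
  similarity: number; // cosine similarity in [0, 1]
  content: string;    // matched chunk text
}

// Render lines like `Handbook (87% match): preview...` for each retrieved chunk.
function formatCitations(citations: Citation[], previewLength = 80): string[] {
  return citations.map((c) => {
    const pct = Math.round(c.similarity * 100);
    const preview =
      c.content.length > previewLength
        ? c.content.slice(0, previewLength) + "..."
        : c.content;
    return `${c.title} (${pct}% match): ${preview}`;
  });
}
```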
Real-time Indexing
Instant document processing with intelligent chunking that preserves semantic context at boundaries.
- ~500-word chunks
- 50-word overlap
- Context preservation
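The chunking strategy above can be sketched as a sliding window over words: each chunk repeats the last 50 words of the previous one so sentences that straddle a boundary stay searchable. A minimal sketch, assuming word-based splitting (the real implementation may split on sentences or tokens instead):

```typescript
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance 450 words per chunk
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final window reached the end
  }
  return chunks;
}
```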
Chat History
Maintains conversational context across messages for coherent multi-turn interactions.
- Persistent sessions
- Context-aware responses
- Conversation threading
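One common way to keep multi-turn context within a prompt budget is to carry only the most recent messages forward. This is a hypothetical sketch; the message shape and the 10-message cap are assumptions, not the project's actual settings:

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Keep only the most recent messages, preserving order so follow-up
// questions ("what about the second one?") still resolve correctly.
function trimHistory(history: ChatMessage[], maxMessages = 10): ChatMessage[] {
  return history.slice(-maxMessages);
}
```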
Secure Authentication
Enterprise-grade authentication with Clerk ensuring your data stays private and secure.
- User isolation
- Session management
- Role-based access
Multi-format Support
Upload documents in various formats with automatic parsing and text extraction.
- .txt, .md, .csv, .json
- Automatic parsing
- UTF-8 encoding support
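Extraction for these formats can be dispatched on file extension. A simplified sketch: the CSV handling here ignores quoting rules and the function name is illustrative, so treat this as a shape of the approach rather than the actual parser:

```typescript
function extractText(filename: string, raw: string): string {
  const ext = filename.slice(filename.lastIndexOf(".")).toLowerCase();
  switch (ext) {
    case ".txt":
    case ".md":
      return raw; // plain text and markdown are indexed as-is
    case ".csv":
      // flatten each row into a space-separated line (no quote handling)
      return raw
        .split("\n")
        .map((line) => line.split(",").join(" "))
        .join("\n");
    case ".json":
      // pretty-print so keys and values become searchable text
      return JSON.stringify(JSON.parse(raw), null, 2);
    default:
      throw new Error(`Unsupported format: ${ext}`);
  }
}
```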
How RAG Works
Retrieval-Augmented Generation combines semantic search with large language models to provide accurate, source-backed answers
1. Upload Document: .txt, .md, .csv, .json
2. Text Chunking: ~500 words, 50-word overlap
3. Generate Embeddings: 1536-dim vectors
4. Store in pgvector: PostgreSQL + vectors
5. User Question: "What is...?"
6. Embed Query: same embedding model
7. Similarity Search: top 5 chunks, ≥30%
8. LLM + Context: gpt-4o-mini
Technical Architecture
Built on modern AI infrastructure with production-grade components
1. Document Ingestion Pipeline
2. Query & Response Pipeline
Implementation Details
Key technical decisions and optimizations
Document Processing
- ~500-word chunks with 50-word overlap
- Preserves context at boundaries
- Handles multiple file formats
- UTF-8 encoding support
Vector Operations
- 1536-dimensional embeddings
- OpenAI text-embedding-3-small
- Cosine similarity matching
- 30% similarity threshold
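One detail worth making explicit: pgvector's cosine operator (`<=>`) returns cosine *distance*, which is 1 minus cosine similarity, so a 30% similarity floor corresponds to a distance ceiling of 0.70. Small helpers (names illustrative) keep the conversion in one place:

```typescript
// pgvector's <=> operator yields cosine distance = 1 - cosine similarity.
function similarityFromDistance(cosineDistance: number): number {
  return 1 - cosineDistance;
}

// Convert a similarity floor (e.g. 0.30) into the equivalent distance ceiling
// for use in a WHERE clause.
function distanceCeiling(similarityFloor: number): number {
  return 1 - similarityFloor;
}
```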
Database & Storage
- PostgreSQL with pgvector extension
- Prisma ORM for type safety
- Efficient vector indexing
- User-isolated data stores
Response Generation
- GPT-4o-mini for cost efficiency
- Top-5 chunk retrieval
- Conversational context management
- Source attribution in responses
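Putting the pieces together, the final request to the model can be assembled by placing the retrieved chunks into a grounding system message, followed by prior turns and the new question. The message shape follows the standard OpenAI chat format; the system-prompt wording and numbering scheme are assumptions:

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the messages array for a chat completion: grounded context first,
// then conversation history, then the current question.
function buildMessages(
  question: string,
  contextChunks: string[],
  history: Message[],
): Message[] {
  const system: Message = {
    role: "system",
    content:
      "Answer using only the context below. Cite sources when possible.\n\n" +
      contextChunks.map((c, i) => `[${i + 1}] ${c}`).join("\n\n"),
  };
  return [system, ...history, { role: "user", content: question }];
}
```

The numbered `[1]`, `[2]` markers give the model stable handles to cite, which the response layer can map back to document titles and similarity scores.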