Personal Documents AI Assistant

Production-ready RAG system with enterprise-grade architecture

A secure, retrieval-augmented generation system that provides instant, source-backed answers from internal documentation. Built with modern AI infrastructure and optimized for accuracy and performance.

Next.js 15 · TypeScript · PostgreSQL + pgvector · Prisma ORM · OpenAI API · Clerk Auth · Tailwind CSS

See It In Action

Watch how DocAI processes documents and answers questions

Demo video coming soon

Core Capabilities

Enterprise-grade features built for accuracy, security, and performance

🔍 Vector Search

Semantic retrieval using cosine similarity on high-dimensional embeddings stored in PostgreSQL with pgvector extension.

  • 1536-dimensional vectors
  • Cosine similarity matching
  • Top-K retrieval with thresholds
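The cosine matching and top-K selection described above can be sketched in TypeScript. In the real system pgvector computes similarity inside PostgreSQL; this in-memory version is only illustrative, and the `topK`/`Chunk` names are assumptions, not the project's actual API:

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface Chunk { id: string; embedding: number[] }

// Return the k best-matching chunks whose similarity meets the threshold,
// mirroring the documented defaults (top 5, >= 0.30).
function topK(query: number[], chunks: Chunk[], k = 5, threshold = 0.3) {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .filter((r) => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

Chunks below the threshold are dropped before ranking, so low-relevance matches never reach the prompt.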
📎 Source Attribution

Every answer includes verifiable citations with document titles, content previews, and similarity scores for complete transparency.

  • Document traceability
  • Confidence scoring (≥30%)
  • Content preview snippets

Real-time Indexing

Instant document processing with intelligent chunking that preserves semantic context at boundaries.

  • ~500-word chunks
  • 50-word overlap
  • Context preservation
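The overlapping-chunk scheme above can be sketched as a small word-based splitter. The function name and signature are illustrative assumptions, not the project's actual code:

```typescript
// Split text into ~chunkSize-word chunks where each chunk repeats the last
// `overlap` words of the previous one, so sentences spanning a boundary
// appear intact in at least one chunk.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance 450 words per chunk by default
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // final chunk reached
  }
  return chunks;
}
```

A 1,000-word document therefore yields three chunks, with the second starting 450 words in so it shares its first 50 words with the first chunk.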
💬 Chat History

Maintains conversational context across messages for coherent multi-turn interactions.

  • Persistent sessions
  • Context-aware responses
  • Conversation threading
🔒 Secure Authentication

Enterprise-grade authentication with Clerk ensuring your data stays private and secure.

  • User isolation
  • Session management
  • Role-based access
📄 Multi-format Support

Upload documents in various formats with automatic parsing and text extraction.

  • .txt, .md, .csv, .json
  • Automatic parsing
  • UTF-8 encoding support
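Format handling can be sketched as a dispatch on file extension. This is a minimal assumption about how extraction might work, not the project's actual parser:

```typescript
// Extract plain text from a supported file, dispatching on its extension.
function extractText(filename: string, raw: string): string {
  const ext = filename.slice(filename.lastIndexOf(".")).toLowerCase();
  switch (ext) {
    case ".txt":
    case ".md":
      return raw; // already plain text
    case ".csv":
      // Replace commas with spaces so cell values read as prose.
      return raw
        .split("\n")
        .map((line) => line.split(",").join(" "))
        .join("\n");
    case ".json":
      // Parse and re-serialize to validate and normalize the content.
      return JSON.stringify(JSON.parse(raw));
    default:
      throw new Error(`Unsupported format: ${ext}`);
  }
}
```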

How RAG Works

Retrieval-Augmented Generation combines semantic search with large language models to provide accurate, source-backed answers

1. Upload Document (.txt, .md, .csv, .json)
2. Text Chunking (~500 words, 50-word overlap)
3. Generate Embeddings (1536-dim vectors)
4. Store in pgvector (PostgreSQL + vectors)
5. User Question ("What is...?")
6. Embed Query (same embedding model)
7. Similarity Search (top 5 chunks, ≥30%)
8. LLM + Context (gpt-4o-mini)
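The eight steps above can be tied together in a minimal sketch. A toy bag-of-letters embedder stands in for OpenAI's text-embedding-3-small, an in-memory array stands in for the pgvector table, and the gpt-4o-mini call is stubbed; every name here is illustrative:

```typescript
type Doc = { text: string; embedding: number[] };

// Toy embedding: letter-frequency vector (26 dims instead of 1536).
function embed(text: string): number[] {
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i]++;
  }
  return v;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

const store: Doc[] = []; // stands in for the pgvector table

// Steps 1-4: parse, chunk (omitted here), embed, and store.
function ingest(text: string): void {
  store.push({ text, embedding: embed(text) });
}

// Steps 5-8: embed the question, retrieve top chunks, build the context.
function answer(question: string): string {
  const q = embed(question);
  const context = store
    .map((d) => ({ ...d, score: cosine(q, d.embedding) }))
    .filter((d) => d.score >= 0.3)
    .sort((a, b) => b.score - a.score)
    .slice(0, 5)
    .map((d) => d.text)
    .join("\n");
  // A real system would send `context` plus `question` to gpt-4o-mini here.
  return `Answer based on:\n${context}`;
}
```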

Technical Architecture

Built on modern AI infrastructure with production-grade components

1. Document Ingestion Pipeline

Upload → Parse → Chunk → Embed → Store

2. Query & Response Pipeline

Query → Vectorize → Search → Retrieve → Generate

Implementation Details

Key technical decisions and optimizations

Document Processing

  • ~500-word chunks with 50-word overlap
  • Preserves context at boundaries
  • Handles multiple file formats
  • UTF-8 encoding support

Vector Operations

  • 1536-dimensional embeddings
  • OpenAI text-embedding-3-small
  • Cosine similarity matching
  • 30% similarity threshold

Database & Storage

  • PostgreSQL with pgvector extension
  • Prisma ORM for type safety
  • Efficient vector indexing
  • User-isolated data stores
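Because Prisma does not model vector columns natively, the similarity search typically goes through a raw SQL query. The table and column names below are assumptions about the schema, not the project's actual one; `<=>` is pgvector's cosine-distance operator, so similarity is `1 - distance`:

```typescript
// Build the parameterized pgvector similarity query, mirroring the documented
// defaults (top 5 chunks, >= 30% similarity). $1 is the query vector and
// $2 the requesting user's id, enforcing user isolation in the WHERE clause.
function buildSimilarityQuery(limit = 5, threshold = 0.3): string {
  return `
    SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity
    FROM "DocumentChunk"
    WHERE "userId" = $2
      AND 1 - (embedding <=> $1::vector) >= ${threshold}
    ORDER BY embedding <=> $1::vector
    LIMIT ${limit}`;
}
```

Such a string could be executed with something like Prisma's `$queryRawUnsafe`, passing the vector literal and user id as the two parameters.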

Response Generation

  • GPT-4o-mini for cost efficiency
  • Top-5 chunk retrieval
  • Conversational context management
  • Source attribution in responses
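How the retrieved chunks, chat history, and question might be assembled into the messages sent to gpt-4o-mini can be sketched as below. The message shape mirrors the OpenAI chat completions API, but the prompt wording and function names are illustrative assumptions:

```typescript
interface Source { title: string; content: string; similarity: number }
interface Message { role: "system" | "user" | "assistant"; content: string }

// Fold numbered, score-annotated sources into a system prompt, then append
// the prior conversation and the new question.
function buildMessages(
  question: string,
  sources: Source[],
  history: Message[],
): Message[] {
  const context = sources
    .map((s, i) => `[${i + 1}] ${s.title} (${Math.round(s.similarity * 100)}%):\n${s.content}`)
    .join("\n\n");
  const system: Message = {
    role: "system",
    content: `Answer using only the sources below and cite them by number.\n\n${context}`,
  };
  return [system, ...history, { role: "user", content: question }];
}
```

Numbering the sources in the prompt is what lets the model emit citations that the UI can map back to document titles and previews.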