DR / / drodriguez.site
case study / / 02 / /

AI document Q&A with RAG

Cited answers via Claude + pgvector

status: live
stack: Anthropic Claude, Voyage AI, pgvector
updated: 2026
tags: Anthropic, pgvector, RAG, Streaming
01 / / the problem

Most AI integrations on Upwork are basic chat wrappers. The real value is in RAG: connecting LLMs to private data, with citations. This project shows the full pipeline — chunking, embedding, retrieval, streaming — running end-to-end.

02 / / what i built
PDF and text file upload (10 MB limit)
Document chunking with embedding generation
Vector similarity search via pgvector
Streaming response token-by-token
Inline citations linking back to source passages
Per-user rate limiting
Token usage and cost tracking
Pre-loaded sample documents for instant demo
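The chunking step above can be sketched as fixed-size chunks with overlap — one common strategy; the demo's actual chunker and the `chunkText` name here are assumptions, not the project's code:

```typescript
// Hypothetical sketch: fixed-size character chunking with overlap.
// Overlapping windows keep context that would be lost at hard cuts.
interface Chunk {
  index: number;
  text: string;
}

function chunkText(text: string, size = 1000, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  let start = 0;
  let index = 0;
  while (start < text.length) {
    chunks.push({ index: index++, text: text.slice(start, start + size) });
    if (start + size >= text.length) break;
    start += size - overlap; // step forward, keeping `overlap` chars of context
  }
  return chunks;
}
```

Each chunk would then be embedded (via Voyage AI here) and stored alongside its document metadata.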
03 / / how i built it
Anthropic Claude
Best-in-class instruction following for RAG synthesis
Voyage AI
Cheaper embeddings than OpenAI, comparable quality
pgvector
Embeddings in same database as metadata, simpler ops
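The retrieval query might look roughly like the sketch below. The table and column names (`chunks`, `embedding`, `document_id`) are assumptions; `<=>` is pgvector's cosine-distance operator, and `1 - distance` converts it to a similarity score:

```typescript
// Hypothetical sketch of a top-k similarity query against pgvector.
// $1 = query embedding (the embedded user question), $2 = document id.
function topKQuery(k: number): string {
  return `
SELECT id, document_id, content,
       1 - (embedding <=> $1) AS similarity
FROM chunks
WHERE document_id = $2
ORDER BY embedding <=> $1
LIMIT ${k}
  `.trim();
}
```

Keeping embeddings in the same Postgres instance means this is a single query, with no separate vector store to sync.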
Vercel AI SDK
Stream handling without writing SSE plumbing
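For cited answers, the retrieved chunks need to reach Claude with stable passage numbers the model can cite inline. A minimal sketch, assuming a `buildPrompt` helper of my own naming (the demo's real prompt will differ):

```typescript
// Hypothetical prompt assembly for cited answers. Each retrieved chunk
// is numbered so the model can cite [1], [2], ... inline, and those
// numbers map back to the source passages shown in the UI.
interface Retrieved {
  index: number; // chunk index within the source document
  content: string;
}

function buildPrompt(question: string, chunks: Retrieved[]): string {
  const context = chunks
    .map((c, i) => `[${i + 1}] ${c.content}`)
    .join("\n\n");
  return [
    "Answer using only the passages below.",
    "Cite passages inline as [n]. If the answer is not in the passages, say so.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// With the Vercel AI SDK, this prompt would feed a streaming call,
// roughly: streamText({ model: anthropic("claude-..."), prompt })
```

The SDK then handles token-by-token delivery to the client without hand-rolled SSE code.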
04 / / live demo
→ open live demo at https://ai.drodriguez.site
Loom walkthrough — 90 seconds

Demo credentials shown on the demo's landing page.

05 / / production extensions

Things deliberately left out of the demo that I'd add for production:

OCR for image-based PDFs
Hybrid search combining keyword and semantic
Chunking strategy tuning per document type
Conversation memory across queries
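For the hybrid-search item, one common way to merge a keyword result list with a vector result list is Reciprocal Rank Fusion (RRF) — a sketch of that technique, not the project's code:

```typescript
// Hypothetical sketch: Reciprocal Rank Fusion (RRF) merging two ranked
// id lists, e.g. Postgres full-text results and pgvector results.
function rrfMerge(
  keywordIds: string[], // ids ranked by keyword relevance
  vectorIds: string[],  // ids ranked by vector similarity
  k = 60,               // damping constant; 60 is the commonly used default
): string[] {
  const scores = new Map<string, number>();
  for (const ids of [keywordIds, vectorIds]) {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Items appearing high in both lists float to the top, without needing the two scoring scales to be comparable.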