AI document Q&A with RAG
Cited answers via Claude + pgvector
status: live
stack: Anthropic Claude, Voyage AI, pgvector
updated: 2026
tags: Anthropic · pgvector · RAG · Streaming
01 / / the problem
Most AI integrations on Upwork are basic chat wrappers. The real value is in RAG: connecting LLMs to private data, with citations. This project shows the full pipeline (chunking, embedding, retrieval, streaming) running end-to-end.
02 / / what i built
→PDF and text file upload (10 MB limit)
→Document chunking with embedding generation
→Vector similarity search via pgvector
→Streaming response token-by-token
→Inline citations linking back to source passages
→Per-user rate limiting
→Token usage and cost tracking
→Pre-loaded sample documents for instant demo
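The chunking step above can be sketched as a sliding window with overlap, so a passage split at a boundary still appears whole in at least one chunk. The sizes here are illustrative, not the demo's actual settings; keeping the character offset per chunk is what makes the inline citations possible.

```typescript
// Sliding-window chunker. chunkSize/overlap values are illustrative.
interface Chunk {
  text: string;
  start: number; // character offset into the source doc, used for citations
}

function chunkText(text: string, chunkSize = 800, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ text: text.slice(start, start + chunkSize), start });
    if (start + chunkSize >= text.length) break; // final chunk reached the end
  }
  return chunks;
}
```

Each chunk's text is then sent to the embedding API and stored alongside its offset, so a cited answer can link straight back to the source passage.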
03 / / how i built it
Anthropic Claude
Best-in-class instruction following for RAG synthesis
Voyage AI
Cheaper embeddings than OpenAI, comparable quality
pgvector
Embeddings in same database as metadata, simpler ops
Vercel AI SDK
Stream handling without writing SSE plumbing
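In the demo the ranking happens inside Postgres, roughly `SELECT id, text FROM chunks ORDER BY embedding <=> $1 LIMIT 5` using pgvector's cosine-distance operator. As a sketch of the same math, here is an in-memory equivalent (field names are illustrative):

```typescript
// Cosine similarity: the score pgvector's `<=>` operator is based on
// (pgvector returns cosine *distance*, i.e. 1 - similarity).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored chunks against a query embedding and keep the top k.
function topK<T extends { embedding: number[] }>(
  query: number[],
  rows: T[],
  k: number,
): T[] {
  return rows
    .map((row) => ({ row, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.row);
}
```

The retrieved chunks are then packed into the Claude prompt, and the response streams back through the Vercel AI SDK.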
04 / / live demo
→ open live demo at https://ai.drodriguez.site
Loom walkthrough — 90 seconds
Demo credentials shown on the demo's landing page.
05 / / production extensions
Deliberately out of scope for the demo, but what I'd add for production:
→OCR for image-based PDFs
→Hybrid search combining keyword and semantic
→Chunking strategy tuning per document type
→Conversation memory across queries
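Of these, hybrid search is the most mechanical to add. One common approach (an assumption here, not what the demo ships) is Reciprocal Rank Fusion: run a keyword ranking (e.g. Postgres full-text search) and the pgvector semantic ranking separately, then merge by rank position so the two score scales never need reconciling.

```typescript
// Reciprocal Rank Fusion: merge several ranked lists of chunk IDs.
// k = 60 is the conventional smoothing constant; higher ranks in any
// list contribute more to a chunk's fused score.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```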