AI document Q&A with RAG
Cited answers via Claude + pgvector
status: live
stack: Anthropic Claude, Voyage AI, pgvector
updated: 2026
tags: Anthropic · pgvector · RAG · Streaming
01 / / the problem
Most AI integrations on Upwork are basic chat wrappers. The real value is in RAG: connecting LLMs to private data, with citations. This project shows the full pipeline (chunking, embedding, retrieval, streaming) running end-to-end.
02 / / what i built
→PDF and text file upload (10 MB limit)
→Document chunking with embedding generation
→Vector similarity search via pgvector
→Streaming response token-by-token
→Inline citations linking back to source passages
→Per-user rate limiting
→Token usage and cost tracking
→Pre-loaded sample documents for instant demo
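The chunking step above can be sketched as a sliding window with overlap, so a passage split at a boundary still appears whole in at least one chunk. The sizes here are illustrative, not the demo's actual settings; keeping the character offset per chunk is what makes the inline citations possible.

```typescript
// Sliding-window chunker. chunkSize/overlap values are illustrative.
interface Chunk {
  text: string;
  start: number; // character offset into the source doc, used for citations
}

function chunkText(text: string, chunkSize = 800, overlap = 200): Chunk[] {
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({ text: text.slice(start, start + chunkSize), start });
    if (start + chunkSize >= text.length) break; // final chunk reached the end
  }
  return chunks;
}
```

Each chunk's text is then sent to the embedding API and stored alongside its offset, so a cited answer can link straight back to the source passage.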
03 / / how i built it
Anthropic Claude
Best-in-class instruction following for RAG synthesis
Voyage AI
Cheaper embeddings than OpenAI, comparable quality
pgvector
Embeddings in same database as metadata, simpler ops
Vercel AI SDK
Stream handling without writing SSE plumbing
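In the demo the ranking happens inside Postgres, roughly `SELECT id, text FROM chunks ORDER BY embedding <=> $1 LIMIT 5` using pgvector's cosine-distance operator. As a sketch of the same math, here is an in-memory equivalent (field names are illustrative):

```typescript
// Cosine similarity: the score pgvector's `<=>` operator is based on
// (pgvector returns cosine *distance*, i.e. 1 - similarity).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored chunks against a query embedding and keep the top k.
function topK<T extends { embedding: number[] }>(
  query: number[],
  rows: T[],
  k: number,
): T[] {
  return rows
    .map((row) => ({ row, score: cosineSimilarity(query, row.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.row);
}
```

The retrieved chunks are then packed into the Claude prompt, and the response streams back through the Vercel AI SDK.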
04 / / live demo
→ open live demo at https://ai.drodriguez.site
Loom walkthrough — 90 seconds
Demo credentials shown on the demo's landing page.
05 / / production extensions
Deliberately out of scope for the demo, but what I'd add for production:
→OCR for image-based PDFs
→Hybrid search combining keyword and semantic
→Chunking strategy tuning per document type
→Conversation memory across queries
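Of these, hybrid search is the most mechanical to add. One common approach (an assumption here, not what the demo ships) is Reciprocal Rank Fusion: run a keyword ranking (e.g. Postgres full-text search) and the pgvector semantic ranking separately, then merge by rank position so the two score scales never need reconciling.

```typescript
// Reciprocal Rank Fusion: merge several ranked lists of chunk IDs.
// k = 60 is the conventional smoothing constant; higher ranks in any
// list contribute more to a chunk's fused score.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```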