Build a Complete AI Product
This is the module that makes the course worth it. You’ll build a production-ready AI application that you can deploy, show to employers, or even monetize. Think of this as the difference between a cooking class where you follow recipes and one where you run the kitchen for a night. Everything you have learned about embeddings, RAG, chunking, and cost optimization comes together here in a single, deployable product. The decisions you make (which model to call, how to chunk documents, when to cache) stop being theoretical and become things that cost you real money or delight real users.

What You’ll Build: A multi-tenant AI document assistant that lets users upload documents, ask questions, and get answers with citations. This is the architecture behind Notion AI, ChatPDF, and countless enterprise tools.
Project Overview
DocuMind AI
A SaaS document intelligence platform with:
- 📄 Document upload and processing (PDF, DOCX, TXT)
- 🔍 Semantic search across documents
- 💬 AI chat with citations
- 👥 Multi-tenant (users only see their docs)
- 💰 Usage tracking and rate limiting
- 🔐 Authentication and API keys
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js 14, Tailwind, shadcn/ui |
| Backend | FastAPI, Python 3.11+ |
| Database | PostgreSQL + pgvector |
| LLM | OpenAI GPT-4o |
| Auth | Clerk or NextAuth |
| Deployment | Vercel + Railway |
Architecture
Part 1: Project Setup
Database Schema
The schema below is designed around one core principle: every row of user data must be scoped to a tenant. In a multi-tenant SaaS, accidentally leaking one user’s documents into another user’s search results is a showstopper. That is why user_id appears on both documents and document_chunks: the duplication is intentional so that every vector search query can filter by user without an extra JOIN.
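One way to picture what that duplication buys you is the search query itself. The sketch below is an illustration, not the course's actual schema or query: it assumes asyncpg-style `$n` placeholders and hypothetical column names (`id`, `document_id`, `content`, `embedding`). Only `document_chunks` and `user_id` come from the description above. It uses pgvector's `<=>` cosine-distance operator.

```python
# Hypothetical tenant-scoped search query. Column names other than user_id
# are assumptions for illustration.
TENANT_SCOPED_SEARCH = """
SELECT c.id, c.document_id, c.content,
       1 - (c.embedding <=> $1::vector) AS similarity
FROM document_chunks AS c
WHERE c.user_id = $2
ORDER BY c.embedding <=> $1::vector
LIMIT $3
"""


def search_params(
    query_embedding: list[float], user_id: str, limit: int = 5
) -> tuple[str, str, int]:
    """Build the positional parameters, refusing to run without a tenant."""
    if not user_id:
        raise ValueError("user_id is required for every vector search")
    # pgvector accepts a text literal such as '[0.1,0.2]' cast to vector.
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    return (vector_literal, user_id, limit)
```

Because `user_id` lives on the chunk row, the filter needs no JOIN back to the documents table. How the vector literal is bound depends on your database driver, so treat the parameter handling here as a starting point.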
FastAPI Backend Structure
Part 2: Document Processing
Part 3: RAG Engine
Part 4: API Routes
Part 5: Frontend (Next.js)
Part 6: Deployment
Before you deploy, a practical tip: run through the entire upload-to-chat flow locally with Docker Compose first. The number-one cause of “it works on my machine” failures in AI apps is missing environment variables (especially OPENAI_API_KEY) and mismatched embedding dimensions between what you stored and what you query with.
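Both failure modes can be caught at startup rather than on the first user request. The following is a minimal sketch under stated assumptions: `DATABASE_URL` is an assumed variable name, `check_startup_config` is a hypothetical helper, and `stored_dimension` stands for whatever vector size your database column was created with.

```python
# OPENAI_API_KEY is named in the text; DATABASE_URL is an assumed name.
REQUIRED_ENV_VARS = ("OPENAI_API_KEY", "DATABASE_URL")

# Default output sizes of the two OpenAI embedding models.
EXPECTED_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_startup_config(env, embedding_model, stored_dimension):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name in REQUIRED_ENV_VARS:
        if not env.get(name):
            problems.append(f"missing environment variable: {name}")
    expected = EXPECTED_DIMENSIONS.get(embedding_model)
    if expected is None:
        problems.append(f"unknown embedding model: {embedding_model}")
    elif expected != stored_dimension:
        problems.append(
            f"{embedding_model} produces {expected}-dim vectors, "
            f"but the database column holds {stored_dimension}"
        )
    return problems
```

Calling `check_startup_config(os.environ, "text-embedding-3-small", 1536)` when the app boots, and refusing to start if the list is non-empty, turns a confusing runtime error into an obvious one.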
Docker Setup
Production Deployment
What You’ve Learned
Full-Stack AI Development
Build complete AI products from database to frontend
Production RAG
Implement RAG with chunking, embeddings, and citations
Multi-Tenancy
Handle multiple users with isolated data
Deployment
Deploy and scale AI applications
Tech Stack Decision Framework
The stack above is opinionated. Here is why each choice was made and when you should deviate.

| Decision | Default Choice | When to Change | Alternative |
|---|---|---|---|
| Frontend framework | Next.js 14 | Need mobile-first or prefer Vue ecosystem | Nuxt 3, SvelteKit |
| Backend framework | FastAPI | Already have Node team, need WebSockets natively | Express + tRPC, Hono |
| Database | PostgreSQL + pgvector | Over 10M vectors, need sub-10ms p99 latency | Pinecone or Qdrant for vectors, keep Postgres for relational |
| LLM provider | OpenAI GPT-4o | Need longer context (200K+), cost-sensitive at scale | Claude 3.5 Sonnet, Gemini 1.5 Pro |
| Auth provider | Clerk | Self-hosted requirement, existing NextAuth setup | NextAuth, Supabase Auth, Auth0 |
| Deployment | Vercel + Railway | Need GPU inference, on-prem requirement | Fly.io, Render, AWS ECS |
| Embedding model | text-embedding-3-small | Domain-specific vocabulary, data sovereignty | BGE-large (self-hosted), Cohere embed-v3 |
Edge Cases You Will Hit in Production
These are the issues that don’t show up in demos but break real deployments. Plan for them before launch, not after.

Scanned PDFs returning empty text. The pypdf extractor returns empty strings for image-only PDFs. Add a fallback: if extracted text is under 50 characters for a multi-page PDF, run OCR with PyMuPDF’s built-in OCR or Tesseract. Surface a clear status to the user (“Processing with OCR. This may take longer.”).
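The scanned-PDF heuristic above is small enough to sketch directly. The function names here are hypothetical, and the OCR call itself is left out on purpose: this only shows the decision of when to fall back, assuming you already have the extracted text and the page count.

```python
def needs_ocr(extracted_text: str, page_count: int, min_chars: int = 50) -> bool:
    """True when a multi-page PDF yielded almost no extractable text."""
    return page_count > 1 and len(extracted_text.strip()) < min_chars


def processing_status(extracted_text: str, page_count: int) -> str:
    """Status string to surface to the user while the document is processed."""
    if needs_ocr(extracted_text, page_count):
        return "Processing with OCR. This may take longer."
    return "Processing"
```

Stripping whitespace before measuring matters, because some extractors return newlines or spaces for image-only pages.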
Embedding dimension mismatches. If you change embedding models (e.g., from text-embedding-3-small at 1536 dimensions to text-embedding-3-large at 3072), existing vectors in the database become incompatible. You must re-embed all stored chunks. Add a model_version column to document_chunks and check it at query time.
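A rough sketch of the query-time check, assuming each chunk row carries the `model_version` column suggested above. `ChunkRow` and `split_by_model` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

CURRENT_EMBEDDING_MODEL = "text-embedding-3-small"


@dataclass(frozen=True)
class ChunkRow:
    id: int
    model_version: str


def split_by_model(chunks, current_model=CURRENT_EMBEDDING_MODEL):
    """Separate chunks that are safe to search from those needing re-embedding."""
    searchable = [c for c in chunks if c.model_version == current_model]
    stale = [c for c in chunks if c.model_version != current_model]
    return searchable, stale
```

In a real deployment the same condition would more likely live in the SQL `WHERE` clause, with the stale set feeding a background re-embedding job.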
Conversation history token overflow. The RAG engine keeps the last 6 messages, but a user pasting a 5000-word document as a message will blow through the token limit in a single turn. Add a max_tokens_per_message guard that truncates or summarizes oversized user inputs before they enter the history.
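Here is one possible shape for that guard. It is a sketch that truncates rather than summarizes, and it estimates tokens at roughly four characters each, which is a rule of thumb for English text and not an exact count. A real tokenizer would be more accurate. `guard_message` and `append_to_history` are hypothetical names.

```python
HISTORY_LIMIT = 6  # the RAG engine keeps the last 6 messages
CHARS_PER_TOKEN = 4  # rough approximation, not a real tokenizer
TRUNCATION_NOTE = " [message truncated]"


def guard_message(text: str, max_tokens_per_message: int = 1000) -> str:
    """Cut oversized input down to an approximate token budget."""
    max_chars = max_tokens_per_message * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    return text[:max_chars] + TRUNCATION_NOTE


def append_to_history(history, role, content, max_tokens_per_message=1000):
    """Return a new history with the guarded message added and old turns dropped."""
    message = {"role": role, "content": guard_message(content, max_tokens_per_message)}
    return (history + [message])[-HISTORY_LIMIT:]
```

The guard runs before the message enters the history, so one oversized paste cannot crowd out the retrieved context on later turns.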
Concurrent document processing. Two uploads arriving simultaneously for the same user can cause race conditions on the usage tracker. Use database-level advisory locks or an idempotency key on the upload endpoint to prevent double-counting.
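The idempotency-key idea can be illustrated with an in-process stand-in. This is only a sketch: the `UsageTracker` class is hypothetical, and the `threading.Lock` plays the role that an advisory lock or a unique constraint on the key would play in PostgreSQL. It will not protect you across multiple worker processes.

```python
import threading


class UsageTracker:
    """Counts uploads per user, ignoring repeats of the same idempotency key."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()
        self.upload_counts = {}

    def record_upload(self, user_id: str, idempotency_key: str) -> bool:
        """Return True if the upload was counted, False if it was a duplicate."""
        with self._lock:
            key = (user_id, idempotency_key)
            if key in self._seen:
                return False
            self._seen.add(key)
            self.upload_counts[user_id] = self.upload_counts.get(user_id, 0) + 1
            return True
```

The check and the increment happen under the same lock, which is the property the database-level version also has to preserve.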
Multi-tenant data leakage in vector search. The WHERE c.user_id = $2 filter is your security boundary. If this filter is accidentally removed or bypassed by a new query path, User A sees User B’s documents. Add an integration test that explicitly verifies cross-tenant isolation on every search endpoint.
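A cross-tenant isolation test might look like the sketch below. To keep it self-contained it uses an in-memory `search_chunks` function as a stand-in for the real SQL search; both that function and the sample data are assumptions. In your project the same assertions would run against the actual search endpoint and a test database.

```python
def dot(a, b):
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


def search_chunks(chunks, user_id, query_embedding, limit=5):
    """In-memory stand-in for the SQL search; the user_id filter is the boundary."""
    scoped = [c for c in chunks if c["user_id"] == user_id]
    ranked = sorted(
        scoped, key=lambda c: dot(c["embedding"], query_embedding), reverse=True
    )
    return ranked[:limit]


def test_cross_tenant_isolation():
    chunks = [
        {"user_id": "user_a", "content": "A's contract", "embedding": [0.0, 1.0]},
        {"user_id": "user_b", "content": "B's payroll", "embedding": [1.0, 0.0]},
    ]
    # Query with a vector that matches user B's chunk far better than user A's.
    results = search_chunks(chunks, "user_a", [1.0, 0.0])
    assert results, "user A should still see their own chunk"
    assert all(c["user_id"] == "user_a" for c in results)
```

The test deliberately uses a query that is a better match for the other tenant's data, because a missing filter is most visible when the leaked row would rank first.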
Extend Your Project
Once the core works, each of these extensions teaches you a new production skill. Pick the one closest to the job you want: a voice-input feature demonstrates real-time media handling, while team workspaces demonstrate authorization modeling.

Ideas to make it even more impressive:
- Add Voice Input: Use Whisper API for voice-to-text
- Multi-Language Support: Translate queries and responses
- Analytics Dashboard: Show usage patterns and popular queries
- Export to Notion/Docs: Let users export conversations
- Team Workspaces: Add collaboration features
- Custom Embeddings: Fine-tune for specific domains
Production Readiness Checklist
Before you call this project “deployed,” walk through this checklist. Each item addresses a real failure mode that has taken down AI SaaS products.

| Category | Check | Why It Matters |
|---|---|---|
| Security | API keys stored in env vars, never in code | One leaked key in a git commit costs you thousands |
| Security | CORS restricted to your domain(s) only | allow_origins=["*"] lets any site call your API |
| Security | Rate limiting enabled per user | One runaway script can exhaust your OpenAI quota |
| Data | Embedding model version tracked per chunk | Model changes silently break similarity search |
| Data | Document upload size limit enforced (both client and server) | A 500MB PDF will OOM your worker |
| Reliability | Background job for document processing with retry | Inline processing blocks the API response and fails silently |
| Reliability | Health check endpoint verifies DB and Redis connectivity | /health returning 200 with a dead database is worse than no health check |
| Cost | Usage tracking with daily budget alerts | You will forget about this until the invoice arrives |
| Cost | GPT-4o-mini used for classification/routing, GPT-4o for generation only | 15x cost difference for tasks where quality is equivalent |
| Observability | Request logging with user ID, model, token count, latency | You cannot debug what you cannot see |
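The health check row is the one most often implemented incorrectly, so here is a minimal sketch of the aggregation logic. It assumes you supply one zero-argument callable per dependency (for example, a function that runs `SELECT 1` or pings Redis). `health_status` is a hypothetical helper, not part of FastAPI.

```python
from typing import Callable


def health_status(
    checks: dict[str, Callable[[], bool]]
) -> tuple[int, dict[str, str]]:
    """Run each dependency check and return (http_status, per-dependency report)."""
    report = {}
    for name, check in checks.items():
        try:
            report[name] = "ok" if check() else "down"
        except Exception as exc:  # a raised error counts as a failed dependency
            report[name] = f"down: {type(exc).__name__}"
    status = 200 if all(value == "ok" for value in report.values()) else 503
    return status, report
```

A `/health` route can return the report as JSON with the computed status code, so a dead database produces a 503 instead of a misleading 200.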
Portfolio Ready
This project demonstrates:
- End-to-end AI product development
- Production architecture patterns
- Modern tech stack proficiency
- Database design with vectors
- API design and authentication
- Frontend development
- Deployment and DevOps
Pro Tip: Deploy this project, add it to your resume, and link your GitHub. This single project can be your ticket to AI engineering roles.