Welcome to The RAG School
Where AI looks up real facts before writing the answer.
Why AI Makes Things Up
Without access to real documents, AI fills gaps with plausible-sounding guesses. This is called hallucination.
RAG gives AI an open-book exam: it can look up the real answer before writing anything!
The 3-Step RAG Pipeline
Step through every stage of how RAG works.
Splitting Knowledge Into Chunks
Before storing documents, we split them into chunks. Chunk size affects search quality.
A moderate chunk size gives a good balance of precision and context, and is the recommended starting point.
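Here's a rough sketch of what a splitter could look like (the 500-character size and 50-character overlap are illustrative defaults, not magic numbers; real splitters often also respect sentence boundaries):

def chunk_text(text, size=500, overlap=50):
    # Slide a window across the text. The overlap means a sentence
    # that straddles a boundary is still findable from both sides.
    step = size - overlap
    return [text[start:start + size] for start in range(0, len(text), step)]

# Tiny demo with a small window so the overlap is visible:
for c in chunk_text("Our refund policy allows returns within 14 days.", size=30, overlap=10):
    print(repr(c))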
Vector Search In Action
Type a question and watch semantic search find relevant chunks, even without exact-word matching.
- policy: Our refund policy allows returns within 14 days for store credit.
- payment: All payments are processed securely via Stripe.
- billing: You can upgrade your plan anytime from the billing settings page.
- policy: Refunds for digital products are not available.
- billing: Annual subscriptions include a 20% discount.
- billing: Contact support for billing questions: billing@company.com
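Under the hood, the demo is just comparing vectors. A minimal sketch of the idea, using the same OpenAI embedding model as the walkthrough below plus numpy for cosine similarity (an extra dependency assumed here):

import numpy as np
import openai  # uses OPENAI_API_KEY from the environment

docs = [
    "Our refund policy allows returns within 14 days for store credit.",
    "All payments are processed securely via Stripe.",
    "Refunds for digital products are not available.",
]

def embed(text):
    return np.array(openai.embeddings.create(
        model="text-embedding-3-small",
        input=text
    ).data[0].embedding)

doc_vecs = [embed(d) for d in docs]
q = embed("Can I get my money back?")  # shares no keywords with the docs

# Cosine similarity: higher = closer in meaning.
scores = [q @ v / (np.linalg.norm(q) * np.linalg.norm(v)) for v in doc_vecs]
print(docs[int(np.argmax(scores))])  # a refund chunk wins anyway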
RAG In Code
Hover any glowing token to understand what it does.
Mission: Build the RAG Pipeline
Tap the steps in the correct order to assemble a working RAG pipeline.
Build your first RAG system
Four steps from zero to a document-grounded chatbot.
1. Install dependencies
Grab the OpenAI library and ChromaDB for local vector storage.
pip install openai chromadb
2. Chunk & embed your docs
Split your documents into chunks, then turn each chunk into a vector.
import openai, chromadb

# ChromaDB runs in-process; no server needed for local experiments.
client = chromadb.Client()
col = client.create_collection("docs")

chunks = ["Policy: 14-day returns...", "Billing via Stripe..."]
for i, chunk in enumerate(chunks):
    # Each chunk becomes a vector that captures its meaning.
    emb = openai.embeddings.create(
        model="text-embedding-3-small",
        input=chunk
    ).data[0].embedding
    col.add(ids=[str(i)], embeddings=[emb], documents=[chunk])

3. Search by meaning
Embed the user's question and find the closest chunks.
question = "Can I get a refund?"

# Embed the question with the SAME model used for the chunks.
q_emb = openai.embeddings.create(
    model="text-embedding-3-small",
    input=question
).data[0].embedding

# Ask Chroma for the 3 nearest chunks by vector distance.
results = col.query(
    query_embeddings=[q_emb],
    n_results=3
)
chunks = results["documents"][0]

4. Generate a grounded answer
Pass the retrieved chunks as context so the LLM answers from facts.
context = "\n".join(chunks)

# The retrieved chunks ride along in the system prompt,
# so the model answers from them instead of guessing.
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": f"Answer using: {context}"},
        {"role": "user", "content": question}
    ]
)
print(response.choices[0].message.content)
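Putting the chunks in the system message is the simplest grounding pattern. A stricter instruction, e.g. "answer only from the context below, and say you don't know if it isn't there", usually cuts residual hallucinations further.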
Chat about RAG
Questions about chunking, embeddings, vector databases, or when to use RAG vs fine-tuning?
You're a RAG Researcher now!
You know how to ground AI in real documents: no more hallucinations. Next: learn how to structure AI data with Pydantic.
Grounded Answerer
Deliverable: Retrieve relevant chunks and answer with citations to source snippets.
Stretch: Refuse politely when retrieval confidence is low.
Complete the deliverable first, then unlock the stretch goal.
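One possible shape for the finished answerer, reusing the col collection and imports from the walkthrough above. The 1.2 distance cutoff and the [1]-style citation format are assumptions to tune, not part of the spec (Chroma's default metric is squared L2, so lower means closer):

def grounded_answer(question, max_distance=1.2):
    q_emb = openai.embeddings.create(
        model="text-embedding-3-small",
        input=question
    ).data[0].embedding
    results = col.query(
        query_embeddings=[q_emb],
        n_results=3,
        include=["documents", "distances"],
    )
    docs = results["documents"][0]
    dists = results["distances"][0]

    # Stretch goal: refuse politely when even the best match is far away.
    # The 1.2 cutoff is illustrative; plot your own distances to pick one.
    if not docs or min(dists) > max_distance:
        return "I couldn't find that in our documents, so I'd rather not guess."

    # Number each snippet so the model can cite [1], [2], ...
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer only from these snippets and cite them, "
                        f"e.g. [1]:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(grounded_answer("Can I get a refund?"))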