AI ENGINEERING

RAG & Retrieval

Retrieval-augmented generation end to end — chunking, embeddings, similarity search and ranking — across a system design, a hands-on tool, and runnable challenges.

6 pieces · 4 formats

System Designs 1

System Design

Design a Search Engine

Build a web search engine step by step. See how an inverted index turns search into list lookups, how query parsing mirrors indexing, how an offline indexer builds postings, how BM25 and PageRank rank results, and how document-sharded scatter-gather, result caching and a fresh index scale it to billions of pages — through an interactive diagram that grows with each concept.

Indexing · Ranking · Scalability
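As a rough illustration of how an inverted index turns search into list lookups, here is a minimal Python sketch. It is not taken from the design itself: the names `tokenize`, `build_index` and `search` are hypothetical, tokenization is assumed to be lowercase whitespace splitting, and a query is assumed to mean "all terms must match". The same `tokenize` function is used for documents and queries, which is one simple way query parsing can mirror indexing.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split stands in for a real analyzer.
    return text.lower().split()


def build_index(docs: list[str]) -> dict[str, list[int]]:
    """Map each term to a sorted list of document ids (its postings list)."""
    postings: dict[str, list[int]] = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # set() so a term is recorded once per document.
        for term in set(tokenize(text)):
            postings[term].append(doc_id)
    # doc ids were appended in increasing order, so each list is already sorted.
    return dict(postings)


def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


docs = ["the quick brown fox", "the lazy dog", "quick brown dog"]
index = build_index(docs)
print(search(index, "quick brown"))
```

Ranking (BM25, PageRank) and sharding are left out here; the sketch only covers the lookup step.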

AI System Designs 2

AI System Design

Design a Conversational AI

Build a production conversational AI system (think ChatGPT) step by step. See how the request path splits an inference gateway from the model servers, how the context window is assembled and token-budgeted, how conversation memory is stored and recalled, how tokens stream back over a persistent connection, and how guardrails gate every prompt and response — through an interactive diagram that grows with each concept.

LLM · Inference · Streaming

AI System Design

Design a RAG Pipeline

Build a retrieval-augmented generation pipeline step by step. See how documents are chunked and embedded, how a vector store answers semantic search, how two-stage retrieval with reranking finds the best passages, how the prompt is grounded to stop hallucination, and how evals keep a quietly-drifting index honest — through an interactive diagram that grows with each concept.

RAG · Retrieval · Embeddings

Coding Challenges 2

Challenge

Cosine Similarity

The measure behind every embedding search and RAG system: how aligned are two vectors, ignoring their length? Dot product over the product of magnitudes — 1 identical, 0 orthogonal, -1 opposite. Solve it in Python or TypeScript.

AI Engineering · Embeddings · Math
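The formula described above, dot product over the product of magnitudes, can be sketched in plain Python as follows. This is one possible solution shape, not the challenge's reference answer. The function name and the choice to return 0.0 for a zero-length vector are assumptions; the challenge may specify a different convention.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as having no direction.
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite vectors
```

Because the result is normalized by both magnitudes, scaling either vector leaves the score unchanged, which is what "ignoring their length" means in practice.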

Challenge

Top-K Retrieval

The core of the "R" in RAG: given a query embedding and a set of document embeddings, return the indices of the k most similar docs by cosine similarity, with a stable tie-break. Solve it in Python or TypeScript.

AI Engineering · RAG · Retrieval
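A minimal sketch of the retrieval step described above: score every document against the query by cosine similarity, sort, and keep the first k indices. The names `cosine_similarity` and `top_k` are hypothetical, and the tie-break rule used here (lower index wins on equal scores) is an assumption, since the description only says the tie-break must be stable.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: zero vectors score 0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], docs: Sequence[Sequence[float]], k: int) -> list[int]:
    """Indices of the k docs most similar to query, best first."""
    scored = [(cosine_similarity(query, doc), i) for i, doc in enumerate(docs)]
    # Sort by score descending, then by index ascending so ties are deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[: max(k, 0)]]


docs = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], docs, 2))
```

This brute-force scan is linear in the number of documents. It is fine for a challenge or a small corpus; a vector store would typically replace it with an approximate index at larger scale.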

Interactive Tools 1

Tool

RAG Chunking Playground

Drop in any text and compare chunking strategies — fixed-size, recursive, by-sentence, by-paragraph — with overlap highlighted and an estimated token count per chunk. Stop guessing your chunk size; see exactly how your RAG pipeline will split a document.

AI · RAG · LLM
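To make the fixed-size strategy with overlap concrete, here is a small Python sketch. It is not the playground's implementation. It assumes sizes are measured in characters, and the token estimate uses a rough "about 4 characters per token" heuristic, which is an assumption and not a real tokenizer. The names `chunk_fixed` and `estimate_tokens` are hypothetical.

```python
import math


def chunk_fixed(text: str, size: int, overlap: int) -> list[str]:
    """Split text into chunks of `size` characters, each sharing `overlap`
    characters with the previous chunk."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


def estimate_tokens(chunk: str) -> int:
    # Assumption: ~4 characters per token; a real tokenizer will differ.
    return math.ceil(len(chunk) / 4)


for chunk in chunk_fixed("abcdefghij", size=4, overlap=1):
    print(repr(chunk), estimate_tokens(chunk))
```

Recursive, by-sentence and by-paragraph strategies split on structural boundaries instead of a fixed character count, so they need a different splitting rule in place of the `range` step.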
