LLM Engineering
Building with large language models: prompting, decoding, inference serving, and cost, from the handbooks down to the softmax that powers every token.
Handbooks 2
The Prompting Handbook
A friendly, hands-on field guide for everyday humans — learn the CRISP framework, spot bad prompts, practice with real recipes, play a drag-and-drop game, and test yourself with a quiz. No code required.
The Senior AI Engineer Interview Handbook
60 questions across architecture, production incidents, agentic systems, RAG, evals, cost, safety, and leadership — what staff-level AI interviewers actually probe for.
Roadmaps 1
AI System Designs 2
Design a Conversational AI
Build a production conversational AI system (think ChatGPT) step by step. See how the request path splits an inference gateway from the model servers, how the context window is assembled and token-budgeted, how conversation memory is stored and recalled, how tokens stream back over a persistent connection, and how guardrails gate every prompt and response — through an interactive diagram that grows with each concept.
Design an LLM Inference Server
Build an LLM inference serving system step by step. See how a request queue absorbs spiky traffic, how the prefill/decode split and continuous batching keep GPUs full, how the KV cache and paged attention make each token cheap, how tensor sharding fits a giant model, and how autoscaling rides demand — all balancing latency against throughput, through an interactive diagram that grows with each concept.
Coding Challenges 1
Interactive Tools 4
RAG Chunking Playground
Drop in any text and compare chunking strategies — fixed-size, recursive, by-sentence, by-paragraph — with overlap highlighted and an estimated token count per chunk. Stop guessing your chunk size; see exactly how your RAG pipeline will split a document.
Context Budget & Cost Planner
Add a system prompt, tool definitions, conversation history, and retrieved context, then watch your context window fill up and see the cost per call, plus the bill at 1k and 1M requests, across model price tiers. An architecture planner, not a toy token counter.
LLM-as-Judge Rubric Builder
Define your evaluation criteria and a scoring scale, then generate a clean, copy-pasteable LLM-as-judge prompt you can drop into your eval pipeline — with the common pitfalls (position bias, verbosity bias, ties) called out. Turns eval theory into a prompt you can ship.
Tool-Schema Designer
Compose a tool/function definition field by field — name, description, parameters, required flags — and export valid tool-use JSON for the Claude and OpenAI formats, with the JSON Schema generated for you. Stop hand-writing function-calling schemas and fighting silent validation errors.