RAG Chunking Playground
Drop in any text and compare chunking strategies — fixed-size, recursive, by-sentence, by-paragraph — with overlap highlighted and an estimated token count per chunk. Stop guessing your chunk size; see exactly how your RAG pipeline will split a document.
Retrieval-augmented generation (RAG) grounds a language model in your own data. Before a model can retrieve a document, the document must be split into chunks and embedded. Chunk size is a balance. Large chunks preserve context but dilute relevance and burn context tokens at query time.
Small chunks are precise but can lose the surrounding meaning a passage depends on. Overlap repeats a little text across chunk boundaries. Without it, a sentence split across two chunks may become unretrievable for either one.
A typical starting point is a few hundred tokens per chunk with ten to twenty percent overlap. The right strategy depends on your content. Prose benefits from sentence or paragraph splitting; code or logs often suit fixed-size windows. Try a few here and watch how the chunks change.
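For prose, sentence and paragraph splitting can be sketched in a few lines. This is a minimal illustration, not the playground's actual code: the function names are hypothetical, paragraphs are assumed to be separated by blank lines, and the sentence rule is a naive regex that will mis-split abbreviations such as "e.g." or "Dr.".

```python
import re

def split_paragraphs(text: str) -> list[str]:
    """Split on blank lines (assumed paragraph separator); drop empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def split_sentences(text: str) -> list[str]:
    """Naive rule: break at whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

sample = "Chunk size is a balance. Large chunks preserve context!\n\nSmall chunks are precise."
print(split_paragraphs(sample))  # two paragraphs
print(split_sentences(sample))   # three sentences
```

Splitting this way keeps each chunk a complete thought, at the cost of uneven chunk sizes.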
How it works
- Fixed: cut every N characters, optionally with character overlap.
- Recursive: prefer paragraph → sentence → word boundaries under the size cap.
- By sentence / paragraph: split on natural language structure.
- Token counts are estimated (~4 chars/token): close enough to size chunks, though exact counts depend on the model's tokenizer.
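The fixed strategy and the token estimate can be sketched as follows. This is an illustrative sketch under stated assumptions, not the tool's implementation: `fixed_chunks` and `estimate_tokens` are hypothetical names, sizes are in characters, and the estimate rounds up, which the tool may or may not do.

```python
import math

def fixed_chunks(text: str, size: int, overlap: int = 0) -> list[str]:
    """Cut every `size` characters; each chunk starts `overlap` characters
    before the previous one ended."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

def estimate_tokens(chunk: str) -> int:
    """Rough estimate at ~4 characters per token (rounding up is an assumption)."""
    return math.ceil(len(chunk) / 4)

chunks = fixed_chunks("abcdefghij", size=4, overlap=1)
print(chunks)                                # ['abcd', 'defg', 'ghij']
print([estimate_tokens(c) for c in chunks])  # [1, 1, 1]
```

Note how the last character of each chunk reappears as the first character of the next: that repeated text is the overlap.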
Frequently asked questions
What is chunking in RAG?
Chunking is splitting a source document into smaller pieces before embedding them for retrieval. Chunk size and overlap directly affect retrieval quality: too large and you dilute relevance and waste context tokens; too small and you lose the surrounding meaning a passage needs.
What chunk size and overlap should I use?
A common starting point is 200–500 tokens per chunk with 10–20% overlap, but the right values depend on your content and embedding model. This tool lets you try sizes and strategies on your own text so you can see the trade-off instead of guessing.
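To make those numbers concrete, the arithmetic can be sketched like this. The helper name `chunk_plan` is hypothetical, the conversion uses the rough ~4 characters per token heuristic rather than a real tokenizer, and the chunk count assumes a simple sliding window that stops once a chunk reaches the end of the document.

```python
import math

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not a real tokenizer

def chunk_plan(doc_chars: int, chunk_tokens: int, overlap_pct: float) -> dict[str, int]:
    """Translate a token budget and an overlap fraction into character settings."""
    size = chunk_tokens * CHARS_PER_TOKEN
    overlap = round(size * overlap_pct)
    step = size - overlap  # how far each new chunk advances
    if size <= 0 or overlap < 0 or step <= 0:
        raise ValueError("need chunk_tokens > 0 and 0 <= overlap_pct < 1")
    if doc_chars <= 0:
        count = 0
    elif doc_chars <= size:
        count = 1
    else:
        count = 1 + math.ceil((doc_chars - size) / step)
    return {"size": size, "overlap": overlap, "step": step, "chunks": count}

# A 10,000-character document, 300-token chunks, 15% overlap.
print(chunk_plan(doc_chars=10_000, chunk_tokens=300, overlap_pct=0.15))
# {'size': 1200, 'overlap': 180, 'step': 1020, 'chunks': 10}
```

Raising the overlap shrinks the step, so the same document produces more chunks and more text to embed.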
What is the difference between fixed and recursive chunking?
Fixed chunking cuts every N characters (or tokens) regardless of structure, which can split sentences mid-thought. Recursive chunking tries to break on natural boundaries first (paragraphs, then sentences, then words), so chunks stay coherent while staying under the size limit.
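A simplified recursive splitter might look like the sketch below. It is one possible reading of the paragraph, sentence, word fallback, not the tool's code or any library's API: the function name, the separator list ("\n\n", ". ", " "), and the character-based size cap are all assumptions, and overlap is left out to keep it short.

```python
def recursive_chunks(text: str, max_chars: int,
                     separators: tuple[str, ...] = ("\n\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first; fall back to finer ones
    only for pieces that still exceed max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if not separators:
        # No natural boundary left: hard cut, like fixed chunking.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    sep, finer = separators[0], separators[1:]
    parts = text.split(sep)
    # Re-attach the separator so sentence-ending punctuation is kept.
    pieces = [p + sep for p in parts[:-1]] + [parts[-1]]
    chunks, current = [], ""
    for piece in pieces:
        if len((current + piece).strip()) <= max_chars:
            current += piece  # still fits: keep merging neighbours
            continue
        if current.strip():
            chunks.append(current.strip())
        current = ""
        if len(piece.strip()) <= max_chars:
            current = piece
        else:
            chunks.extend(recursive_chunks(piece, max_chars, finer))
    if current.strip():
        chunks.append(current.strip())
    return chunks

doc = "One two three. Four five six.\n\nSeven eight nine ten."
print(recursive_chunks(doc, max_chars=20))
# ['One two three.', 'Four five six.', 'Seven eight nine', 'ten.']
```

The first paragraph splits cleanly into sentences; the second has no sentence boundary to use, so it falls back to word boundaries.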
Why does overlap between chunks matter?
Overlap repeats a little text at the boundary of adjacent chunks so a fact that straddles a split is not lost to retrieval. Without overlap, a sentence cut in half can become unretrievable for either chunk.
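A small experiment shows the effect. This is a toy sketch with made-up sample text and hypothetical helper names: it uses character windows and treats a fact as retrievable only if some single chunk contains it in full, which is a stand-in for real embedding-based retrieval.

```python
def windows(text: str, size: int, overlap: int = 0) -> list[str]:
    """Character windows of `size`, each starting `size - overlap` after the last.
    May emit a short trailing window; harmless for this demo."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def is_retrievable(fact: str, chunks: list[str]) -> bool:
    """True if at least one chunk contains the whole fact."""
    return any(fact in chunk for chunk in chunks)

text = "The cache expires after 30 minutes. Refresh tokens last 7 days."
fact = "Refresh tokens last 7 days"

print(is_retrievable(fact, windows(text, size=40)))              # False
print(is_retrievable(fact, windows(text, size=40, overlap=10)))  # True
```

Without overlap the cut falls inside "Refresh", so neither chunk holds the sentence. With a 10-character overlap the second window starts earlier and contains it whole. Overlap only guarantees this for spans no longer than the overlap plus one character; longer spans, like this one, depend on where the boundaries happen to fall.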