
POMA chunksets

POMA changes the retrieval unit rather than just choosing different cut points. Instead of returning a chunk that may start mid-thought, POMA returns a chunkset: a complete, unbreakable root-to-leaf path through the document hierarchy.

A chunkset contains the leaf sentences you care about together with the breadcrumb trail that tells you what those sentences mean in context.

What a chunkset looks like

Traditional chunking can split a section header away from its content:

  • Chunk 1: ...end of paragraph. 3. Health Insurance
  • Chunk 2: Employees are eligible for...
  • Chunk 3: ...enrollment deadline is December 15. 4. Dental Coverage

A POMA chunkset keeps the breadcrumbs attached:

  • Chunkset: Employee Handbook -> Benefits -> Health Insurance -> Employees are eligible for... enrollment deadline is December 15.
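The guide doesn't show POMA's actual data model, but the shape of a chunkset is easy to sketch: a breadcrumb trail of ancestor headings plus the leaf sentences at the end of the path. The `Chunkset` class and its `render` method below are hypothetical names for illustration, not POMA's API:

```python
from dataclasses import dataclass

# Hypothetical representation of a chunkset; POMA's real data model may differ.
@dataclass
class Chunkset:
    breadcrumbs: list[str]  # root-to-leaf heading trail, e.g. handbook -> section
    leaves: list[str]       # the sentences that live at the end of the path

    def render(self) -> str:
        # Keep the lineage attached to the retrieved text, as in the
        # "Employee Handbook -> Benefits -> Health Insurance -> ..." example.
        return " -> ".join(self.breadcrumbs + [" ".join(self.leaves)])
```

Rendering the example above would then reproduce the single-line chunkset shown in the bullet, with every ancestor heading still attached to the leaf sentences.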

How POMA builds chunksets

POMA's pipeline is:

  1. Parse the document into a clean sentence-by-sentence structure.
  2. Identify hierarchy by assigning each sentence a depth in a tree representation, based on explicit and implicit structure.
  3. Group sentences into chunksets, which are complete and unbreakable root-to-leaf paths.

The retrieved text therefore arrives with its lineage attached: the section, subsection, procedure, or requirement it belongs to.
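Step 3 of the pipeline can be sketched in a few lines. Assume step 2 has already produced a list of `(depth, text)` pairs, where depth 0 is the document root and deeper values are nested headings or body sentences; grouping sentences into root-to-leaf paths is then a single pass with a breadcrumb stack. This is a minimal illustration of the idea, not POMA's implementation:

```python
def build_chunksets(sentences):
    """Group depth-annotated (depth, text) pairs into root-to-leaf chunksets.

    Returns a list of (breadcrumbs, leaves) pairs, where breadcrumbs is the
    tuple of ancestor sentences and leaves are consecutive sentences sharing
    that same trail. A sketch only; the real pipeline handles far more cases.
    """
    chunksets = []
    stack = []  # current breadcrumb trail, one entry per depth level
    for i, (depth, text) in enumerate(sentences):
        del stack[depth:]   # pop back up to this sentence's parent
        stack.append(text)
        next_depth = sentences[i + 1][0] if i + 1 < len(sentences) else -1
        if next_depth > depth:
            continue        # this sentence has children: it's an ancestor, not a leaf
        path = tuple(stack[:-1])
        if chunksets and chunksets[-1][0] == path:
            chunksets[-1][1].append(text)   # same trail: extend the chunkset
        else:
            chunksets.append((path, [text]))
    return chunksets
```

Fed the Employee Handbook example, this yields one chunkset whose breadcrumbs are ("Employee Handbook", "Benefits", "Health Insurance") and whose leaves are the two body sentences, i.e. the path is unbreakable: the leaves never travel without their ancestors.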

Cheatsheets: query-time compression without losing lineage

At query time, POMA assembles the relevant chunksets and compiles them into a per-document cheatsheet: a single, deduplicated, structured block of text optimized for LLM consumption.

That lets the model work with fewer tokens while keeping the context that explains why the retrieved sentence matters.
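One way to see where the token savings come from: when several chunksets from the same document share ancestors, each shared breadcrumb only needs to appear once in the cheatsheet. The deduplication below is a hypothetical sketch of that idea (the function name and indented layout are invented for illustration), not POMA's cheatsheet format:

```python
def compile_cheatsheet(chunksets):
    """Render (breadcrumbs, leaves) pairs as one indented block,
    emitting each shared breadcrumb only once."""
    lines = []
    prev_path = ()
    for path, leaves in chunksets:
        # How much of this trail did the previous chunkset already emit?
        common = 0
        while (common < len(path) and common < len(prev_path)
               and path[common] == prev_path[common]):
            common += 1
        for level in range(common, len(path)):
            lines.append("  " * level + path[level])
        for leaf in leaves:
            lines.append("  " * len(path) + leaf)
        prev_path = path
    return "\n".join(lines)
```

Two chunksets under "Employee Handbook -> Benefits" would thus share a single copy of those two breadcrumbs, while each still reads as a complete path, so the context shrinks without losing lineage.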

To use one example from the original guide, a legal-document query about Andorra's personalized license-plate law needed 1,542 tokens of retrieved context with traditional RAG versus 337 tokens with POMA, with no information loss.

TL;DR

POMA chunksets keep section headers and surrounding hierarchy attached to the retrieved sentence. Cheatsheets then compress those chunksets into structured, deduplicated context for the model.

Continue with the strategy comparison or return to the RAG chunking guide.