
POMA chunksets

POMA changes the retrieval unit rather than just choosing different cut points. Instead of returning a chunk that may start mid-thought, POMA returns a chunkset: a complete, unbreakable root-to-leaf path through the document hierarchy.

A chunkset contains the leaf sentences you care about together with the breadcrumb trail that tells you what those sentences mean in context.

What a chunkset looks like

Traditional chunking can split a section header away from its content:

  • Chunk 1: ...end of paragraph. 3. Health Insurance
  • Chunk 2: Employees are eligible for...
  • Chunk 3: ...enrollment deadline is December 15. 4. Dental Coverage

A POMA chunkset keeps the breadcrumbs attached:

  • Chunkset: Employee Handbook -> Benefits -> Health Insurance -> Employees are eligible for... enrollment deadline is December 15.
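The guide doesn't show POMA's actual data model, but the shape of a chunkset is easy to sketch: a breadcrumb trail of ancestor headings plus the leaf sentences at the end of the path. The `Chunkset` class and its `render` method below are hypothetical names for illustration, not POMA's API:

```python
from dataclasses import dataclass

# Hypothetical representation of a chunkset; POMA's real data model may differ.
@dataclass
class Chunkset:
    breadcrumbs: list[str]  # root-to-leaf heading trail, e.g. handbook -> section
    leaves: list[str]       # the sentences that live at the end of the path

    def render(self) -> str:
        # Keep the lineage attached to the retrieved text, as in the
        # "Employee Handbook -> Benefits -> Health Insurance -> ..." example.
        return " -> ".join(self.breadcrumbs + [" ".join(self.leaves)])
```

Rendering the example above would then reproduce the single-line chunkset shown in the bullet, with every ancestor heading still attached to the leaf sentences.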

How POMA builds chunksets

POMA's pipeline is:

  1. Parse the document into a clean sentence-by-sentence structure.
  2. Identify hierarchy by assigning each sentence a depth in a tree representation, based on explicit and implicit structure.
  3. Group sentences into chunksets, which are complete and unbreakable root-to-leaf paths.

The retrieved text therefore arrives with its lineage attached: the section, subsection, procedure, or requirement it belongs to.
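Step 3 of the pipeline can be sketched in a few lines. Assume step 2 has already produced a list of `(depth, text)` pairs, where depth 0 is the document root and deeper values are nested headings or body sentences; grouping sentences into root-to-leaf paths is then a single pass with a breadcrumb stack. This is a minimal illustration of the idea, not POMA's implementation:

```python
def build_chunksets(sentences):
    """Group depth-annotated (depth, text) pairs into root-to-leaf chunksets.

    Returns a list of (breadcrumbs, leaves) pairs, where breadcrumbs is the
    tuple of ancestor sentences and leaves are consecutive sentences sharing
    that same trail. A sketch only; the real pipeline handles far more cases.
    """
    chunksets = []
    stack = []  # current breadcrumb trail, one entry per depth level
    for i, (depth, text) in enumerate(sentences):
        del stack[depth:]   # pop back up to this sentence's parent
        stack.append(text)
        next_depth = sentences[i + 1][0] if i + 1 < len(sentences) else -1
        if next_depth > depth:
            continue        # this sentence has children: it's an ancestor, not a leaf
        path = tuple(stack[:-1])
        if chunksets and chunksets[-1][0] == path:
            chunksets[-1][1].append(text)   # same trail: extend the chunkset
        else:
            chunksets.append((path, [text]))
    return chunksets
```

Fed the Employee Handbook example, this yields one chunkset whose breadcrumbs are ("Employee Handbook", "Benefits", "Health Insurance") and whose leaves are the two body sentences, i.e. the path is unbreakable: the leaves never travel without their ancestors.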

Cheatsheets: query-time compression without losing lineage

At query time, POMA assembles the relevant chunksets and compiles them into a per-document cheatsheet: a single, deduplicated, structured block of text optimized for LLM consumption.

That lets the model work with fewer tokens while keeping the context that explains why the retrieved sentence matters.
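One way to see where the token savings come from: when several chunksets from the same document share ancestors, each shared breadcrumb only needs to appear once in the cheatsheet. The deduplication below is a hypothetical sketch of that idea (the function name and indented layout are invented for illustration), not POMA's cheatsheet format:

```python
def compile_cheatsheet(chunksets):
    """Render (breadcrumbs, leaves) pairs as one indented block,
    emitting each shared breadcrumb only once."""
    lines = []
    prev_path = ()
    for path, leaves in chunksets:
        # How much of this trail did the previous chunkset already emit?
        common = 0
        while (common < len(path) and common < len(prev_path)
               and path[common] == prev_path[common]):
            common += 1
        for level in range(common, len(path)):
            lines.append("  " * level + path[level])
        for leaf in leaves:
            lines.append("  " * len(path) + leaf)
        prev_path = path
    return "\n".join(lines)
```

Two chunksets under "Employee Handbook -> Benefits" would thus share a single copy of those two breadcrumbs, while each still reads as a complete path, so the context shrinks without losing lineage.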

To use one example from the original guide, a legal-document query about Andorra's personalized license-plate law needed 1,542 tokens of retrieved context with traditional RAG versus 337 tokens with POMA, with no information loss.

TL;DR

POMA chunksets keep section headers and surrounding hierarchy attached to the retrieved sentence. Cheatsheets then compress those chunksets into structured, deduplicated context for the model.

Continue with the strategy comparison or return to the RAG chunking guide.