Skip to content

Cheatsheets: Structure-Preserving RAG Context

Cheatsheets turn retrieved chunksets into concise, structure-aware prompt context.

Why cheatsheets exist: the chunking/retrieval alignment problem

Hierarchical chunking creates coherent boundaries — chapter → section → paragraph — but those boundaries only pay off if the retrieval layer knows the structure exists. Parent-document retrieval, metadata filters, and structured search tied to a chunk hierarchy all require the vector store, the reranker, and the prompt assembler to be hierarchy-aware. Plug a hierarchical chunker into a flat retrieval pipeline and you still get context windows that miss the surrounding signal: the structure is in the index but invisible at query time.

Cheatsheets are POMA's answer to that alignment cost. Instead of asking the retrieval layer to understand hierarchy, we push structural awareness into a doc-level artifact:

  1. The retrieval layer can stay simple — it just returns chunksets relevant to the query.
  2. generate_cheatsheets(...) assembles those chunksets back into the document's lineage at query time, deduplicating and ordering so each cheatsheet reads as a coherent, prompt-ready block.
  3. The LLM sees structure in the text, not in the index — no parent-doc joins, no metadata filters, no hierarchy-aware reranker required.

Net result: you keep the precision wins of hierarchical chunking without paying the integration tax of a hierarchy-aware retrieval stack.

Usage

Use the top-level generate_cheatsheets(...) helper for new code:

python
from poma import PrimeCut, generate_cheatsheets

client = PrimeCut()
result = client.ingest("example.pdf")

relevant_chunksets = [result.chunksets[0].to_dict()]
all_chunks = [chunk.to_dict() for chunk in result.chunks]

cheatsheets = generate_cheatsheets(
    relevant_chunksets=relevant_chunksets,
    all_chunks=all_chunks,
)

cheatsheet = cheatsheets[0]["content"]
print(cheatsheet)

When your retrieved chunksets span multiple documents, generate_cheatsheets(...) returns one cheatsheet per document:

python
cheatsheets = generate_cheatsheets(
    relevant_chunksets=relevant_chunksets,
    all_chunks=all_chunks,
)

PrimeCut.create_cheatsheet(...) and PrimeCut.create_cheatsheets(...) still exist for compatibility, but both are deprecated.

Input rules

  • relevant_chunksets must contain a chunks list
  • all_chunks must contain the matching chunk content
  • file_id should identify the source document
  • Duplicate chunk_index values within one document will fail validation

This same cheatsheet logic is also used by the bundled Qdrant, LangChain, and LlamaIndex integrations.