python
from poma.integrations.llamaindex import (
    PomaCheatsheetRetrieverLI,
    PomaChunksetNodeParser,
    PomaFileReader,
)

PomaFileReader

python
PomaFileReader()

Load one file or every supported file under a directory into LlamaIndex Document objects.

Method:

  • load_data(input_path: str | Path) -> list[Document]

Behavior notes:

  • Each output Document includes metadata["source_path"] and metadata["doc_id"].
  • PDF files yield Documents with empty text; their content is ingested later through PrimeCut.
  • Unsupported or unreadable binary files are skipped.
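
The behavior notes above can be illustrated with a small, self-contained sketch. It does not use the POMA library: `SketchDocument` and `sketch_load_data` are hypothetical stand-ins, and the recursive directory walk, the UTF-8 decoding check, and the hash-based `doc_id` are assumptions made only for illustration.

```python
import hashlib
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Union


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a LlamaIndex Document."""
    text: str
    metadata: dict = field(default_factory=dict)


def sketch_load_data(input_path: Union[str, Path]) -> List[SketchDocument]:
    """Load one file, or every readable file under a directory."""
    root = Path(input_path)
    # Assumption: directories are walked recursively, in sorted order.
    files = [root] if root.is_file() else sorted(p for p in root.rglob("*") if p.is_file())
    docs = []
    for path in files:
        if path.suffix.lower() == ".pdf":
            text = ""  # PDFs carry no text at this stage; ingestion is deferred
        else:
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip unreadable or binary files
        # Assumption: doc_id is derived from the resolved path.
        doc_id = hashlib.sha1(str(path.resolve()).encode("utf-8")).hexdigest()[:12]
        docs.append(SketchDocument(text=text, metadata={"source_path": str(path), "doc_id": doc_id}))
    return docs


# Demo: one text file, one PDF, one undecodable binary file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("hello", encoding="utf-8")
    Path(tmp, "b.pdf").write_bytes(b"%PDF-1.4")
    Path(tmp, "c.bin").write_bytes(b"\xff\xfe\x00\x80")
    docs = sketch_load_data(tmp)
```

In this sketch the binary file is dropped, the PDF appears with empty text, and every document carries both metadata keys.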

PomaChunksetNodeParser

python
PomaChunksetNodeParser(*, client: PrimeCut)

Call the POMA API for each input document and return chunkset nodes.

Use the standard parser entrypoint:

  • get_nodes_from_documents(documents, show_progress: bool = False) -> list[BaseNode]

Behavior notes:

  • Input documents must include a valid metadata["source_path"].
  • Output nodes are TextNode values containing chunkset text.
  • Output metadata includes doc_id, chunkset_index, chunkset, chunks, and source_path.
  • The parser excludes metadata fields from embeddings so only chunkset content is embedded.
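
To make the last note concrete, here is a minimal sketch of how excluding metadata keys keeps the embedded text equal to the chunkset content. It is plain Python, not the POMA or LlamaIndex implementation: `SketchNode`, `embed_text`, and `sketch_chunkset_nodes` are hypothetical names, and the metadata values are placeholders whose real shapes are not specified here. In LlamaIndex itself, nodes expose an `excluded_embed_metadata_keys` field that serves this purpose.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Hypothetical stand-in for a LlamaIndex TextNode."""
    text: str
    metadata: dict = field(default_factory=dict)
    excluded_embed_keys: List[str] = field(default_factory=list)

    def embed_text(self) -> str:
        """Return the text an embedding model would see."""
        visible = {k: v for k, v in self.metadata.items() if k not in self.excluded_embed_keys}
        header = "\n".join(f"{k}: {v}" for k, v in visible.items())
        return f"{header}\n\n{self.text}" if header else self.text


def sketch_chunkset_nodes(doc_id: str, source_path: str, chunkset_texts: List[str]) -> List[SketchNode]:
    """Build one node per chunkset and hide every metadata key from embeddings."""
    nodes = []
    for index, text in enumerate(chunkset_texts):
        metadata = {
            "doc_id": doc_id,
            "chunkset_index": index,
            "chunkset": None,  # placeholder: real value comes from the API
            "chunks": None,    # placeholder: real value comes from the API
            "source_path": source_path,
        }
        nodes.append(SketchNode(text=text, metadata=metadata, excluded_embed_keys=list(metadata)))
    return nodes


nodes = sketch_chunkset_nodes("doc-1", "report.pdf", ["First chunkset.", "Second chunkset."])
leaky = SketchNode(text="First chunkset.", metadata={"doc_id": "doc-1"})
```

Here `nodes[0].embed_text()` is just the chunkset text, while the `leaky` node, which excludes nothing, would also embed its `doc_id` line.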

PomaCheatsheetRetrieverLI

python
PomaCheatsheetRetrieverLI(base: BaseRetriever)

Wrap an existing LlamaIndex retriever and turn grouped hits into cheatsheet nodes.

Methods:

  • as_query_engine(**kwargs)
  • the standard retriever flow via .retrieve(...)

Behavior notes:

  • Retrieval groups hits by doc_id.
  • Each grouped result becomes one cheatsheet TextNode.
  • Each returned NodeWithScore keeps the best score among the hits grouped for that document.
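
The grouping and scoring described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the retriever's actual code: `SketchHit` and `sketch_group_hits` are hypothetical names, joining hit texts with newlines is only a placeholder for real cheatsheet generation, and sorting the results by score is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchHit:
    """Hypothetical stand-in for a NodeWithScore."""
    doc_id: str
    text: str
    score: float


def sketch_group_hits(hits: List[SketchHit]) -> List[SketchHit]:
    """Group hits by doc_id and emit one result per document."""
    grouped: Dict[str, List[SketchHit]] = {}
    for hit in hits:
        grouped.setdefault(hit.doc_id, []).append(hit)

    results = []
    for doc_id, doc_hits in grouped.items():
        best = max(h.score for h in doc_hits)  # keep the best score seen
        # Placeholder: a real cheatsheet is built from the chunkset metadata.
        cheatsheet = "\n".join(h.text for h in doc_hits)
        results.append(SketchHit(doc_id=doc_id, text=cheatsheet, score=best))
    # Assumption: results are returned best-first.
    return sorted(results, key=lambda r: r.score, reverse=True)


hits = [
    SketchHit("a", "x", 0.4),
    SketchHit("b", "y", 0.9),
    SketchHit("a", "z", 0.7),
]
results = sketch_group_hits(hits)
```

Three hits across two documents collapse into two results, and document `a` keeps the score 0.7 from its stronger hit.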