python
from poma import PomaArchive

PomaArchive wraps a raw .poma archive. A .poma archive is a zip file that contains the structured SDK output.

In practice, the core archive files are chunks.json and chunksets.json, with optional helpers such as image_sources.json and files inside assets/. Some archives also include additional processing artifacts like final_input.md, metadata_archive.json, metadata_content.json, tables/, and input_prepared/.
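Because a .poma archive is a plain zip file, its contents can be inspected with Python's standard `zipfile` module, without the SDK. The sketch below is illustrative only: the helper names `list_poma_members` and `missing_core_files` are hypothetical, and it assumes the core files sit at the archive root, which the text above does not state explicitly.

```python
import io
import zipfile

# Core file names taken from the description above.
CORE_FILES = ("chunks.json", "chunksets.json")


def list_poma_members(data: bytes) -> list:
    """Return the member names of a .poma archive given as raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def missing_core_files(data: bytes) -> list:
    """Return the core files that are absent (assumes they live at the root)."""
    names = set(list_poma_members(data))
    return [name for name in CORE_FILES if name not in names]


# Build a tiny stand-in archive in memory so the example runs on its own.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("chunks.json", "[]")
    zf.writestr("assets/image_00002.jpeg", b"")
demo_bytes = buffer.getvalue()

members = list_poma_members(demo_bytes)
missing = missing_core_files(demo_bytes)
print(members)
print(missing)
```

A check like this can be useful before handing bytes to `PomaArchive`, for example to report a clearer error when a file is not a complete archive.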

Archive contents

A .poma file is a ZIP archive with the following structure.

Core SDK files

These are the files the SDK relies on when unpacking into `PomaResult`.

Individual chunk records

chunks.json contains the extracted chunk objects. Each record holds the original text, hierarchy depth, page reference, file identity, and optional links back to tables or images.

[
  {
    "content": "## Results",
    "depth": 1,
    "poma_page": 8,
    "chunk_index": 42,
    "file_id": "report-20260317",
    "parent_chunk_index": 41,
    "table": "table_00003",
    "image_name": "image_00002.jpeg",
    "to_embed": "Results"
  }
]
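The record above suggests that chunks form a hierarchy through `parent_chunk_index`. As a minimal sketch, the following code indexes raw chunk records and walks from a chunk up to its root. It assumes that `parent_chunk_index` refers to the `chunk_index` of another record in the same file and that top-level chunks omit the field or set it to null; neither is confirmed by the text. The helper names and the second sample record are hypothetical.

```python
import json


def load_chunks(raw_json: str) -> dict:
    """Index chunk records by their chunk_index."""
    return {record["chunk_index"]: record for record in json.loads(raw_json)}


def ancestor_path(chunks: dict, chunk_index: int) -> list:
    """Return chunk_index values from the root down to the given chunk."""
    chain = []
    seen = set()  # guards against accidental cycles in malformed data
    current = chunk_index
    while current is not None and current in chunks and current not in seen:
        seen.add(current)
        chain.append(current)
        # Assumption: a missing or null parent marks a top-level chunk.
        current = chunks[current].get("parent_chunk_index")
    return list(reversed(chain))


# Illustrative records; only the fields needed for the walk are included.
sample = json.dumps([
    {"content": "# Report", "depth": 0, "chunk_index": 41},
    {"content": "## Results", "depth": 1, "chunk_index": 42,
     "parent_chunk_index": 41},
])

chunks = load_chunks(sample)
path = ancestor_path(chunks, 42)
headings = [chunks[i]["content"] for i in path]
print(path, headings)
```

In normal use the SDK's `unpack()` returns typed objects, so this kind of manual parsing is mainly helpful for debugging or for tools that cannot depend on the SDK.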

Extended processing artifacts

These can be present in richer archive exports, such as the ones shown in the POMA app.

Constructor

python
PomaArchive(
    data: bytes | None = None,
    path = None,
)

Pass either:

  • data for in-memory archive bytes
  • path for an archive on disk

Methods

unpack

python
unpack() -> PomaResult

Unpack the archive and return typed PomaResult data.

unpack() focuses on typed chunks, chunksets, and images, even when the archive also contains extra processing artifacts.
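Since `unpack()` concentrates on the typed data, an extra artifact such as final_input.md may need to be read from the zip directly. The sketch below uses only the standard library; `read_optional_artifact` is a hypothetical helper, not part of the SDK, and the member path is assumed to be at the archive root.

```python
import io
import zipfile
from typing import Optional


def read_optional_artifact(data: bytes, name: str) -> Optional[bytes]:
    """Return the raw bytes of an archive member, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        try:
            return zf.read(name)
        except KeyError:
            # ZipFile.read raises KeyError when the member does not exist.
            return None


# Stand-in archive built in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("chunks.json", "[]")
    zf.writestr("final_input.md", "# Example input\n")
archive_bytes = buffer.getvalue()

markdown = read_optional_artifact(archive_bytes, "final_input.md")
absent = read_optional_artifact(archive_bytes, "metadata_archive.json")
print(markdown, absent)
```

Returning None for a missing member keeps the caller's code simple, since the extended artifacts are optional and many archives will not contain them.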

Example

python
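from poma import PomaArchive

# Open an archive stored on disk.
archive = PomaArchive(path="store/example.poma")

# Unpack into typed PomaResult data (chunks, chunksets, and images).
result = archive.unpack()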