Pro Ingestion: High-Accuracy PrimeCut Pipeline

PrimeCut gives you four main ingestion entry points:

| Method | Use it when |
| --- | --- |
| `submit(...)` | You want the job ID and plan to collect later |
| `collect(...)` | You already have a job ID and want typed results |
| `poll(...)` | You need the legacy archive-returning polling helper |
| `ingest(...)` | You want the full submit-and-wait flow in one call |

Recommended flow

python

from poma import PrimeCut

client = PrimeCut()
result = client.ingest("example.pdf", show_progress=True)

Submit first and collect later

python

from poma import PrimeCut

client = PrimeCut()
job_id = client.submit("example.pdf")

# You can store the job ID and collect later.
result = client.collect(job_id, show_progress=True)
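
If the submit and collect steps run in different processes, the job ID has to be persisted somewhere in between. The following is a minimal sketch, assuming a local JSON file is acceptable storage and that the job ID is a JSON-serializable value such as a string. The helper names `save_job_id` and `load_job_id` and the file layout are hypothetical and not part of the SDK.

```python
import json
from pathlib import Path


def save_job_id(path, source, job_id):
    """Record the job ID for a source file in a small JSON ledger."""
    ledger_path = Path(path)
    # Load the existing ledger if there is one, otherwise start empty.
    ledger = json.loads(ledger_path.read_text()) if ledger_path.exists() else {}
    ledger[source] = job_id
    ledger_path.write_text(json.dumps(ledger, indent=2))


def load_job_id(path, source):
    """Return the stored job ID for a source file, or None if unknown."""
    ledger_path = Path(path)
    if not ledger_path.exists():
        return None
    return json.loads(ledger_path.read_text()).get(source)
```

The submitting process could call `save_job_id("jobs.json", "example.pdf", client.submit("example.pdf"))`, and the collecting process could later pass `load_job_id("jobs.json", "example.pdf")` to `client.collect(...)`.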

Base URL handling

Pass `base_url` when the input file contains relative links that should resolve against a known origin:

python

job_id = client.submit("page.html", base_url="https://docs.example.com")
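
To illustrate what resolving against a known origin means, the sketch below uses Python's standard `urllib.parse.urljoin`. It demonstrates general URL resolution rules only. Whether PrimeCut resolves links in exactly this way is an assumption, and the helper `resolve_links` and the example paths are made up for illustration.

```python
from urllib.parse import urljoin


def resolve_links(base_url, links):
    """Resolve each link against base_url; absolute links pass through."""
    return [urljoin(base_url, link) for link in links]


resolved = resolve_links(
    "https://docs.example.com",
    ["/guide/install.html", "images/logo.png", "https://other.example.org/page"],
)
for url in resolved:
    print(url)
# https://docs.example.com/guide/install.html
# https://docs.example.com/images/logo.png
# https://other.example.org/page
```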

Collection behavior

`collect(...)` tries the status SSE (server-sent events) stream first. If streaming is unavailable or the connection drops, the client automatically falls back to adaptive polling.
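
The stream-first behavior described above can be pictured as a simple try-then-fall-back pattern. This is a sketch of the general idea, not the SDK's internals: `collect_with_fallback`, `stream_status`, and `poll_status` are hypothetical names, and using `ConnectionError` to signal a failed stream is an assumption.

```python
def collect_with_fallback(stream_status, poll_status):
    """Try the streaming path first; fall back to polling if it fails.

    Both arguments are hypothetical zero-argument callables that return the
    final job status. stream_status is assumed to raise ConnectionError when
    the stream is unavailable or drops part-way through.
    """
    try:
        return stream_status()
    except ConnectionError:
        # Streaming failed, so switch to the polling path.
        return poll_status()


def broken_stream():
    raise ConnectionError("stream dropped")


def polling():
    return "done"


print(collect_with_fallback(broken_stream, polling))  # prints "done"
```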

`poll(...)` remains available for compatibility. It lengthens its polling interval while a job stays pending or processing, and it returns a `PomaArchive` rather than raw JSON.
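
A growing polling interval can be sketched as a loop that multiplies the wait after each unfinished check. The starting interval, growth factor, cap, and the `"done"` status below are illustrative assumptions, and `poll_until_done` and `fetch_status` are hypothetical names. The actual schedule used by the SDK is not documented here.

```python
import time


def poll_until_done(fetch_status, initial=1.0, factor=1.5, max_interval=30.0,
                    sleep=time.sleep):
    """Poll fetch_status() until the job leaves pending/processing.

    fetch_status is a hypothetical callable returning a status string.
    The wait between polls grows by `factor`, capped at `max_interval`.
    Returns the final status and the list of intervals that were waited.
    """
    interval = initial
    waited = []
    while True:
        status = fetch_status()
        if status not in ("pending", "processing"):
            return status, waited
        sleep(interval)
        waited.append(interval)
        interval = min(interval * factor, max_interval)


# Simulated job that finishes on the fourth check; sleeping is stubbed out.
statuses = iter(["pending", "processing", "processing", "done"])
final_status, waited = poll_until_done(lambda: next(statuses),
                                       sleep=lambda seconds: None)
print(final_status, waited)  # prints: done [1.0, 1.5, 2.25]
```

Passing `sleep` as a parameter keeps the loop testable without real delays.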
