Chunking jobs

v1 uses the original background-job flow built around the `Poma` client: you submit a file, receive a job ID, and then poll for the result.

| Method | Use it when |
| --- | --- |
| `start_chunk_file(...)` | You want to submit a file and receive a job ID |
| `get_chunk_result(...)` | You want to poll for the finished result and, optionally, download the archive |

Typical flow

```python
from poma import Poma

client = Poma()

# Submit the file; the response carries the ID of the background job.
start_result = client.start_chunk_file("example.pdf")
job_id = start_result.get("job_id")

if not job_id:
    raise RuntimeError("Failed to receive job ID from server.")

# Block until the job completes, then read the parsed result.
result = client.get_chunk_result(job_id, show_progress=True)
chunks = result.get("chunks", [])
chunksets = result.get("chunksets", [])

print(f"Processed {len(chunks)} chunks and {len(chunksets)} chunksets.")
```

Base URL handling

Pass `base_url` when the input file contains relative links that should resolve against a known origin:

```python
start_result = client.start_chunk_file(
    "page.html",
    base_url="https://docs.example.com",
)
job_id = start_result["job_id"]
```
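
To make "resolve against a known origin" concrete, the sketch below uses Python's standard `urllib.parse.urljoin`. It only illustrates standard URL resolution; whether Poma resolves links in exactly this way is an assumption, and `resolve_links` is a hypothetical helper, not part of the Poma API.

```python
from urllib.parse import urljoin


def resolve_links(base_url: str, links: list[str]) -> list[str]:
    """Resolve each link against base_url; absolute links pass through unchanged."""
    return [urljoin(base_url, link) for link in links]


# Hypothetical links as they might appear inside page.html.
resolved = resolve_links(
    "https://docs.example.com",
    ["guide/intro.html", "/assets/logo.png", "https://other.example.org/page"],
)
for url in resolved:
    print(url)
```

Without a base URL, a relative link such as `guide/intro.html` has no origin to resolve against, which is why the parameter matters for HTML input.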

Polling behavior

`get_chunk_result(...)` polls until the job completes and returns the result as a plain dictionary.
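
For readers unfamiliar with the pattern, a blocking poll can be sketched as below. This is a generic illustration, not Poma's implementation: `poll_until`, its parameters, and the `"running"`/`"done"` states are all hypothetical, and the real client may use different intervals, status values, and error handling.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 1.0,
    timeout: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() repeatedly until is_done(value) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = fetch()
        if is_done(value):
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("Job did not complete before the timeout.")
        sleep(interval)


# Stand-in for a status endpoint: reports "running" twice, then "done".
states = iter(["running", "running", "done"])
final = poll_until(lambda: next(states), lambda s: s == "done", sleep=lambda _: None)
print(final)
```

With `get_chunk_result(...)` you do not need to write such a loop yourself; the sketch only shows what "polls until the job completes" means for the calling code, namely that the call blocks.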

If you pass `download_dir` and `filename`, it also saves the downloaded `.poma` archive to disk and still returns the parsed JSON result.
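
A minimal sketch of the archive option, wrapped in a helper so the whole flow stays in one place. `chunk_and_archive` is a hypothetical name; only `start_chunk_file`, `get_chunk_result`, `download_dir`, and `filename` come from the text above. Whether `filename` should include the `.poma` extension, and whether the client creates `download_dir` itself, are assumptions to verify against your installed version.

```python
from pathlib import Path
from typing import Any


def chunk_and_archive(client: Any, source: str, download_dir: str, filename: str) -> dict:
    """Submit source, wait for the result, and keep the archive on disk."""
    start_result = client.start_chunk_file(source)
    job_id = start_result.get("job_id")
    if not job_id:
        raise RuntimeError("Failed to receive job ID from server.")

    # Assumption: creating the directory up front is harmless even if the client does it too.
    Path(download_dir).mkdir(parents=True, exist_ok=True)

    return client.get_chunk_result(job_id, download_dir=download_dir, filename=filename)


# Usage with the real client (not executed here):
#   from poma import Poma
#   result = chunk_and_archive(Poma(), "example.pdf", "archives", "example.poma")
```

Because the helper accepts any object exposing the two methods, it can be exercised in tests with a stub client instead of a live server.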
