
v1 uses the original background-job flow built around the Poma client: you submit a file, receive a job ID, and then poll for the finished result.

| Method | Use it when |
| --- | --- |
| start_chunk_file(...) | You want to submit a file and receive a job ID |
| get_chunk_result(...) | You want to poll for the finished result and optional archive download |

Typical flow

```python
from poma import Poma

client = Poma()

# Submit the file; the response carries the ID of the background job.
start_result = client.start_chunk_file("example.pdf")
job_id = start_result.get("job_id")

if not job_id:
    raise RuntimeError("Failed to receive job ID from server.")

# Poll until the job completes, then read the parsed result.
result = client.get_chunk_result(job_id, show_progress=True)
chunks = result.get("chunks", [])
chunksets = result.get("chunksets", [])

print(f"Processed {len(chunks)} chunks and {len(chunksets)} chunksets.")
```

Base URL handling

Pass base_url when the input file contains relative links that should resolve against a known origin:

```python
# Assumes `client` is an existing Poma() instance.
start_result = client.start_chunk_file(
    "page.html",
    base_url="https://docs.example.com",  # origin for resolving relative links
)
job_id = start_result["job_id"]
```
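
To illustrate what resolving relative links against a known origin means, here is a minimal sketch using the standard library's urljoin. It does not show how Poma resolves links internally, which this page does not describe; the sample links and the resolve_links helper are hypothetical.

```python
from urllib.parse import urljoin


def resolve_links(base_url, links):
    """Resolve each link against base_url; absolute links pass through unchanged."""
    return [urljoin(base_url, link) for link in links]


# Hypothetical links as they might appear inside page.html.
links = [
    "/guide/intro.html",               # root-relative
    "images/logo.png",                 # path-relative
    "https://other.example.org/page",  # already absolute
]

for resolved in resolve_links("https://docs.example.com", links):
    print(resolved)
```

Without a base URL, the first two links have no origin to resolve against, which is the case base_url is meant to cover.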

Polling behavior

get_chunk_result(...) polls until the job completes and returns plain dictionary data.
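
For readers unfamiliar with the pattern, a generic polling loop looks roughly like the sketch below. This is not Poma's implementation: the fetch_status callable, the "status" key, and the "done" value are all hypothetical stand-ins chosen for illustration.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() repeatedly until it reports completion.

    fetch_status is a hypothetical callable returning a plain dict.
    The "status" key and "done" value are assumptions, not Poma's schema.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status.get("status") == "done":
            return status
        if clock() >= deadline:
            raise TimeoutError("Job did not complete before the timeout.")
        sleep(interval)  # wait before asking again
```

With get_chunk_result(...) this loop is handled for you, so the caller only sees the final dictionary.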

If you pass download_dir and filename, it also stores the downloaded .poma archive on disk while returning the parsed JSON result.
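
A minimal sketch of that call, wrapped in a small helper. The helper name fetch_result_with_archive is hypothetical, and two points are assumptions rather than documented behavior: that download_dir and filename are accepted as keyword arguments, and that the archive ends up at download_dir/filename. The directory is created up front because this page does not say whether the client creates it.

```python
from pathlib import Path


def fetch_result_with_archive(client, job_id, download_dir, filename):
    """Poll for the result and ask the client to store the .poma archive.

    Returns the parsed result and the path where the archive is expected
    (assumed to be download_dir/filename).
    """
    target_dir = Path(download_dir)
    # Assumption: create the directory ourselves in case the client does not.
    target_dir.mkdir(parents=True, exist_ok=True)

    result = client.get_chunk_result(
        job_id,
        download_dir=str(target_dir),
        filename=filename,
    )
    return result, target_dir / filename
```

Usage would look like `result, archive_path = fetch_result_with_archive(client, job_id, "downloads", "example.poma")`, where client is a Poma() instance and job_id comes from start_chunk_file(...).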