Grill Ingestion

Grill's ingestion entry point is POST /grill/ingest. Under the hood it runs the same PrimeCut pipeline you would use for raw chunks — parsing, chunking, embedding — but it persists the result inside your project's Grill namespace instead of returning a .poma archive for you to download.

Why a separate endpoint

POST /grill/ingest and POST /primeCut/ingest look almost identical at the wire level. They differ in what happens after the chunks are produced:

| Step | /primeCut/ingest | /grill/ingest |
| --- | --- | --- |
| Parse + chunk | ✅ same pipeline | ✅ same pipeline |
| Embed | Optional, depends on plan | ✅ always |
| Persist to project namespace (vectors + storage) | ❌ | ✅ |
| Make doc available to /grill/search | ❌ | ✅ |
| .poma archive download via /jobs/{job_id}/download | ✅ | ❌ |

If you want chunks to take home, use PrimeCut. If you want the document to be searchable through /grill/search, use Grill.

One project = one product. A project created with product:"primecut" cannot call /grill/ingest, and vice versa. See Create a Grill project.

Wire format

Request body is raw file bytes as application/octet-stream. The filename rides in Content-Disposition:

http
POST /v3/grill/ingest HTTP/1.1
Host: api.poma-ai.com
Authorization: Bearer <project-api-key>
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="manual.pdf"

<raw bytes>

Response (201 Created) is a PublicJob:

json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-30T10:00:00Z",
  "properties": {
    "file": { "filename": "manual.pdf", "size": 1048576 }
  }
}

Multipart (multipart/form-data) is not supported for Grill — it returns 403. Use octet-stream.
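
The wire format above can be reproduced with the Python standard library alone. The following is a minimal sketch, not SDK code: `build_ingest_request` and `parse_public_job` are hypothetical helper names, the `https` scheme and `/v3` base path are assumed from the example request, and the filename escaping is a conservative guess rather than documented behaviour.

```python
import json
import urllib.request

BASE_URL = "https://api.poma-ai.com/v3"  # assumed from the example request above


def build_ingest_request(api_key: str, filename: str, data: bytes,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /grill/ingest request with a raw-bytes body."""
    if not data:
        raise ValueError("empty body: the endpoint rejects it with 400")
    # Escape backslashes and quotes so the quoted filename stays well-formed.
    safe_name = filename.replace("\\", "\\\\").replace('"', '\\"')
    return urllib.request.Request(
        url=f"{base_url}/grill/ingest",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/octet-stream",
            "Content-Disposition": f'attachment; filename="{safe_name}"',
        },
    )


def parse_public_job(body: bytes) -> dict:
    """Pull the fields shown in the sample PublicJob out of a response body."""
    job = json.loads(body)
    file_info = job["properties"]["file"]
    return {
        "job_id": job["job_id"],
        "created_at": job["created_at"],
        "filename": file_info["filename"],
        "size": file_info["size"],
    }
```

Sending the request is then a call to `urllib.request.urlopen(req)`; with the standard library, a 4xx or 5xx response surfaces as `urllib.error.HTTPError`.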

Supported file types

Grill inherits the full PrimeCut format set:

  • Documents: pdf, doc, docx, dotx, rtf, txt, md, html, htm, xml
  • Presentations: ppt, pptx, pps, ppsx, pot, potx, key
  • Spreadsheets: xls, xlsx, xlsb, xltx, csv, numbers, ods, odc
  • Images: png, jpg, jpeg, gif, bmp, tif, tiff, svg, webp, ico, heic, heif, psd
  • Other: epub, mobi, djvu, dwg, dxf, dwf, dwfx, vsd, vsdx, ai, eps, ps, prn, xps, oxps, pub, mdi, pages, odp, odf, odt
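
A client-side pre-check against this list can save a round trip on obviously unsupported files. The sketch below is illustrative only: `is_supported` is a hypothetical helper, it looks at the extension alone, and the server remains the authority on what it accepts.

```python
from pathlib import PurePath

# Extensions copied from the list above, one line per category.
SUPPORTED_EXTENSIONS = frozenset(
    "pdf doc docx dotx rtf txt md html htm xml "
    "ppt pptx pps ppsx pot potx key "
    "xls xlsx xlsb xltx csv numbers ods odc "
    "png jpg jpeg gif bmp tif tiff svg webp ico heic heif psd "
    "epub mobi djvu dwg dxf dwf dwfx vsd vsdx ai eps ps prn xps oxps "
    "pub mdi pages odp odf odt".split()
)


def is_supported(filename: str) -> bool:
    """Return True when the file's extension is in the documented format set."""
    ext = PurePath(filename).suffix.lstrip(".").lower()
    return ext in SUPPORTED_EXTENSIONS
```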

Async lifecycle

Like every POMA job, Grill ingest is asynchronous. The job_id returned by /grill/ingest plugs into the standard status machinery:

text
pending  ──▶ processing  ──▶ done       ◀── searchable from this point
                          └─▶ failed    ◀── error.detail in /status response

| Use case | Endpoint |
| --- | --- |
| One-shot polling | GET /jobs/{job_id}/status |
| Live updates | GET /status/v1/jobs/{job_id} (SSE) |
| Cancel / cleanup | DELETE /jobs/{job_id} (best-effort) |

The job's download link is not populated for Grill jobs — there is no .poma to fetch. Grill stores the artifacts inside its own namespace.
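
The lifecycle above suggests a simple polling loop around GET /jobs/{job_id}/status. This is a sketch under assumptions: `wait_for_job` is a hypothetical helper, the `"status"` key in the response is an assumed field name, and the HTTP call is passed in as a callable so the loop can be exercised without a network.

```python
import time
from typing import Callable

TERMINAL_STATES = {"done", "failed"}  # taken from the lifecycle diagram above


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state and return the last payload.

    fetch_status is any callable that performs GET /jobs/{job_id}/status and
    returns the decoded JSON. The "status" key is an assumed field name.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status(job_id)
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {payload.get('status')!r}")
        sleep(interval)
```

The caller still has to check whether the returned payload says done or failed; only done means the document is searchable.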

What happens at status: done

When the job transitions to done, all of the following become true together (atomically):

  1. The document appears in GET /grill/docs for this project.
  2. GET /grill/docs/{docId} returns its DocInfo (chunk counts, page count, ingest timestamp, source job id).
  3. POST /grill/search and POST /grill/searchInDoc can retrieve passages from it.

docId is the document identifier Grill assigns at ingest. It is derived from the filename (sanitised) plus a project-scoped salt; you find the canonical value in DocInfo.doc_id after the job finishes. Use that exact string for doc_filter and for /grill/docs/{docId} lookups.
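
Because the canonical doc_id should be used verbatim, it is safest to percent-encode it when placing it in a URL path. A minimal sketch follows; `doc_url` is a hypothetical helper, the base URL is assumed from the example request, and whether real docIds ever contain characters that need encoding is not stated here.

```python
from urllib.parse import quote

BASE_URL = "https://api.poma-ai.com/v3"  # assumed from the example request


def doc_url(doc_id: str, base_url: str = BASE_URL) -> str:
    """Return the GET /grill/docs/{docId} URL for a canonical doc_id."""
    # safe="" also encodes "/" so the id always stays a single path segment.
    return f"{base_url}/grill/docs/{quote(doc_id, safe='')}"
```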

Re-ingesting the same file

Re-uploading a file with the same effective docId replaces the existing document — old vectors and storage are discarded. There is no append mode today; an updated PDF fully supersedes the previous version. If you need version history, ingest each version under a distinct filename so the docId differs.
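
One way to keep version history is to tag the filename before upload so each version gets its own docId. The naming scheme below is purely illustrative and `versioned_filename` is a hypothetical helper; any scheme that produces distinct filenames should serve the same purpose.

```python
from pathlib import PurePath


def versioned_filename(filename: str, version: int) -> str:
    """Insert a version tag before the extension: manual.pdf -> manual.v2.pdf."""
    path = PurePath(filename)
    # Only the final name component is kept; directories do not belong in
    # the Content-Disposition filename.
    return f"{path.stem}.v{version}{path.suffix}"
```

The result is what you would pass as the filename in Content-Disposition, for example `versioned_filename("manual.pdf", 2)` for the second revision.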

If you want to remove a doc cleanly before re-ingest, use DELETE /grill/docs/{docId} (see Document management).

Errors you will see

| Status | When | What to do |
| --- | --- | --- |
| 400 | Missing/invalid Content-Disposition, unsupported MIME, empty body | Fix the headers; check the file is non-empty. |
| 401 | Missing or invalid Bearer token | Use a project API key (see Authentication). |
| 403 | Caller's project is primecut, or the request is multipart | Create a Grill project; switch to octet-stream. |
| 500 | Server-side parse failure | Retry once; if it persists, contact support with the job_id. |
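
The guidance in the table above can be encoded as a small client-side policy. This is a sketch, not SDK behaviour: `should_retry` and `describe_error` are hypothetical helpers, and the messages paraphrase the table.

```python
# Recommended action per status code, paraphrasing the table above.
ERROR_ACTIONS = {
    400: "Fix the headers and check the file is non-empty.",
    401: "Use a project API key.",
    403: "Use a Grill project and an octet-stream body.",
    500: "Retry once; if it persists, contact support with the job_id.",
}


def should_retry(status: int, attempts_made: int) -> bool:
    """Retry only a 500, and only once; a 4xx needs a changed request instead.

    attempts_made counts the requests already sent, including the failed one.
    """
    return status == 500 and attempts_made < 2


def describe_error(status: int) -> str:
    """Return the suggested next step for a failed ingest call."""
    return ERROR_ACTIONS.get(status, "Unexpected status; inspect the response body.")
```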

Practical patterns

Bulk ingest a folder.

bash
for f in corpus/*.pdf; do
  curl -sS -X POST "$GRILL/grill/ingest" \
    -H "authorization: Bearer $GRILL_KEY" \
    -H "content-type: application/octet-stream" \
    -H "content-disposition: attachment; filename=\"$(basename "$f")\"" \
    --data-binary "@$f" \
  | jq -c --arg file "$f" '{file: $file, job: .job_id}'
done

Wait for jobs to reach done before issuing search calls — search will return 404 (or simply miss the document) for content that has not finished indexing.

Ingest from the Python SDK.

The recommended path is the SDK's Grill client, which handles the octet-stream framing, polling, and status streaming for you. It reads the project API key from POMA_GRILL_API_KEY.

python
# pip install poma
from poma import Grill

g = Grill()
result = g.ingest("manual.pdf")        # submit + poll + return when done
print(result.job_id, result.status, result.usage)

For batch ingestion, submit all jobs first and collect later — useful when you want submissions to happen quickly and the long wait to happen in parallel:

python
from poma import Grill

g = Grill()
job_ids = [g.submit(p) for p in ["a.pdf", "b.pdf", "c.pdf"]]
results = [g.collect(jid) for jid in job_ids]

Or, in async code, run the waits concurrently with AsyncGrill:

python
import asyncio
from poma import AsyncGrill

async def ingest_all(paths: list[str]) -> None:
    async with AsyncGrill() as g:
        results = await asyncio.gather(*(g.ingest(p) for p in paths))
        for r in results:
            print(r.job_id, r.status)

asyncio.run(ingest_all(["a.pdf", "b.pdf", "c.pdf"]))

Full method signatures: Grill reference, AsyncGrill reference.
