Grill API Reference

Reference for every /grill/* endpoint on POMA AI API v3.

Base URL: https://api.poma-ai.com/v3
Auth: Authorization: Bearer <token> for every endpoint. The token may be a login token, an account API key, or a project API key for a project with product: "grill". When you authenticate with account-level credentials, add the X-Project-ID header to select the target project (protected projects reject account credentials — use a project API key).
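
The header rules above can be sketched in a few lines of Python. This is a minimal illustration, not an official client: the helper name `auth_headers` and the placeholder token and project values are assumptions made for the example; only the header names and the base URL come from this reference.

```python
from typing import Dict, Optional

BASE_URL = "https://api.poma-ai.com/v3"  # base URL from this reference


def auth_headers(token: str, project_id: Optional[str] = None) -> Dict[str, str]:
    """Build the auth headers shared by every /grill/* call.

    Pass project_id only with account-level credentials (login token or
    account API key). Omit it when the token is a project API key.
    """
    if not token:
        raise ValueError("token must be non-empty")
    headers = {"Authorization": f"Bearer {token}"}
    if project_id is not None:
        # Account-level credentials must select the target project.
        headers["X-Project-ID"] = project_id
    return headers


# Hypothetical placeholder values, for illustration only.
project_key_headers = auth_headers("PROJECT_API_KEY")
account_key_headers = auth_headers("ACCOUNT_API_KEY", project_id="PROJECT_ID")
```

For a protected project, only the first form applies, since account credentials are rejected there.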
Spec: Swagger UI

For higher-level walkthroughs, see Ingestion, Retrieval, RetrievalContext format, and Document management.

Endpoint summary

Method	Path	Purpose
`POST`	`/grill/ingest`	Submit a file for ingestion into the project's Grill namespace (full pipeline).
`POST`	`/grill/ingestEco`	Same as `/grill/ingest` but runs the cheaper Eco pipeline.
`POST`	`/grill/search`	Hybrid search across the namespace; returns prompt-ready context.
`POST`	`/grill/searchInDoc`	Same as `/grill/search` but `doc_filter` is required.
`GET`	`/grill/docs`	List documents in the namespace.
`GET`	`/grill/docs/{docId}`	Get metadata for one document.
`DELETE`	`/grill/docs/{docId}`	Remove a document's vectors and storage.

The standard /jobs/{job_id}/... endpoints (status, delete) are reused by Grill ingest jobs — they are documented under the main API.

`POST /grill/ingest`

Create an ingestion job (full pipeline). Either send the file as raw bytes (application/octet-stream, filename carried in Content-Disposition) or point at a public URL with X-Remote-URL and let the server fetch it. Multipart is not accepted.
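
The two submission modes can be sketched with Python's standard `urllib.request`. The function name `build_ingest_request` and its parameters are hypothetical; the sketch only builds the request object, and sending it (for example with `urllib.request.urlopen`) is left to the caller. Rejecting filenames that contain a double quote is a simplification of this example, not a documented server rule.

```python
import urllib.request
from typing import Optional

BASE_URL = "https://api.poma-ai.com/v3"


def build_ingest_request(
    token: str,
    *,
    filename: Optional[str] = None,
    data: Optional[bytes] = None,
    remote_url: Optional[str] = None,
    eco: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /grill/ingest or /grill/ingestEco."""
    headers = {"Authorization": f"Bearer {token}"}
    if remote_url is not None:
        # Remote mode: the server fetches the file, so no body is sent.
        headers["X-Remote-URL"] = remote_url
        body = None
    else:
        # Raw-bytes mode: the filename travels in Content-Disposition.
        if not filename or not data:
            raise ValueError("raw upload needs a filename and a non-empty body")
        if '"' in filename:
            raise ValueError("this sketch does not escape quotes in filenames")
        headers["Content-Type"] = "application/octet-stream"
        headers["Content-Disposition"] = f'attachment; filename="{filename}"'
        body = data
    path = "/grill/ingestEco" if eco else "/grill/ingest"
    return urllib.request.Request(
        BASE_URL + path, data=body, headers=headers, method="POST"
    )
```

Note that `urllib` normalises the case of header names it stores (HTTP header names are case-insensitive), so `req.get_header("Content-disposition")` is the lookup form on the built object.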

Request headers

Header	Required	Notes
`Authorization`	yes	`Bearer <token>` — login token, account API key, or project API key (project must have `product: "grill"`).
`X-Project-ID`	conditional	Required only when authenticating with account-level credentials, to select the target project. Omit when using a project API key.
`Content-Type`	conditional	`application/octet-stream` for a raw-bytes upload. Not needed when using `X-Remote-URL`.
`Content-Disposition`	conditional	`attachment; filename="<name>.<ext>"`. The extension drives parser selection. Not required when using `X-Remote-URL`.
`X-Remote-URL`	no	A publicly accessible URL to fetch the file from instead of uploading bytes. When set, `Content-Disposition` and the request body are optional.
`X-Labels`	no	JSON array of categorical string tags for query-time label filtering. Stored as the document's `meta_tags`, HMAC'd per-tenant (opaque — the vector DB never sees the plaintext), so matching is equality-only. Max 64 tags, ≤ 128 chars each, ≤ 4096 chars total (else `400`). e.g. `["year:1982", "source:treasury"]`. Filter at search time with `meta_tags_any` / `meta_tags_all`.
`X-Meta-Int-1`	no	Customer-defined integer stored plaintext so it can be range-queried. Recommended convention: Unix epoch seconds. Bounded to the JS-safe int range. Filter with `meta_int_1_gte` / `meta_int_1_lte`.
`X-Meta-Int-2`	no	Second customer-defined integer (recommended: revision/version number). Plaintext, range-queryable via `meta_int_2_gte` / `meta_int_2_lte`.
`X-Unencrypted-Strings`	no	JSON object of plaintext key→value metadata for wildcard/glob filtering. Stored unencrypted (vendor-visible — do not put secrets/PII here; use `X-Labels` for sensitive tags). Max 32 keys; keys `[A-Za-z0-9_]` (case-insensitive) ≤ 64 chars, values ≤ 1024 chars, ≤ 16384 chars total. e.g. `{"project": "acme-merger", "dept": "legal"}`. Filter at search time with `unencrypted_strings_match`.
`X-Base-URL`	no	Base URL used to resolve relative image links in the input file.
`X-Completion`	no	URL (and optional headers) to send a completion webhook to when the job finishes.

X-Labels (opaque) vs. X-Unencrypted-Strings (plaintext). Both attach filterable metadata. X-Labels is HMAC'd — the vector DB never sees the values — so it's safe for sensitive tags, at the cost of equality-only matching (filter via meta_tags_any / meta_tags_all). X-Unencrypted-Strings is stored in plaintext (vendor-visible) precisely so its values can be wildcard-matched (filter via unencrypted_strings_match) — never put secrets or PII there. (PrimeCut also accepts X-Labels.)
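
Checking the documented limits on the client before sending can save a round trip. The sketch below serialises the three kinds of metadata headers; the helper names are hypothetical, and two readings are assumptions of this example: that the "total" limits count the summed characters of the tags (or of keys plus values), and that the JS-safe range means magnitudes up to 2^53 - 1.

```python
import json
import re
from datetime import datetime, timezone
from typing import Dict, List

_KEY_RE = re.compile(r"[A-Za-z0-9_]{1,64}")
MAX_SAFE_INT = 2**53 - 1  # assumed meaning of "JS-safe int range"


def labels_header(tags: List[str]) -> str:
    """Serialise tags for X-Labels, enforcing the documented limits."""
    if len(tags) > 64:
        raise ValueError("X-Labels allows at most 64 tags")
    if any(len(tag) > 128 for tag in tags):
        raise ValueError("each tag must be at most 128 chars")
    # Assumption: the 4096-char total is the summed tag length.
    if sum(len(tag) for tag in tags) > 4096:
        raise ValueError("tags must total at most 4096 chars")
    return json.dumps(tags)  # json.dumps escapes non-ASCII by default


def unencrypted_strings_header(pairs: Dict[str, str]) -> str:
    """Serialise plaintext metadata for X-Unencrypted-Strings."""
    if len(pairs) > 32:
        raise ValueError("at most 32 keys are allowed")
    for key, value in pairs.items():
        if not _KEY_RE.fullmatch(key):
            raise ValueError(f"invalid key: {key!r}")
        if len(value) > 1024:
            raise ValueError(f"value for {key!r} exceeds 1024 chars")
    # Assumption: the 16384-char total counts keys plus values.
    if sum(len(k) + len(v) for k, v in pairs.items()) > 16384:
        raise ValueError("metadata must total at most 16384 chars")
    return json.dumps(pairs)


def epoch_seconds(moment: datetime) -> str:
    """X-Meta-Int-1 in the recommended convention: Unix epoch seconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    value = int(moment.timestamp())
    if abs(value) > MAX_SAFE_INT:
        raise ValueError("value is outside the JS-safe integer range")
    return str(value)


metadata_headers = {
    "X-Labels": labels_header(["year:1982", "source:treasury"]),
    "X-Unencrypted-Strings": unencrypted_strings_header(
        {"project": "acme-merger", "dept": "legal"}
    ),
    "X-Meta-Int-1": epoch_seconds(datetime(1982, 1, 1, tzinfo=timezone.utc)),
}
```

These entries can be merged into the header dictionary of an ingest request alongside `Authorization`.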

Response (201 Created) — PublicJob

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-30T10:00:00Z",
  "properties": {
    "file": { "filename": "manual.pdf", "size": 1048576 }
  }
}
```

Status codes

Status	Meaning
`201`	Job created. Poll status with `GET /jobs/{job_id}/status` or stream via `GET /status/v1/jobs/{job_id}`.
`400`	Bad request (no `X-Remote-URL` and missing `Content-Disposition`, unsupported MIME, or empty body).
`401`	Missing/invalid Bearer token.
`403`	Multipart submitted, or the project's product is `primecut`.
`500`	Server error.

See Ingestion for the full lifecycle.
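
After a `201`, a client typically polls until the job reaches a terminal state. The following is a minimal polling sketch under stated assumptions: that `GET /jobs/{job_id}/status` returns a JSON object with a `status` string taking the values listed in the `PublicJob` schema (`pending`, `processing`, `done`, `failed`), and that a two-second interval is acceptable. The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "https://api.poma-ai.com/v3"
TERMINAL_STATES = {"done", "failed"}


def fetch_status(token: str, job_id: str) -> Dict[str, Any]:
    """GET /jobs/{job_id}/status (response shape is an assumption)."""
    req = urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}/status",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(
    poll: Callable[[], Dict[str, Any]],
    *,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call poll() until the job is done or failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = poll()
        if status.get("status") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)


# Usage (performs network calls, so it is left commented out):
# final = wait_for_job(lambda: fetch_status(token, job_id))
```

Passing the poll function in as a parameter keeps the loop testable without network access; a completion webhook via `X-Completion` avoids polling altogether.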

`POST /grill/ingestEco`

Identical request shape to /grill/ingest — same headers (Content-Disposition / X-Remote-URL, X-Labels, X-Meta-Int-1, X-Meta-Int-2, X-Unencrypted-Strings, X-Base-URL, X-Completion), same application/octet-stream body, same 201 → PublicJob response and status codes. The only difference is that it runs Grill's Eco pipeline — a cheaper, lighter-weight ingest tier. Documents land in the same namespace and are searchable through /grill/search exactly like full-pipeline ingests.

Use /grill/ingestEco for high-volume or cost-sensitive corpora where the full pipeline's extraction depth isn't required; use /grill/ingest when you want maximum extraction quality.

`POST /grill/search`

Hybrid retrieval. Returns a prompt-ready RetrievalContext.

Request body — GrillSearchRequest

Field	Type	Required	Description
`query`	string	yes	Natural-language query.
`doc_filter`	string	no	Optional single-document id filter. Use `/grill/searchInDoc` when this is required.
`doc_ids`	array of string	no	Restrict search to this set of document ids.
`exclude_doc_ids`	array of string	no	Doc ids to exclude from results (max 100). Useful in agent loops to avoid re-citing docs already shown.
`meta_tags_any`	array of string	no	Match documents carrying any of these labels (set at ingest via `X-Labels`).
`meta_tags_all`	array of string	no	Match documents carrying all of these labels.
`meta_int_1_gte` / `meta_int_1_lte`	integer	no	Range filter on `X-Meta-Int-1`. `gte` = greater-than-or-equal (≥) lower bound, `lte` = less-than-or-equal (≤) upper bound; both inclusive, either may be sent alone.
`meta_int_2_gte` / `meta_int_2_lte`	integer	no	Same, for `X-Meta-Int-2`.
`unencrypted_strings_match`	object (string→string)	no	Match on plaintext `X-Unencrypted-Strings`: a key→glob-pattern map. Case-insensitive Unix glob (`*` = any run, `?` = one char; a pattern with no wildcard is an exact match). Multiple keys AND together.
`return_assets`	boolean	no	Return cited docs' figures (and tables, where available) in an `assets` field keyed by `doc_id`; images are base64 data URIs, referenced as `[IMAGE: name]` in the context.
`format`	string	no	`prompt_ready` (default) renders the XML+Markdown context block; `json` returns structured ranked hits instead.
`min_relevance`	float `0..1`	no	Relevance floor (default `0.3`). Hits below it are dropped. Higher = stricter, lower = more permissive.
`min_top_relevance`	float `0..1`	no	Floor for the top hit. If the best hit scores below this, the entire result set comes back empty.
`target_tokens`	integer `100..500000`	no	Soft token budget that sets the typical answer size (default `5000`). Hits are admitted best-first until the budget is reached, so this is the main knob for sizing a response. It can never exceed `max_tokens`; a larger value is clamped down.
`max_tokens`	integer `100..500000`	no	Hard ceiling (default `15000`). Grill expands past `target_tokens` toward it only for tight multi-document clusters. If it is set below `target_tokens`, the soft target is clamped down to match: the hard ceiling always wins, and it never inflates a response on its own.
`expand_tightness`	float `0..1`	no	How aggressively the engine expands context around each hit.
`retrieval_tier`	string	no	Per-query retrieval tier — `standard` (fusion only) or `advanced` (adds a reranker). See Retrieval tiers.
`premium`	boolean	no	Legacy flag — `true` is equivalent to `retrieval_tier: "advanced"`. Prefer `retrieval_tier`.

Result count is bounded server-side by relevance and the token budget — there is no top_k parameter.
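
To make the `unencrypted_strings_match` semantics concrete, here is a small local model of the matching rules described in the table: case-insensitive glob, `*` and `?` as the only wildcards, exact match when no wildcard is present, and multiple keys combined with AND. It is a sketch for reasoning about patterns, not the server's implementation. Treating a missing key as a non-match is an assumption, and `[` is escaped because Python's `fnmatch` would otherwise give it a special meaning that this reference does not mention.

```python
from fnmatch import fnmatchcase
from typing import Dict


def matches(doc_strings: Dict[str, str], patterns: Dict[str, str]) -> bool:
    """Return True when every pattern matches the document's metadata."""
    # Keys and values are compared case-insensitively.
    lowered = {key.lower(): value.lower() for key, value in doc_strings.items()}
    for key, pattern in patterns.items():
        value = lowered.get(key.lower())
        if value is None:
            return False  # assumption: a missing key never matches
        # Keep "[" literal so only "*" and "?" act as wildcards.
        glob = pattern.lower().replace("[", "[[]")
        if not fnmatchcase(value, glob):
            return False
    return True


doc = {"project": "acme-merger", "dept": "Legal"}
print(matches(doc, {"project": "acme-*"}))                 # True
print(matches(doc, {"dept": "leg"}))                       # False
print(matches(doc, {"project": "acme-*", "dept": "l*"}))   # True
```

The second call returns `False` because a pattern without a wildcard must equal the whole value.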

Response (200 OK) — RetrievalContext (default, format: "prompt_ready")

```json
{
  "context": "<context><doc id=\"…\">…</doc></context>",
  "query": "How did operating margin change year over year?",
  "namespace": "project_docs_rag_4f2",
  "result_count": 4,
  "tokens_estimated": 5820,
  "results_dropped": 2,
  "detected_lang": "english",
  "mode": "advanced",
  "search_units": 1
}
```

context is the prompt-ready block (see RetrievalContext format for the grammar) and the field most integrations consume directly. The sibling fields are metadata: result_count and results_dropped (hits included in the context, and hits dropped for the token budget), tokens_estimated (rendered size), detected_lang, mode (the retrieval tier used), and search_units (billing). When return_assets: true and a cited doc has figures or tables, an assets object keyed by doc_id is also present; images are base64 data URIs referenced as [IMAGE: name] inside context.

With format: "json" the response is a SearchResponse instead: a results array of per-doc hits (doc_id, content, title, canonical_url, pages, score, scores.{ann,bm25,rrf,reranker}, …) plus the same query / namespace / result_count / detected_lang / results_dropped metadata — no rendered context.
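
The two response shapes call for slightly different handling on the client. The sketch below shows one way to consume each; the function names and the prompt wording are hypothetical, and the sample payloads are invented for illustration using only field names from this reference.

```python
from typing import Any, Dict, List, Tuple


def to_prompt(response: Dict[str, Any], question: str) -> str:
    """Turn a RetrievalContext (format "prompt_ready") into an LLM prompt."""
    if response.get("result_count", 0) == 0:
        # Nothing passed the relevance floors: say so instead of inventing context.
        return f"No supporting context was retrieved.\n\nQuestion: {question}"
    return f"{response['context']}\n\nQuestion: {question}"


def top_hits(response: Dict[str, Any], limit: int = 3) -> List[Tuple[str, float]]:
    """Pick (doc_id, score) pairs from a SearchResponse (format "json")."""
    # Hits are described as ranked; sorting again is only defensive.
    hits = sorted(response.get("results", []), key=lambda h: h["score"], reverse=True)
    return [(hit["doc_id"], hit["score"]) for hit in hits[:limit]]


# Invented sample payloads, shaped like the documented responses.
prompt_ready = {"context": "<context>...</context>", "result_count": 2}
structured = {
    "results": [
        {"doc_id": "doc-a", "score": 0.81},
        {"doc_id": "doc-b", "score": 0.64},
    ]
}

print(to_prompt(prompt_ready, "What changed?"))
print(top_hits(structured, limit=1))
```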

Status codes

Status	Meaning
`200`	OK.
`400`	Validation error or upstream Grill `400`.
`401`	Missing/invalid token, or upstream `401`.
`403`	Upstream Grill `403`.
`404`	`doc_filter` references an unknown doc.
`502`	Other upstream Grill / proxy error.
`503`	Cannot reach the Grill backend.
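
One way to act on these codes is to retry only the upstream failures. The sketch below assumes, without confirmation from this reference, that `502` and `503` are transient and safe to retry for search calls, while `400`, `401`, `403`, and `404` indicate a problem with the request itself. The function name, attempt count, and backoff values are illustrative choices.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {502, 503}  # assumption: upstream errors are transient


def call_with_retry(
    send: Callable[[], Tuple[int, bytes]],
    *,
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() and retry retryable statuses with exponential backoff.

    send() returns (status_code, body). The last response is returned
    as-is once the attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == attempts - 1:
            return status, body
        sleep(base_delay * 2**attempt)  # 0.5 s, 1 s, 2 s, ...
    raise AssertionError("unreachable")
```

Client errors are returned on the first attempt, so a bad `doc_filter` or an expired token surfaces immediately rather than after several retries.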

`POST /grill/searchInDoc`

Same shape as /grill/search, but doc_filter is required and must be non-empty. Use this when you want server-side enforcement that retrieval stays inside one document.

Request body — GrillSearchInDocRequest

Field	Type	Required	Description
`query`	string	yes	Natural-language query.
`doc_filter`	string	yes	Document id to restrict search to.
`return_page_images`	boolean	no	Deprecated / not available — no-op today (page screenshots will be served via a dedicated endpoint, not inline).

It also accepts every optional field from /grill/search — doc_ids, exclude_doc_ids, meta_tags_any / meta_tags_all, the meta_int_* range filters, unencrypted_strings_match, return_assets, format, min_relevance, min_top_relevance, target_tokens, max_tokens, expand_tightness, retrieval_tier, and premium — with the same semantics.

Same as /grill/search: result count is bounded by relevance and the token budget; there is no top_k.
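
A request body for this endpoint can be assembled and checked locally before it is sent. The helper name `search_in_doc_body` and the sample document id are hypothetical; the client-side checks simply mirror the documented `400` conditions.

```python
import json
from typing import Any


def search_in_doc_body(query: str, doc_filter: str, **optional: Any) -> bytes:
    """Build the JSON body for POST /grill/searchInDoc."""
    if not query:
        raise ValueError("query is required")
    if not doc_filter:
        raise ValueError("doc_filter is required and must be non-empty")
    body = {"query": query, "doc_filter": doc_filter}
    # Pass through optional /grill/search fields, skipping unset ones.
    body.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(body).encode("utf-8")


payload = search_in_doc_body(
    "What is the warranty period?",
    "doc-123",  # hypothetical document id
    target_tokens=2000,
    min_relevance=0.4,
)
```

The bytes can be sent with a `Content-Type: application/json` header, which is the usual choice for a JSON body and is assumed here rather than stated by this reference.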

Response (200 OK) — RetrievalContext (identical to /grill/search).

Status codes

Status	Meaning
`200`	OK.
`400`	Missing `query`, missing/empty `doc_filter`, or upstream `400`.
`401`	Missing/invalid token.
`403`	Upstream Grill `403`.
`404`	Doc not found in this project's namespace.
`502`	Upstream / proxy error.
`503`	Grill backend unreachable.

`GET /grill/docs`

List documents in the project's namespace.

Response (200 OK) — ListDocsResponse

```json
{
  "namespace": "project_docs_rag_4f2",
  "total_documents": 3,
  "documents": [ DocInfo, … ]
}
```

Field	Type	Description
`namespace`	string	Project namespace identifier in the Grill backend.
`total_documents`	integer	Count of documents in the namespace.
`documents`	array	`DocInfo` entries (see below).
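
A listing can be reduced to a quick inventory on the client. The sketch below is illustrative only: the function name and the sample payload are invented, and the `complete` flag exists because this reference does not say whether the listing is ever truncated or paginated.

```python
from typing import Any, Dict


def summarise_listing(listing: Dict[str, Any]) -> Dict[str, Any]:
    """Summarise a ListDocsResponse and index its documents by doc_id."""
    docs = listing.get("documents", [])
    return {
        "namespace": listing["namespace"],
        "listed": len(docs),
        "total": listing["total_documents"],
        # True when every counted document is present in this response.
        "complete": len(docs) == listing["total_documents"],
        "pages": sum(doc.get("pages", 0) for doc in docs),
        "by_id": {doc["doc_id"]: doc for doc in docs},
    }


# Invented sample shaped like the documented response.
sample = {
    "namespace": "project_docs_rag_4f2",
    "total_documents": 2,
    "documents": [
        {"doc_id": "doc-a", "filename": "manual.pdf", "pages": 12},
        {"doc_id": "doc-b", "filename": "faq.md", "pages": 3},
    ],
}
summary = summarise_listing(sample)
```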

Status codes

Status	Meaning
`200`	OK.
`401`	Missing/invalid token.
`403`	Upstream Grill `403`.
`404`	Upstream Grill `404`.
`502` / `503`	Upstream issues.

`GET /grill/docs/{docId}`

Fetch metadata for one document.

Path parameters

Name	Type	Description
`docId`	string	The exact `doc_id` returned by `/grill/docs`.

Response (200 OK) — DocInfo

Field	Type	Description
`doc_id`	string	Stable identifier within the namespace.
`title`	string	Detected title (filename fallback).
`language`	string	BCP-47 language code.
`filename`	string	Original ingest filename.
`canonical_url`	string	Citation URL stamped at ingest by Core (e.g. the resolved target of a short link / redirect when ingesting from a URL). Empty when not applicable.
`pages`	integer	Source page count.
`chunkset_count`	integer	Chunksets produced.
`chunk_count`	integer	Chunks produced.
`image_count`	integer	Detected figures/images.
`table_count`	integer	Detected tables.
`ingested_at`	string (RFC 3339)	Job completion time.
`source_job_id`	string	`job_id` of the originating `/grill/ingest`.
`bm25_state`	string	State of the document's BM25 (lexical) index.
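
For typed access on the client, the fields above map naturally onto a dataclass. This is a sketch rather than an official model: the defaults for absent fields and the decision to ignore unknown keys are assumptions, and the timestamp parsing handles the `Z` suffix shown in the examples.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class DocInfo:
    doc_id: str
    title: str = ""
    language: str = ""
    filename: str = ""
    canonical_url: str = ""
    pages: int = 0
    chunkset_count: int = 0
    chunk_count: int = 0
    image_count: int = 0
    table_count: int = 0
    ingested_at: Optional[datetime] = None
    source_job_id: str = ""
    bm25_state: str = ""

    @classmethod
    def from_json(cls, raw: Dict[str, Any]) -> "DocInfo":
        """Build a DocInfo from a decoded JSON object."""
        known = {f.name for f in fields(cls)}
        # Ignore keys this sketch does not know about.
        data = {key: value for key, value in raw.items() if key in known}
        stamp = data.get("ingested_at")
        if isinstance(stamp, str):
            # fromisoformat() accepts a trailing "Z" only on Python 3.11+.
            data["ingested_at"] = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        return cls(**data)


doc = DocInfo.from_json(
    {"doc_id": "doc-a", "pages": 12, "ingested_at": "2026-04-30T10:00:00Z"}
)
```

A payload without `doc_id` raises `TypeError`, which matches the schema marking that field as required.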

Status codes

Status	Meaning
`200`	OK.
`400`	Empty `docId`.
`401`	Missing/invalid token.
`403`	Upstream Grill `403`.
`404`	Doc not found in this namespace.
`502` / `503`	Upstream issues.

`DELETE /grill/docs/{docId}`

Remove a document's vectors and storage from the namespace. Project, key, and other documents are unaffected.

Path parameters

Name	Type	Description
`docId`	string	Document to delete.

Response (200 OK) — DeleteDocResponse

Field	Type	Description
`doc_id`	string	Echoes the deleted `docId`.
`vectors_deleted`	integer	Vector count removed from the namespace.
`storage_deleted`	boolean	Whether stored bytes (assets, page images) were removed.
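
A delete call and a check of its result might look like the sketch below. The function names are hypothetical, percent-encoding the id is a defensive choice (this reference does not say which characters a `doc_id` may contain), and treating `storage_deleted: false` as a partial delete is an interpretation, not documented behaviour.

```python
import urllib.request
from typing import Any, Dict
from urllib.parse import quote

BASE_URL = "https://api.poma-ai.com/v3"


def build_delete_request(token: str, doc_id: str) -> urllib.request.Request:
    """Build (but do not send) DELETE /grill/docs/{docId}."""
    if not doc_id:
        raise ValueError("docId must be non-empty")  # the server answers 400
    # Encode every reserved character so the id stays one path segment.
    url = f"{BASE_URL}/grill/docs/{quote(doc_id, safe='')}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="DELETE"
    )


def describe_delete(response: Dict[str, Any]) -> str:
    """Render a DeleteDocResponse as a short log line."""
    storage = "storage removed" if response.get("storage_deleted") else "storage NOT removed"
    return f"{response['doc_id']}: {response.get('vectors_deleted', 0)} vectors deleted, {storage}"
```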

Status codes

Status	Meaning
`200`	OK.
`400`	Empty `docId`.
`401`	Missing/invalid token.
`403`	Upstream Grill `403`.
`404`	Doc not found in this namespace.
`502` / `503`	Upstream issues.

Schemas (canonical shapes)

These are the same definitions used by the OpenAPI v3 spec. All are JSON.

`GrillSearchRequest`

```jsonc
{
  "query": "string (required)",
  "doc_filter": "string",         // restrict to one doc
  "doc_ids": ["string"],          // restrict to a set of docs
  "exclude_doc_ids": ["string"],  // max 100
  "meta_tags_any": ["string"],    // match any label (X-Labels)
  "meta_tags_all": ["string"],    // match all labels
  "meta_int_1_gte": 0, "meta_int_1_lte": 0,  // X-Meta-Int-1 range
  "meta_int_2_gte": 0, "meta_int_2_lte": 0,  // X-Meta-Int-2 range
  "unencrypted_strings_match": { "path": "legal/contracts/*" }, // glob on X-Unencrypted-Strings
  "return_assets": false,
  "return_page_images": false,    // deprecated, no-op
  "format": "prompt_ready",       // or "json" for structured hits
  "min_relevance": 0.3,           // relevance floor, 0..1 (default 0.3)
  "min_top_relevance": 0.0,       // floor for the top hit; below → empty set
  "target_tokens": 5000,          // soft budget / answer size (default 5000, 100..500000; ≤ max_tokens)
  "max_tokens": 15000,            // hard ceiling (default 15000, 100..500000)
  "expand_tightness": 0.0,        // context-expansion aggressiveness, 0..1
  "retrieval_tier": "standard",   // or "advanced" (reranker)
  "premium": false                // legacy: true == retrieval_tier "advanced"
}
```

Every field except query is optional. min_relevance tunes precision (higher = stricter); target_tokens sizes the answer; max_tokens is a ceiling that only matters for tight multi-document clusters (set it below target_tokens and the soft target is clamped down to match — the hard ceiling always wins). meta_tags_*, meta_int_*, and unencrypted_strings_match filter against metadata set at ingest (all combined via AND). retrieval_tier picks the reranking tier — see Retrieval tiers.
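
The clamping rule can be written down as a two-line function. This is a local model of the behaviour described above, useful for predicting what a given pair of settings will do; the server applies the rule itself, and the function name is hypothetical.

```python
from typing import Tuple

TOKEN_MIN, TOKEN_MAX = 100, 500000  # documented range for both fields


def effective_budget(target_tokens: int = 5000, max_tokens: int = 15000) -> Tuple[int, int]:
    """Return (soft_target, hard_ceiling) after clamping."""
    for name, value in (("target_tokens", target_tokens), ("max_tokens", max_tokens)):
        if not TOKEN_MIN <= value <= TOKEN_MAX:
            raise ValueError(f"{name} must be within {TOKEN_MIN}..{TOKEN_MAX}")
    # The hard ceiling always wins: the soft target never exceeds it.
    return min(target_tokens, max_tokens), max_tokens


print(effective_budget())              # (5000, 15000)
print(effective_budget(20000, 8000))   # (8000, 8000)
```

In the second call the soft target of 20000 is pulled down to the 8000 ceiling, so raising `target_tokens` alone has no effect until `max_tokens` is raised with it.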

`GrillSearchInDocRequest`

Same fields as GrillSearchRequest, except both query and doc_filter are required and doc_filter must be non-empty.

`RetrievalContext`

```jsonc
{
  "context": "string (XML+Markdown — see RetrievalContext format page)",
  "query": "string",
  "namespace": "string",
  "result_count": 0,         // hits included in context
  "tokens_estimated": 0,     // rendered size of context
  "results_dropped": 0,      // hits dropped for the token budget
  "detected_lang": "string", // BM25 query language; null on low confidence
  "mode": "string",          // retrieval tier actually used
  "search_units": 0,         // billing units
  "assets": {                // present only when return_assets=true and assets exist
    "<doc_id>": { "images": [], "tables": [] }
  }
}
```

format: "json" instead returns a SearchResponse: { results: [{ doc_id, content, title, canonical_url, pages, chunk_indices, score, scores: { ann, bm25, rrf, reranker } }], query, namespace, result_count, detected_lang, results_dropped }.

`DocInfo`

```jsonc
{
  "doc_id": "string (required)",
  "title": "string",
  "language": "string",
  "filename": "string",
  "canonical_url": "string",   // citation URL, set by Core at ingest
  "pages": 0,
  "chunkset_count": 0,
  "chunk_count": 0,
  "image_count": 0,
  "table_count": 0,
  "ingested_at": "RFC 3339 timestamp",
  "source_job_id": "uuid",
  "bm25_state": "string"
}
```

`ListDocsResponse`

```jsonc
{
  "namespace": "string",
  "total_documents": 0,
  "documents": [ /* DocInfo */ ]
}
```

`DeleteDocResponse`

```jsonc
{
  "doc_id": "string",
  "vectors_deleted": 0,
  "storage_deleted": true
}
```

`PublicJob` (for `/grill/ingest`)

```jsonc
{
  "job_id": "uuid",
  "created_at": "RFC 3339 timestamp",
  "status": { "job_id": "…", "status": "pending|processing|done|failed", "code": 200 },
  "properties": { "file": { "filename": "string", "size": 0 }, "base_url": "string" }
}
```
