Grill API Reference
Reference for every /grill/* endpoint on POMA AI API v3.
- Base URL: https://api.poma-ai.com/v3
- Auth: Authorization: Bearer <token> for every endpoint. The token may be a login token, an account API key, or a project API key for a project with product: "grill". When you authenticate with account-level credentials, add the X-Project-ID header to select the target project (protected projects reject account credentials; use a project API key).
- Spec: Swagger UI
For higher-level walkthroughs, see Ingestion, Retrieval, RetrievalContext format, and Document management.
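The two authentication modes described above can be sketched as a small header builder. This is a minimal illustration, not an official client: the function name `auth_headers` and the placeholder token values are hypothetical, and only the `Authorization` and `X-Project-ID` headers come from this reference.

```python
from typing import Dict, Optional

BASE_URL = "https://api.poma-ai.com/v3"  # base URL stated in this reference


def auth_headers(token: str, project_id: Optional[str] = None) -> Dict[str, str]:
    """Build the auth headers for any /grill/* call.

    Pass project_id only with account-level credentials (login token or
    account API key); omit it when the token is a project API key.
    """
    if not token:
        raise ValueError("token must be non-empty")
    headers = {"Authorization": f"Bearer {token}"}
    if project_id is not None:
        # Account-level credentials do not identify a project by themselves.
        headers["X-Project-ID"] = project_id
    return headers


# Project API key: the key already identifies the project.
project_mode = auth_headers("PROJECT_KEY_PLACEHOLDER")
# Account API key: the extra header selects the target project.
account_mode = auth_headers("ACCOUNT_KEY_PLACEHOLDER", project_id="PROJECT_ID_PLACEHOLDER")
```

Keeping the project id optional mirrors the rule that `X-Project-ID` is only needed with account-level credentials.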
Endpoint summary
| Method | Path | Purpose |
|---|---|---|
POST | /grill/ingest | Submit a file for ingestion into the project's Grill namespace (full pipeline). |
POST | /grill/ingestEco | Same as /grill/ingest but runs the cheaper Eco pipeline. |
POST | /grill/search | Hybrid search across the namespace; returns prompt-ready context. |
POST | /grill/searchInDoc | Same as /grill/search but doc_filter is required. |
GET | /grill/docs | List documents in the namespace. |
GET | /grill/docs/{docId} | Get metadata for one document. |
DELETE | /grill/docs/{docId} | Remove a document's vectors and storage. |
The standard /jobs/{job_id}/... endpoints (status, delete) are reused by Grill ingest jobs — they are documented under the main API.
POST /grill/ingest
Create an ingestion job (full pipeline). Either send the file as raw bytes (application/octet-stream, filename carried in Content-Disposition) or point at a public URL with X-Remote-URL and let the server fetch it. Multipart is not accepted.
Request headers
| Header | Required | Notes |
|---|---|---|
Authorization | yes | Bearer <token> — login token, account API key, or project API key (project must have product: "grill"). |
X-Project-ID | conditional | Required only when authenticating with account-level credentials, to select the target project. Omit when using a project API key. |
Content-Type | conditional | application/octet-stream for a raw-bytes upload. Not needed when using X-Remote-URL. |
Content-Disposition | conditional | attachment; filename="<name>.<ext>". The extension drives parser selection. Not required when using X-Remote-URL. |
X-Remote-URL | no | A publicly accessible URL to fetch the file from instead of uploading bytes. When set, Content-Disposition and the request body are optional. |
X-Labels | no | JSON array of categorical string tags for query-time label filtering. Stored as the document's meta_tags, HMAC'd per-tenant (opaque — the vector DB never sees the plaintext), so matching is equality-only. Max 64 tags, ≤ 128 chars each, ≤ 4096 chars total (else 400). e.g. ["year:1982", "source:treasury"]. Filter at search time with meta_tags_any / meta_tags_all. |
X-Meta-Int-1 | no | Customer-defined integer stored plaintext so it can be range-queried. Recommended convention: Unix epoch seconds. Bounded to the JS-safe int range. Filter with meta_int_1_gte / meta_int_1_lte. |
X-Meta-Int-2 | no | Second customer-defined integer (recommended: revision/version number). Plaintext, range-queryable via meta_int_2_gte / meta_int_2_lte. |
X-Unencrypted-Strings | no | JSON object of plaintext key→value metadata for wildcard/glob filtering. Stored unencrypted (vendor-visible — do not put secrets/PII here; use X-Labels for sensitive tags). Max 32 keys; keys [A-Za-z0-9_] (case-insensitive) ≤ 64 chars, values ≤ 1024 chars, ≤ 16384 chars total. e.g. {"project": "acme-merger", "dept": "legal"}. Filter at search time with unencrypted_strings_match. |
X-Base-URL | no | Base URL used to resolve relative image links in the input file. |
X-Completion | no | URL (and optional headers) to send a completion webhook to when the job finishes. |
X-Labels (opaque) vs. X-Unencrypted-Strings (plaintext). Both attach filterable metadata. X-Labels is HMAC'd (the vector DB never sees the values), so it's safe for sensitive tags, at the cost of equality-only matching (filter via meta_tags_any / meta_tags_all). X-Unencrypted-Strings is stored in plaintext (vendor-visible) precisely so its values can be wildcard-matched (filter via unencrypted_strings_match); never put secrets or PII there. (PrimeCut also accepts X-Labels.)
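As a rough sketch of how the ingest headers fit together, the helper below builds a header set for either a raw-bytes upload or an X-Remote-URL fetch and checks some of the documented limits client-side. The function name and its parameters are hypothetical. Two choices are assumptions rather than documented behavior: the 4096-character label total is measured as the sum of tag lengths, and the 16384-character total for X-Unencrypted-Strings is not checked at all.

```python
import json
import re
from typing import Dict, List, Optional

JS_SAFE_INT = 2**53 - 1  # largest integer a JS number represents exactly
KEY_RE = re.compile(r"[A-Za-z0-9_]{1,64}")


def build_ingest_headers(
    token: str,
    *,
    filename: Optional[str] = None,
    remote_url: Optional[str] = None,
    labels: Optional[List[str]] = None,
    meta_int_1: Optional[int] = None,
    unencrypted_strings: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    if filename is None and remote_url is None:
        raise ValueError("need a filename (raw upload) or a remote_url")
    headers = {"Authorization": f"Bearer {token}"}
    if remote_url is not None:
        # Server fetches the file; no body or Content-Disposition required.
        headers["X-Remote-URL"] = remote_url
    else:
        if '"' in filename:
            raise ValueError("filename must not contain a double quote")
        headers["Content-Type"] = "application/octet-stream"
        headers["Content-Disposition"] = f'attachment; filename="{filename}"'
    if labels is not None:
        if len(labels) > 64 or any(len(tag) > 128 for tag in labels):
            raise ValueError("at most 64 labels of at most 128 chars each")
        if sum(len(tag) for tag in labels) > 4096:  # assumed way of counting
            raise ValueError("labels exceed 4096 chars in total")
        # json.dumps escapes non-ASCII by default, keeping the header ASCII.
        headers["X-Labels"] = json.dumps(labels)
    if meta_int_1 is not None:
        if abs(meta_int_1) > JS_SAFE_INT:
            raise ValueError("meta_int_1 is outside the JS-safe integer range")
        headers["X-Meta-Int-1"] = str(meta_int_1)
    if unencrypted_strings is not None:
        if len(unencrypted_strings) > 32:
            raise ValueError("at most 32 keys")
        for key, value in unencrypted_strings.items():
            if not KEY_RE.fullmatch(key) or len(value) > 1024:
                raise ValueError(f"invalid key or oversized value: {key!r}")
        headers["X-Unencrypted-Strings"] = json.dumps(unencrypted_strings)
    return headers
```

Failing early on the client avoids a round trip that would end in a 400, but the server remains the authority on these limits.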
Response (201 Created) — PublicJob
{
"job_id": "550e8400-e29b-41d4-a716-446655440000",
"created_at": "2026-04-30T10:00:00Z",
"properties": {
"file": { "filename": "manual.pdf", "size": 1048576 }
}
}
Status codes
| Status | Meaning |
|---|---|
201 | Job created. Poll status with GET /jobs/{job_id}/status or stream via GET /status/v1/jobs/{job_id}. |
400 | Bad request (no X-Remote-URL and missing Content-Disposition, unsupported MIME, or empty body). |
401 | Missing/invalid Bearer token. |
403 | Multipart submitted, or the project's product is primecut. |
500 | Server error. |
See Ingestion for the full lifecycle.
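Polling after a 201 could look roughly like the loop below. It is a sketch under assumptions: `fetch_status` is a hypothetical callable that performs GET /jobs/{job_id}/status and returns the decoded JSON, and the payload is assumed to carry a `status` field with the values listed in the PublicJob schema (pending, processing, done, failed). The interval and timeout defaults are arbitrary.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"done", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the job reaches a terminal state or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


# Stand-in for real HTTP calls so the loop can be exercised offline.
fake_states = iter([{"status": "pending"}, {"status": "processing"}, {"status": "done"}])
final = wait_for_job(lambda: next(fake_states), interval_s=0.0, sleep=lambda _s: None)
```

Injecting the fetch and sleep functions keeps the loop independent of any HTTP library. For long-running jobs, the X-Completion webhook avoids polling altogether.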
POST /grill/ingestEco
Identical request shape to /grill/ingest — same headers (Content-Disposition / X-Remote-URL, X-Labels, X-Meta-Int-1, X-Meta-Int-2, X-Unencrypted-Strings, X-Base-URL, X-Completion), same application/octet-stream body, same 201 → PublicJob response and status codes. The only difference is that it runs Grill's Eco pipeline — a cheaper, lighter-weight ingest tier. Documents land in the same namespace and are searchable through /grill/search exactly like full-pipeline ingests.
Use /grill/ingestEco for high-volume or cost-sensitive corpora where the full pipeline's extraction depth isn't required; use /grill/ingest when you want maximum extraction quality.
POST /grill/search
Hybrid retrieval. Returns a prompt-ready RetrievalContext.
Request body — GrillSearchRequest
| Field | Type | Required | Description |
|---|---|---|---|
query | string | yes | Natural-language query. |
doc_filter | string | no | Optional single-document id filter. Use /grill/searchInDoc when this is required. |
doc_ids | array of string | no | Restrict search to this set of document ids. |
exclude_doc_ids | array of string | no | Doc ids to exclude from results (max 100). Useful in agent loops to avoid re-citing docs already shown. |
meta_tags_any | array of string | no | Match documents carrying any of these labels (set at ingest via X-Labels). |
meta_tags_all | array of string | no | Match documents carrying all of these labels. |
meta_int_1_gte / meta_int_1_lte | integer | no | Range filter on X-Meta-Int-1. gte = greater-than-or-equal (≥) lower bound, lte = less-than-or-equal (≤) upper bound; both inclusive, either may be sent alone. |
meta_int_2_gte / meta_int_2_lte | integer | no | Same, for X-Meta-Int-2. |
unencrypted_strings_match | object (string→string) | no | Match on plaintext X-Unencrypted-Strings: a key→glob-pattern map. Case-insensitive Unix glob (* = any run, ? = one char; a pattern with no wildcard is an exact match). Multiple keys AND together. |
return_assets | boolean | no | Return cited docs' figures (and tables, where available) in an assets field keyed by doc_id; images are base64 data URIs, referenced as [IMAGE: name] in the context. |
format | string | no | prompt_ready (default) renders the XML+Markdown context block; json returns structured ranked hits instead. |
min_relevance | float 0..1 | no | Relevance floor (default 0.3). Hits below it are dropped. Higher = stricter, lower = more permissive. |
min_top_relevance | float 0..1 | no | Floor for the top hit. If the best hit scores below this, the entire result set comes back empty. |
target_tokens | integer 100..64000 | no | Soft token budget — the typical answer size (default 6000). Hits are admitted best-first up to this. The knob to size a response. |
max_tokens | integer 100..64000 | no | Hard ceiling (default 16000). Grill expands past target_tokens toward it only for tight multi-document clusters; it's raised up to target_tokens if set lower, so it can't shrink a response on its own. |
expand_tightness | float 0..1 | no | How aggressively the engine expands context around each hit. |
retrieval_tier | string | no | Per-query retrieval tier — standard (fusion only) or advanced (adds a reranker). See Retrieval tiers. |
premium | boolean | no | Legacy flag — true is equivalent to retrieval_tier: "advanced". Prefer retrieval_tier. |
Result count is bounded server-side by relevance and the token budget; there is no top_k parameter.
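A request body combining a few of the fields above might be assembled as follows. The helper names and keyword arguments are hypothetical; the field names, ranges, and defaults are taken from the table. `effective_max_tokens` only restates the documented rule that max_tokens is raised to target_tokens when set lower.

```python
from typing import Any, Dict, List, Optional


def effective_max_tokens(target_tokens: int = 6000, max_tokens: int = 16000) -> int:
    """Ceiling that applies after max_tokens is raised to target_tokens if lower."""
    return max(max_tokens, target_tokens)


def build_search_body(
    query: str,
    *,
    labels_any: Optional[List[str]] = None,
    since_epoch: Optional[int] = None,
    strings_match: Optional[Dict[str, str]] = None,
    exclude_doc_ids: Optional[List[str]] = None,
    target_tokens: Optional[int] = None,
    min_relevance: Optional[float] = None,
) -> Dict[str, Any]:
    if not query:
        raise ValueError("query is required")
    body: Dict[str, Any] = {"query": query}
    if labels_any:
        body["meta_tags_any"] = list(labels_any)
    if since_epoch is not None:
        body["meta_int_1_gte"] = since_epoch  # inclusive lower bound
    if strings_match:
        body["unencrypted_strings_match"] = dict(strings_match)
    if exclude_doc_ids:
        if len(exclude_doc_ids) > 100:
            raise ValueError("exclude_doc_ids accepts at most 100 ids")
        body["exclude_doc_ids"] = list(exclude_doc_ids)
    if target_tokens is not None:
        if not 100 <= target_tokens <= 64000:
            raise ValueError("target_tokens must be within 100..64000")
        body["target_tokens"] = target_tokens
    if min_relevance is not None:
        if not 0.0 <= min_relevance <= 1.0:
            raise ValueError("min_relevance must be within 0..1")
        body["min_relevance"] = min_relevance
    return body


body = build_search_body(
    "How did operating margin change year over year?",
    labels_any=["source:treasury"],
    strings_match={"dept": "legal"},
    target_tokens=4000,
)
```

Omitted fields are left out of the body entirely so the server defaults apply.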
Response (200 OK) — RetrievalContext (default, format: "prompt_ready")
{
"context": "<context><doc id=\"…\">…</doc></context>",
"query": "How did operating margin change year over year?",
"namespace": "project_docs_rag_4f2",
"result_count": 4,
"tokens_estimated": 5820,
"results_dropped": 2,
"detected_lang": "english",
"mode": "advanced",
"search_units": 1
}
context is the prompt-ready block (see RetrievalContext format for the grammar) and the field you'll use 99% of the time. The siblings are metadata: result_count / results_dropped (how many hits made it / were dropped for budget), tokens_estimated (rendered size), detected_lang, mode (retrieval tier used), and search_units (billing). When return_assets: true and a cited doc has figures/tables, an assets object (keyed by doc_id) is also present; images are base64 data URIs referenced as [IMAGE: name] inside context.
With format: "json" the response is a SearchResponse instead: a results array of per-doc hits (doc_id, content, title, canonical_url, pages, score, scores.{ann,bm25,rrf,reranker}, …) plus the same query / namespace / result_count / detected_lang / results_dropped metadata — no rendered context.
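Client code that accepts either response shape could tell them apart by their distinguishing field, as sketched below. The helper names are hypothetical. `seen_doc_ids` illustrates the agent-loop use of exclude_doc_ids mentioned in the request table by collecting doc ids from a format: "json" response; the sample payload is made up for the example.

```python
from typing import Any, Dict, List


def response_kind(payload: Dict[str, Any]) -> str:
    """Classify a search response by the field that only one shape carries."""
    if "context" in payload:
        return "prompt_ready"  # RetrievalContext
    if "results" in payload:
        return "json"  # SearchResponse
    raise ValueError("payload is neither a RetrievalContext nor a SearchResponse")


def seen_doc_ids(payload: Dict[str, Any]) -> List[str]:
    """Doc ids of a SearchResponse in result order, without duplicates."""
    seen: List[str] = []
    for hit in payload.get("results", []):
        doc_id = hit.get("doc_id")
        if doc_id and doc_id not in seen:
            seen.append(doc_id)
    return seen


# Made-up SearchResponse fragment, for illustration only.
sample = {
    "results": [
        {"doc_id": "doc-a", "score": 0.91},
        {"doc_id": "doc-b", "score": 0.74},
    ],
    "result_count": 2,
}
next_exclusions = seen_doc_ids(sample)  # candidate value for exclude_doc_ids
```

The returned list can be passed as exclude_doc_ids on a follow-up query, keeping in mind the documented maximum of 100 ids.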
Status codes
| Status | Meaning |
|---|---|
200 | OK. |
400 | Validation error or upstream Grill 400. |
401 | Missing/invalid token, or upstream 401. |
403 | Upstream Grill 403. |
404 | doc_filter references an unknown doc. |
502 | Other upstream Grill / proxy error. |
503 | Cannot reach the Grill backend. |
POST /grill/searchInDoc
Same shape as /grill/search, but doc_filter is required and must be non-empty. Use this when you want server-side enforcement that retrieval stays inside one document.
Request body — GrillSearchInDocRequest
| Field | Type | Required | Description |
|---|---|---|---|
query | string | yes | Natural-language query. |
doc_filter | string | yes | Document id to restrict search to. |
return_page_images | boolean | no | Deprecated / not available — no-op today (page screenshots will be served via a dedicated endpoint, not inline). |
It also accepts every optional field from /grill/search — doc_ids, exclude_doc_ids, meta_tags_any / meta_tags_all, the meta_int_* range filters, unencrypted_strings_match, return_assets, format, min_relevance, min_top_relevance, target_tokens, max_tokens, expand_tightness, retrieval_tier, and premium — with the same semantics.
Same as /grill/search: result count is bounded by relevance and the token budget; there is no top_k.
Response (200 OK) — RetrievalContext (identical to /grill/search).
Status codes
| Status | Meaning |
|---|---|
200 | OK. |
400 | Missing query, missing/empty doc_filter, or upstream 400. |
401 | Missing/invalid token. |
403 | Upstream Grill 403. |
404 | Doc not found in this project's namespace. |
502 | Upstream / proxy error. |
503 | Grill backend unreachable. |
GET /grill/docs
List documents in the project's namespace.
Response (200 OK) — ListDocsResponse
{
"namespace": "project_docs_rag_4f2",
"total_documents": 3,
"documents": [ DocInfo, … ]
}
| Field | Type | Description |
|---|---|---|
namespace | string | Project namespace identifier in the Grill backend. |
total_documents | integer | Count of documents in the namespace. |
documents | array | DocInfo entries (see below). |
Status codes
| Status | Meaning |
|---|---|
200 | OK. |
401 | Missing/invalid token. |
403 | Upstream Grill 403. |
404 | Upstream Grill 404. |
502 / 503 | Upstream issues. |
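Once a ListDocsResponse has been decoded, looking up a doc_id by filename or summing the per-document counters takes only a few lines. The helpers and the sample listing below are hypothetical; the field names come from the DocInfo schema.

```python
from typing import Any, Dict, List


def doc_ids_for_filename(listing: Dict[str, Any], filename: str) -> List[str]:
    """All doc ids whose original ingest filename matches exactly."""
    return [
        doc["doc_id"]
        for doc in listing.get("documents", [])
        if doc.get("filename") == filename
    ]


def corpus_totals(listing: Dict[str, Any]) -> Dict[str, int]:
    """Sum page and chunk counts over the listed documents."""
    docs = listing.get("documents", [])
    return {
        "documents": len(docs),
        "pages": sum(doc.get("pages", 0) for doc in docs),
        "chunks": sum(doc.get("chunk_count", 0) for doc in docs),
    }


# Made-up listing, for illustration only.
listing = {
    "namespace": "project_docs_rag_4f2",
    "total_documents": 2,
    "documents": [
        {"doc_id": "doc-a", "filename": "manual.pdf", "pages": 12, "chunk_count": 40},
        {"doc_id": "doc-b", "filename": "report.pdf", "pages": 30, "chunk_count": 95},
    ],
}
manual_ids = doc_ids_for_filename(listing, "manual.pdf")
totals = corpus_totals(listing)
```

A doc_id found this way can be used as doc_filter for /grill/searchInDoc or as the path parameter of the per-document endpoints.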
GET /grill/docs/{docId}
Fetch metadata for one document.
Path parameters
| Name | Type | Description |
|---|---|---|
docId | string | The exact doc_id returned by /grill/docs. |
Response (200 OK) — DocInfo
| Field | Type | Description |
|---|---|---|
doc_id | string | Stable identifier within the namespace. |
title | string | Detected title (filename fallback). |
language | string | BCP-47 language code. |
filename | string | Original ingest filename. |
canonical_url | string | Citation URL stamped at ingest by Core (e.g. the resolved target of a short link / redirect when ingesting from a URL). Empty when not applicable. |
pages | integer | Source page count. |
chunkset_count | integer | Chunksets produced. |
chunk_count | integer | Chunks produced. |
image_count | integer | Detected figures/images. |
table_count | integer | Detected tables. |
ingested_at | string (RFC 3339) | Job completion time. |
source_job_id | string | job_id of the originating /grill/ingest. |
bm25_state | string | State of the document's BM25 (lexical) index. |
Status codes
| Status | Meaning |
|---|---|
200 | OK. |
400 | Empty docId. |
401 | Missing/invalid token. |
403 | Upstream Grill 403. |
404 | Doc not found in this namespace. |
502 / 503 | Upstream issues. |
DELETE /grill/docs/{docId}
Remove a document's vectors and storage from the namespace. Project, key, and other documents are unaffected.
Path parameters
| Name | Type | Description |
|---|---|---|
docId | string | Document to delete. |
Response (200 OK) — DeleteDocResponse
| Field | Type | Description |
|---|---|---|
doc_id | string | Echoes the deleted docId. |
vectors_deleted | integer | Vector count removed from the namespace. |
storage_deleted | boolean | Whether stored bytes (assets, page images) were removed. |
Status codes
| Status | Meaning |
|---|---|
200 | OK. |
400 | Empty docId. |
401 | Missing/invalid token. |
403 | Upstream Grill 403. |
404 | Doc not found in this namespace. |
502 / 503 | Upstream issues. |
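A delete call can be prepared with the standard library alone, as in the sketch below. The function name is hypothetical, and percent-encoding the docId in the path is an assumption about what the server expects for ids containing reserved characters. The request is only constructed here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.poma-ai.com/v3"  # base URL stated in this reference


def build_delete_request(token: str, doc_id: str) -> Request:
    """Prepare DELETE /grill/docs/{docId} without sending it."""
    if not doc_id:
        # The API answers 400 for an empty docId; fail before the round trip.
        raise ValueError("doc_id must be non-empty")
    url = f"{BASE_URL}/grill/docs/{quote(doc_id, safe='')}"
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {token}"})


request = build_delete_request("PROJECT_KEY_PLACEHOLDER", "doc-a")
# To send it: urllib.request.urlopen(request), then decode the JSON body
# and inspect vectors_deleted / storage_deleted.
```

Separating construction from sending makes the call easy to log or inspect before a destructive operation runs.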
Schemas (canonical shapes)
These are the same definitions used by the OpenAPI v3 spec. All are JSON.
GrillSearchRequest
{
"query": "string (required)",
"doc_filter": "string", // restrict to one doc
"doc_ids": ["string"], // restrict to a set of docs
"exclude_doc_ids": ["string"], // max 100
"meta_tags_any": ["string"], // match any label (X-Labels)
"meta_tags_all": ["string"], // match all labels
"meta_int_1_gte": 0, "meta_int_1_lte": 0, // X-Meta-Int-1 range
"meta_int_2_gte": 0, "meta_int_2_lte": 0, // X-Meta-Int-2 range
"unencrypted_strings_match": { "path": "legal/contracts/*" }, // glob on X-Unencrypted-Strings
"return_assets": false,
"return_page_images": false, // deprecated, no-op
"format": "prompt_ready", // or "json" for structured hits
"min_relevance": 0.3, // relevance floor, 0..1 (default 0.3)
"min_top_relevance": 0.0, // floor for the top hit; below → empty set
"target_tokens": 6000, // soft budget / answer size (default 6000, 100..64000)
"max_tokens": 16000, // hard ceiling (default 16000, 100..64000)
"expand_tightness": 0.0, // context-expansion aggressiveness, 0..1
"retrieval_tier": "standard", // or "advanced" (reranker)
"premium": false // legacy: true == retrieval_tier "advanced"
}
Every field except query is optional. min_relevance tunes precision (higher = stricter); target_tokens sizes the answer; max_tokens is a ceiling that only matters for tight multi-document clusters (raised up to target_tokens if set lower). meta_tags_*, meta_int_*, and unencrypted_strings_match filter against metadata set at ingest (all combined via AND). retrieval_tier picks the reranking tier; see Retrieval tiers.
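To make the AND semantics of the metadata filters concrete, the function below emulates them on the client against a plain dictionary. It is purely illustrative: the real matching happens server-side (on HMAC'd labels, which this sketch ignores), the `doc` layout with `labels`, `meta_int_1`, and `strings` keys is invented for the example, and treating a missing value as a non-match is an assumption. Python's `fnmatchcase` also understands `[...]` character classes, which this reference does not mention.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict


def matches(doc: Dict[str, Any], filters: Dict[str, Any]) -> bool:
    """Return True only if the document passes every filter present (AND)."""
    tags = set(doc.get("labels", []))
    if "meta_tags_any" in filters and not tags & set(filters["meta_tags_any"]):
        return False
    if "meta_tags_all" in filters and not set(filters["meta_tags_all"]) <= tags:
        return False

    value = doc.get("meta_int_1")
    low = filters.get("meta_int_1_gte")
    high = filters.get("meta_int_1_lte")
    if low is not None and (value is None or value < low):  # inclusive bound
        return False
    if high is not None and (value is None or value > high):  # inclusive bound
        return False

    # Keys and patterns are compared case-insensitively.
    strings = {key.lower(): val for key, val in doc.get("strings", {}).items()}
    for key, pattern in filters.get("unencrypted_strings_match", {}).items():
        stored = strings.get(key.lower())
        if stored is None or not fnmatchcase(stored.lower(), pattern.lower()):
            return False
    return True


doc = {
    "labels": ["year:1982", "source:treasury"],
    "meta_int_1": 400000000,
    "strings": {"project": "acme-merger", "dept": "legal"},
}
```

For this document, a filter of `{"meta_tags_any": ["year:1982"], "unencrypted_strings_match": {"project": "ACME-*"}}` passes, while a pattern of `"acme"` without a wildcard does not, because a pattern with no wildcard is an exact match.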
GrillSearchInDocRequest
Same fields as GrillSearchRequest, except both query and doc_filter are required and doc_filter must be non-empty.
RetrievalContext
{
"context": "string (XML+Markdown — see RetrievalContext format page)",
"query": "string",
"namespace": "string",
"result_count": 0, // hits included in context
"tokens_estimated": 0, // rendered size of context
"results_dropped": 0, // hits dropped for the token budget
"detected_lang": "string",// BM25 query language; null on low confidence
"mode": "string", // retrieval tier actually used
"search_units": 0, // billing units
"assets": { // present only when return_assets=true and assets exist
"<doc_id>": { "images": [], "tables": [] }
}
}
format: "json" instead returns a SearchResponse: { results: [{ doc_id, content, title, canonical_url, pages, chunk_indices, score, scores: { ann, bm25, rrf, reranker } }], query, namespace, result_count, detected_lang, results_dropped }.
DocInfo
{
"doc_id": "string (required)",
"title": "string",
"language": "string",
"filename": "string",
"canonical_url": "string", // citation URL, set by Core at ingest
"pages": 0,
"chunkset_count": 0,
"chunk_count": 0,
"image_count": 0,
"table_count": 0,
"ingested_at": "RFC 3339 timestamp",
"source_job_id": "uuid",
"bm25_state": "string"
}
ListDocsResponse
{
"namespace": "string",
"total_documents": 0,
"documents": [ /* DocInfo */ ]
}
DeleteDocResponse
{
"doc_id": "string",
"vectors_deleted": 0,
"storage_deleted": true
}
PublicJob (for /grill/ingest and /grill/ingestEco)
{
"job_id": "uuid",
"created_at": "RFC 3339 timestamp",
"status": { "job_id": "…", "status": "pending|processing|done|failed", "code": 200 },
"properties": { "file": { "filename": "string", "size": 0 }, "base_url": "string" }
}