New · Read these docs from inside your AI tool
poma-docs MCP — one click, zero copy-paste
On any docs page, hit Copy page in the top-right toolbar and pick a target. Connect to Claude, Cursor, Perplexity, or any MCP-aware agent and have these docs available as context while you build.
Welcome to POMA AI
POMA AI is a context engine for retrieval-augmented generation (RAG). Hand us a document; we hand you prompt-ready context for any large language model. No chunking strategy to design, no vector store to run, no retrieval glue to maintain — unless you want to own those layers yourself.
This page is the high-level orientation: which product to pick, what the POMA Console does for you, and where to go next.
Two products, one ingestion pipeline
Both products share the same hierarchical chunking engine. They differ only in what you get back and who runs retrieval.
Grill — managed context engine
Easiest path · single API call
You ingest a document, then ask Grill questions in natural language. Grill returns a prompt-ready context block (XML + Markdown) you can drop straight into an LLM prompt. No vector store to run, no embedding model to choose, no reranker to tune. POMA handles everything from chunking through hybrid retrieval and token budgeting.
Pick Grill when…
- You want a managed RAG endpoint in a single HTTP call.
- You don't want to run your own retrieval stack.
- You're building agent-driven document Q&A and want the LLM to see clean, structured context.
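As a rough illustration of "drop straight into an LLM prompt", the sketch below only shows the prompt-assembly step. It makes no call to Grill: the `build_prompt` helper and the `example_context` string are hypothetical stand-ins, and the real context block's structure is not specified on this page.

```python
def build_prompt(context_block: str, question: str) -> str:
    """Combine a prompt-ready context block with the user's question.

    Hypothetical helper: the context block is treated as an opaque string
    and inserted unchanged, since it is described as already prompt-ready.
    """
    return (
        "Answer the question using only the context below.\n\n"
        f"{context_block.strip()}\n\n"
        f"Question: {question.strip()}\n"
    )


# Placeholder standing in for whatever Grill returns (assumed shape only).
example_context = (
    "<context>\n"
    "## Refund policy\n"
    "Refunds are issued within 14 days.\n"
    "</context>"
)

prompt = build_prompt(example_context, "How long do refunds take?")
print(prompt)
```

Because the block is passed through untouched, any ordering or token budgeting applied upstream is preserved in the final prompt.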
PrimeCut — RAG ingestion engine
Full control · own your retrieval stack
You ingest a document; we hand back a .poma archive of typed chunks and chunksets (root-to-leaf paths through the document's hierarchy). Embed them in your own vector store, run your own retrieval, prompt your own way. POMA's job ends at "structurally perfect chunks." Everything after that is yours.
Pick PrimeCut when…
- You already own a retrieval stack (Qdrant, pgvector, LangChain, LlamaIndex, …).
- You need raw chunks for on-prem or air-gapped processing.
- You want POMA's hierarchical chunking but your own retrieval logic.
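To make "chunksets (root-to-leaf paths through the document's hierarchy)" concrete, here is a minimal sketch that enumerates such paths over a toy tree. The `Node` class and the sample document are assumptions made for illustration; they do not reflect the actual .poma archive layout or any POMA SDK type.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a document hierarchy (hypothetical shape, not the .poma schema)."""
    text: str
    children: list[Node] = field(default_factory=list)


def root_to_leaf_paths(node: Node, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
    """Return every path from the root down to a leaf, in document order."""
    path = prefix + (node.text,)
    if not node.children:
        return [path]  # a leaf closes one complete path
    paths: list[tuple[str, ...]] = []
    for child in node.children:
        paths.extend(root_to_leaf_paths(child, path))
    return paths


# Toy document: a title, two sections, three leaf sentences.
doc = Node("Handbook", [
    Node("Leave", [
        Node("Annual leave is 25 days."),
        Node("Sick leave needs a note."),
    ]),
    Node("Expenses", [
        Node("Submit receipts monthly."),
    ]),
])

paths = root_to_leaf_paths(doc)
for p in paths:
    print(" > ".join(p))
```

Each path carries its ancestors (title, section) along with the leaf text, which is the kind of unit you would embed in your own vector store under this reading of a chunkset.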
Side by side
| | Grill | PrimeCut |
|---|---|---|
| What POMA returns | A RetrievalContext block — XML + Markdown, prompt-ready | A .poma archive — typed chunks, chunksets, images, metadata |
| Who runs retrieval | POMA (hybrid search, sandwich ordering, token budgeting) | You (Qdrant, LangChain, LlamaIndex, your own…) |
| Vector store needed | No | Yes |
| Surface | REST · SDK · MCP · Hosted MCP endpoint | REST · SDK · MCP · CLI |
| API key prefix | poma_prod_gr_… (per project) | poma_acc_… (per account) |
A single account can use both — they don't share a key, but they share a billing relationship.
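Since the two products use separate keys, a small guard can catch a key passed to the wrong product before any request is sent. The sketch below assumes the prefixes shown in the table are literal and stable; `key_product` is a hypothetical helper, not part of any POMA SDK, and the keys used are fake.

```python
# Prefixes as listed in the table above (assumed to be literal).
GRILL_PREFIX = "poma_prod_gr_"    # project-level keys
PRIMECUT_PREFIX = "poma_acc_"     # account-level keys


def key_product(api_key: str) -> str:
    """Guess which product a key belongs to from its prefix alone."""
    if api_key.startswith(GRILL_PREFIX):
        return "grill"
    if api_key.startswith(PRIMECUT_PREFIX):
        return "primecut"
    # Do not echo the key itself in the error message.
    raise ValueError("unrecognised POMA API key prefix")


print(key_product("poma_prod_gr_EXAMPLE"))
print(key_product("poma_acc_EXAMPLE"))
```

A prefix check only classifies the key; it says nothing about whether the key is valid or has been revoked.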
POMA Console
The POMA Console is where you manage everything around the products, without touching code. Sign in with the same credentials you'd use against the API and you can:
- Manage organisations — create orgs, invite teammates, assign roles (owner, admin, member), revoke access.
- Manage projects — every Grill workload lives in a project (the namespace that owns its ingested documents and vectors). Create projects, set their product (grill or primecut), inspect quota and usage.
- Manage API keys — generate account-level keys for PrimeCut and project-level keys for Grill, rotate them, revoke leaked ones.
- Run ingestions interactively — drag a file into the Console to ingest it without writing code. Useful for spot-checks, demos, and one-off cleanups.
- Inspect billing — page counts, quota, plan, invoices.
In short: the Console is the admin surface; the docs you're reading now are the build surface. Most teams use both — operators live in the Console, builders live here.
Where to go next
| If you want to… | Start here |
|---|---|
| Ship a managed RAG endpoint in 10 minutes | Grill quickstart |
| Run your own retrieval stack with POMA chunks | PrimeCut overview |
| Understand the architecture before you build | Concepts / Overview |
| Drive POMA from Claude, Cursor, ChatGPT | MCP servers |
| Drive POMA from the shell | CLI |
| Read the long-form chunking guides | Guides |