New · Read these docs from inside your AI tool

poma-docs MCP — one click, zero copy-paste

On any docs page, hit Copy page in the top-right toolbar and pick a target. Connect to Claude, Cursor, Perplexity, or any MCP-aware agent and have these docs available as context while you build.

Welcome to POMA AI

POMA AI is a context engine for retrieval-augmented generation (RAG). Hand us a document; we hand you prompt-ready context for any large language model. No chunking strategy to design, no vector store to run, no retrieval glue to maintain — unless you want to own those layers yourself.

This page is the high-level orientation: which product to pick, what the POMA Console does for you, and where to go next.

Two products, one ingestion pipeline

Both products share the same hierarchical chunking engine. They differ mainly in what you get back and who runs retrieval.

Grill — managed context engine

Easiest path · single API call

You ingest a document, then ask Grill questions in natural language. Grill returns a prompt-ready context block (XML + Markdown) you can drop straight into an LLM prompt. No vector store to run, no embedding model to choose, no reranker to tune. POMA handles everything from chunking through hybrid retrieval and token budgeting.
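To make "drop straight into an LLM prompt" concrete, here is a minimal sketch of the last step on your side. It makes no POMA API call: `build_prompt` and `example_context` are hypothetical names, and the placeholder context only stands in for whatever block Grill actually returns, whose real layout is described in the Grill docs.

```python
def build_prompt(context_block: str, question: str) -> str:
    """Combine a retrieved context block and a user question into one prompt."""
    return (
        "Answer the question using only the context below.\n\n"
        f"{context_block.strip()}\n\n"
        f"Question: {question.strip()}"
    )


# Placeholder standing in for the context block Grill returns. The tag name
# and layout here are assumptions for illustration, not the real format.
example_context = (
    "<context>\n"
    "## Refund policy\n"
    "Refunds are issued within 14 days.\n"
    "</context>"
)

prompt = build_prompt(example_context, "How long do refunds take?")
print(prompt)
```

Because the block arrives prompt-ready, the glue code can stay this small: no reordering, trimming, or reformatting is attempted here.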

Pick Grill when…

  • You want a managed RAG endpoint in a single HTTP call.
  • You don't want to run your own retrieval stack.
  • You're building agent-driven document Q&A and want the LLM to see clean, structured context.

Open Grill docs → · Quickstart

PrimeCut — RAG ingestion engine

Full control · own your retrieval stack

You ingest a document; we hand back a .poma archive of typed chunks and chunksets (root-to-leaf paths through the document's hierarchy). Embed them in your own vector store, run your own retrieval, prompt your own way. POMA's job ends at "structurally perfect chunks." Everything after that is yours.
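The idea of a chunkset as a root-to-leaf path can be illustrated with a toy tree. This is only a sketch of the concept: the `Node` class, the `root_to_leaf_paths` helper, and the sample document are hypothetical and do not reflect the actual .poma archive schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a toy document hierarchy (hypothetical, not the .poma schema)."""
    text: str
    children: list[Node] = field(default_factory=list)


def root_to_leaf_paths(node: Node, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
    """Return every path from the root down to a leaf, as tuples of node texts."""
    path = prefix + (node.text,)
    if not node.children:
        return [path]  # a leaf closes exactly one path
    paths: list[tuple[str, ...]] = []
    for child in node.children:
        paths.extend(root_to_leaf_paths(child, path))
    return paths


doc = Node("Handbook", [
    Node("Leave", [
        Node("Annual leave is 25 days."),
        Node("Sick leave needs a note."),
    ]),
    Node("Expenses", [
        Node("Submit receipts within 30 days."),
    ]),
])

for path in root_to_leaf_paths(doc):
    print(" > ".join(path))
```

Each path carries a leaf together with its ancestors, so a sentence is never separated from the headings that give it meaning. That is the property that makes such paths a natural unit to embed in your own vector store.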

Pick PrimeCut when…

  • You already own a retrieval stack (Qdrant, pgvector, LangChain, LlamaIndex, …).
  • You need raw chunks for on-prem or air-gapped processing.
  • You want POMA's hierarchical chunking but your own retrieval logic.

Open PrimeCut docs → · Quickstart

Side by side

|  | Grill | PrimeCut |
| --- | --- | --- |
| What POMA returns | A RetrievalContext block (XML + Markdown, prompt-ready) | A .poma archive (typed chunks, chunksets, images, metadata) |
| Who runs retrieval | POMA (hybrid search, sandwich ordering, token budgeting) | You (Qdrant, LangChain, LlamaIndex, your own…) |
| Vector store needed | No | Yes |
| Surface | REST · SDK · MCP · Hosted MCP endpoint | REST · SDK · MCP · CLI |
| API key prefix | poma_prod_gr_… (per project) | poma_acc_… (per account) |

A single account can use both products. They use separate API keys but share a billing relationship.
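Since the two products use different keys, a client that handles both may want to check which kind of key it has been given before sending a request. The sketch below assumes the prefixes listed above are literal and stable. `product_for_key` is a hypothetical helper, not part of any POMA SDK, and the sample keys are made up.

```python
# Prefixes as listed in the comparison above; treating them as fixed
# string prefixes is an assumption of this sketch.
GRILL_PREFIX = "poma_prod_gr_"
PRIMECUT_PREFIX = "poma_acc_"


def product_for_key(api_key: str) -> str:
    """Guess which product a key belongs to from its prefix."""
    if api_key.startswith(GRILL_PREFIX):
        return "grill"      # project-level key
    if api_key.startswith(PRIMECUT_PREFIX):
        return "primecut"   # account-level key
    raise ValueError("unrecognised POMA API key prefix")


# Made-up keys for illustration only.
print(product_for_key("poma_prod_gr_example"))
print(product_for_key("poma_acc_example"))
```

A check like this catches the common mistake of pasting an account-level key into a Grill client (or the reverse) before any network call is made.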

POMA Console

The POMA Console is where you manage everything around the products, without touching code. Sign in with the same credentials you'd use against the API and you can:

  • Manage organisations — create orgs, invite teammates, assign roles (owner, admin, member), revoke access.
  • Manage projects — every Grill workload lives in a project (the namespace that owns its ingested documents and vectors). Create projects, set their product (grill or primecut), inspect quota and usage.
  • Manage API keys — generate account-level keys for PrimeCut and project-level keys for Grill, rotate them, revoke leaked ones.
  • Run ingestions interactively — drag a file into the Console to ingest it without writing code. Useful for spot-checks, demos, and one-off cleanups.
  • Inspect billing — page counts, quota, plan, invoices.

In short: the Console is the admin surface; the docs you're reading now are the build surface. Most teams use both — operators live in the Console, builders live here.

Open the POMA Console →

Where to go next

| If you want to… | Start here |
| --- | --- |
| Ship a managed RAG endpoint in 10 minutes | Grill quickstart |
| Run your own retrieval stack with POMA chunks | PrimeCut overview |
| Understand the architecture before you build | Concepts / Overview |
| Drive POMA from Claude, Cursor, ChatGPT | MCP servers |
| Drive POMA from the shell | CLI |
| Read the long-form chunking guides | Guides |