```python
from poma.integrations.qdrant import PomaQdrant
```

PomaQdrant subclasses QdrantClient and adds POMA-specific convenience methods.

Constructor highlights

Use the normal QdrantClient connection arguments plus the POMA-specific settings:

| Argument | Purpose |
| --- | --- |
| collection_name | Default collection name |
| dense_model | Dense embedding model name |
| sparse_model | Sparse model name, or None to disable sparse retrieval |
| dense_name | Dense vector field name |
| sparse_name | Sparse vector field name |
| dense_options | Extra options passed to dense models.Document(...) |
| sparse_options | Extra options passed to sparse models.Document(...) |
| dense_size | Dense vector size |
| distance | Vector distance metric |
| store_chunk_details | Store chunk details in payload |
| auto_create_collection | Create the collection automatically |

If auto_create_collection=True, also set dense_size so the collection can be created with the correct dense vector size.

The default sparse model is "Qdrant/bm25", so PomaQdrant uses hybrid dense+sparse retrieval unless you explicitly set sparse_model=None.
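As a minimal sketch of the constructor settings above, the following assumes the poma package with its Qdrant integration is installed and that location=":memory:" (a standard QdrantClient argument) is suitable for local experimentation. The collection name, dense model name, and vector size of 384 are illustrative assumptions, not library defaults, and build_client is a hypothetical helper.

```python
# Illustrative settings only: the collection name, model name, and size are
# assumptions chosen for this example, not library defaults.
POMA_SETTINGS = {
    "collection_name": "docs",
    "dense_model": "sentence-transformers/all-MiniLM-L6-v2",
    "dense_size": 384,  # set because auto_create_collection=True
    "auto_create_collection": True,
    # sparse_model is not set here, so the documented default "Qdrant/bm25"
    # applies and retrieval is hybrid dense+sparse.
}


def build_client(location: str = ":memory:"):
    """Hypothetical helper: create a PomaQdrant client on demand."""
    # Deferred import so this module loads even where poma is not installed.
    from poma.integrations.qdrant import PomaQdrant

    return PomaQdrant(location=location, **POMA_SETTINGS)
```

Adding "sparse_model": None to the settings would switch the same client to dense-only retrieval.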

Methods

upsert_poma_points

```python
upsert_poma_points(
    chunk_data,
    *,
    collection_name: str | None = None,
    dense_model: str | None = None,
    sparse_model = _UNSET,
    dense_name: str | None = None,
    sparse_name: str | None = None,
    dense_options: dict | None = None,
    sparse_options: dict | None = None,
    store_chunk_details: bool | None = None,
    poma_batch_size: int | None = 100,
    batch_size: int | None = 100,
    **kwargs,
)
```

chunk_data accepts a typed PomaResult, a legacy chunk-data dictionary, or a path to a .poma archive.

Behavior notes:

  • If both poma_batch_size and batch_size are None, the method uses upsert(...).
  • Otherwise it uses upload_points(...) and enables wait=True by default when not provided.
  • The ordering argument is rejected when batching is enabled.
  • kwargs override the convenience arguments when both specify the same setting.
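The dispatch described in the first three behavior notes can be sketched as a small pure function. This is a stand-in for illustration, not the library's code: plan_upsert is a hypothetical name, and the ValueError raised for ordering is an assumption about the exception type.

```python
def plan_upsert(poma_batch_size=100, batch_size=100, **kwargs):
    """Return (method_name, call_kwargs) following the documented notes."""
    batching = not (poma_batch_size is None and batch_size is None)
    if not batching:
        # Both batch sizes are None: a single upsert(...) call is used.
        return "upsert", dict(kwargs)
    if "ordering" in kwargs:
        # Assumed exception type; the notes only say ordering is rejected.
        raise ValueError("ordering is not supported when batching is enabled")
    call_kwargs = dict(kwargs)
    call_kwargs.setdefault("wait", True)  # wait=True unless the caller set it
    return "upload_points", call_kwargs


print(plan_upsert())                                       # batched path
print(plan_upsert(poma_batch_size=None, batch_size=None))  # single upsert
```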

get_cheatsheets

```python
get_cheatsheets(
    *,
    query: str | None = None,
    results = None,
    query_obj = None,
    prefetch = None,
    using: str | None = None,
    collection_name: str | None = None,
    chunk_data = None,
    limit: int = 5,
    query_filter = None,
    with_vectors: bool | list[str] = False,
    dense_model: str | None = None,
    sparse_model = _UNSET,
    dense_name: str | None = None,
    sparse_name: str | None = None,
    dense_options: dict | None = None,
    sparse_options: dict | None = None,
    prefetch_limit: int | None = None,
    **kwargs,
) -> list[dict]
```

Use query for the convenience path. Use query_obj and prefetch when you want raw Qdrant query control.

Behavior notes:

  • Exactly one mode must be used: results, query_obj, or query.
  • When query is used and sparse_model is enabled, the method runs hybrid RRF search with dense and sparse prefetch queries.
  • When query is used and sparse_model=None, the method runs dense-only retrieval.
  • chunk_data can be a dict, PomaResult, or .poma path. When provided, its chunks are used to build cheatsheets instead of payload chunk_details.
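To make the hybrid mode more concrete, here is a minimal sketch of reciprocal rank fusion (RRF) over a dense and a sparse ranking. It uses the textbook formula with the commonly cited constant k = 60; in PomaQdrant the fusion presumably happens inside Qdrant, whose constants and tie-breaking may differ. The function name and sample IDs are hypothetical.

```python
def rrf_fuse(rankings, k=60, limit=5):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) per ID, rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], str(pid)))
    return ordered[:limit]


dense_hits = ["a", "b", "c"]   # stand-in for the dense prefetch results
sparse_hits = ["b", "d", "a"]  # stand-in for the sparse prefetch results
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

IDs that appear in both rankings ("a" and "b") outrank IDs found by only one retriever, which is the main reason to fuse the two result lists.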

Advanced helper functions

Import the lower-level helpers directly when you want to work with Qdrant points or results without using PomaQdrant state:

```python
from poma.integrations.qdrant.qdrant_poma_utils import (
    cheatsheets_from_results,
    chunk_uuid_string,
    ensure_collection,
    points_from_chunk_data,
    prepare_points_from_chunk_data,
    results_to_cheatsheet_inputs,
)
```

chunk_uuid_string

```python
chunk_uuid_string(file_id: str, chunkset_index: int) -> str
```

Build a deterministic UUID for one (file_id, chunkset_index) pair.
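One common way to derive a deterministic UUID from such a pair is a name-based UUID (version 5). The sketch below illustrates the idea only; the namespace and the "file_id:index" key format are assumptions, and the library's actual scheme may differ, so these IDs should not be expected to match those produced by chunk_uuid_string.

```python
import uuid

# Hypothetical namespace; the library may use a different one or another scheme.
EXAMPLE_NAMESPACE = uuid.NAMESPACE_URL


def example_chunk_uuid(file_id: str, chunkset_index: int) -> str:
    """Derive a stable UUID string from (file_id, chunkset_index)."""
    return str(uuid.uuid5(EXAMPLE_NAMESPACE, f"{file_id}:{chunkset_index}"))


first = example_chunk_uuid("report.pdf", 0)
again = example_chunk_uuid("report.pdf", 0)
other = example_chunk_uuid("report.pdf", 1)
print(first == again, first == other)  # True False
```

With deterministic IDs, upserting the same chunkset again would typically overwrite the existing Qdrant point instead of creating a duplicate.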

prepare_points_from_chunk_data

```python
prepare_points_from_chunk_data(
    chunk_data,
    *,
    store_chunk_details: bool = True,
) -> tuple[list[str], list[str], list[dict]]
```

Return (ids, documents, payloads) as plain values, before the documents are wrapped in models.Document(...) for embedding.
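The returned triple can be inspected before any upload. The sketch below assumes the three lists are index-aligned (entry i of each list describes the same point), which the signature suggests but does not state. All values, including the payload keys, are stand-ins with the documented types, and to_records is a hypothetical helper.

```python
# Stand-in values with the documented shape; real values would come from
# prepare_points_from_chunk_data(chunk_data).
ids = ["id-0", "id-1"]
documents = ["first chunkset text", "second chunkset text"]
payloads = [{"file_id": "f1"}, {"file_id": "f1"}]  # hypothetical payload keys


def to_records(ids, documents, payloads):
    """Pair up the three parallel lists, failing loudly on a length mismatch."""
    if not (len(ids) == len(documents) == len(payloads)):
        raise ValueError("ids, documents, and payloads must have equal length")
    return [
        {"id": i, "document": d, "payload": p}
        for i, d, p in zip(ids, documents, payloads)
    ]


records = to_records(ids, documents, payloads)
print(len(records), records[0]["id"])  # 2 id-0
```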

points_from_chunk_data

```python
points_from_chunk_data(
    chunk_data,
    *,
    dense_model: str,
    sparse_model: str | None = None,
    dense_name: str = "dense",
    sparse_name: str = "sparse",
    dense_options: dict | None = None,
    sparse_options: dict | None = None,
    store_chunk_details: bool = True,
) -> list[qmodels.PointStruct]
```

Turn chunk data into Qdrant PointStruct values with dense and optional sparse models.Document(...) vectors.

results_to_cheatsheet_inputs

```python
results_to_cheatsheet_inputs(results) -> tuple[list[dict], list[dict]]
```

Extract relevant_chunksets and all_chunks from Qdrant hits or a QueryResponse.

cheatsheets_from_results

```python
cheatsheets_from_results(
    results,
    *,
    chunk_data = None,
) -> list[dict]
```

Convert Qdrant hits directly into POMA cheatsheets.

ensure_collection

```python
ensure_collection(
    client,
    name: str,
    *,
    dense: tuple[str, int] | dict | None = None,
    sparse: str | None = None,
    distance: qmodels.Distance = qmodels.Distance.COSINE,
) -> None
```

Create a collection if it does not exist yet. If it already exists, the helper reuses it and emits a warning with the current vector configuration when possible.
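The create-or-reuse behavior can be illustrated with an in-memory stand-in for the client. This is a sketch of the pattern, not the helper's implementation: FakeClient, its simplified create_collection(name, config) signature, and the warning text are all assumptions made for the example.

```python
import warnings


class FakeClient:
    """In-memory stand-in for a Qdrant client, used only in this sketch."""

    def __init__(self):
        self.collections = {}

    def collection_exists(self, name):
        return name in self.collections

    def create_collection(self, name, config):
        self.collections[name] = config


def ensure_collection_sketch(client, name, config):
    """Create the collection once; on later calls reuse it and warn."""
    if client.collection_exists(name):
        current = client.collections[name]
        warnings.warn(f"Collection {name!r} already exists; reusing {current!r}")
        return
    client.create_collection(name, config)


client = FakeClient()
ensure_collection_sketch(client, "docs", {"dense": 384})  # creates the collection
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ensure_collection_sketch(client, "docs", {"dense": 768})  # reuses and warns
print(len(caught), client.collections["docs"])  # 1 {'dense': 384}
```

Note that in this sketch the second call does not change the existing configuration; the warning is the only signal that the requested and current settings may differ.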