Multimodal orchestration for LLM APIs

You describe what to analyze. Pollux handles source patterns, context caching, and multimodal complexity — so you don’t.

import asyncio

from pollux import run, Config, Source

async def main() -> None:
    # One call: Pollux uploads the PDF and queries the model for you.
    result = await run(
        "Summarize the key findings",
        source=Source.from_file("paper.pdf"),
        config=Config(
            provider="gemini",
            model="gemini-2.5-flash-lite",
        ),
    )
    print(result["answers"][0])

asyncio.run(main())

Multimodal-first

PDFs, images, video, YouTube, arXiv. One interface, any source type.
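A sketch of constructing each source type through that one interface. Only Source.from_file is confirmed by the quick-start above; the other constructor names (from_youtube, from_arxiv) are assumptions based on the source types listed, so check the API reference for the exact spellings.

from pollux import Source

# Confirmed by the quick-start: local files (PDFs, images, video).
paper = Source.from_file("paper.pdf")
chart = Source.from_file("figure.png")

# Hypothetical constructors for the remote source types named above;
# these names are assumptions, not confirmed API.
talk = Source.from_youtube("https://www.youtube.com/watch?v=VIDEO_ID")
preprint = Source.from_arxiv("ARXIV_ID")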

Source patterns

Fan-out, fan-in, and broadcast execution over your content — no boilerplate.
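As a sketch of the fan-out pattern: several prompts run against one source, one answer per prompt. This assumes run() accepts a list of prompts, which fits the answers list in the quick-start but is not confirmed here.

import asyncio

from pollux import run, Config, Source

async def fan_out() -> None:
    # Fan-out: each prompt runs against the same uploaded source.
    # Assumption: run() accepts a list of prompts and returns one
    # entry in result["answers"] per prompt.
    result = await run(
        ["Summarize the key findings", "List the stated limitations"],
        source=Source.from_file("paper.pdf"),
        config=Config(provider="gemini", model="gemini-2.5-flash-lite"),
    )
    for answer in result["answers"]:
        print(answer)

asyncio.run(fan_out())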

Context caching

Upload once, reuse across prompts. Automatic TTL management saves tokens and money.
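A sketch of what reuse could look like, assuming a reused Source object hits the cache on the second call; since the cache and TTL handling are internal, nothing extra should be needed at the call site.

import asyncio

from pollux import run, Config, Source

async def cached_reuse() -> None:
    config = Config(provider="gemini", model="gemini-2.5-flash-lite")
    paper = Source.from_file("paper.pdf")

    # First call uploads the PDF; assumption: Pollux caches the upload
    # and manages its TTL internally.
    first = await run("Summarize the key findings", source=paper, config=config)

    # Second call reuses the cached upload instead of re-sending bytes,
    # so only the new prompt costs tokens.
    second = await run("What methods were used?", source=paper, config=config)

    print(first["answers"][0])
    print(second["answers"][0])

asyncio.run(cached_reuse())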

Built for reliability

Async pipeline, retries with backoff, structured output, usage tracking.
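Pollux handles retries internally per the list above. As an illustration of the retry-with-backoff technique itself, here is a generic caller-side wrapper (not Pollux API) that you could layer over any awaitable.

import asyncio
import random

async def with_backoff(make_call, attempts=3, base_delay=1.0):
    # Generic exponential backoff with jitter; purely illustrative,
    # independent of Pollux's built-in retry policy.
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts:
                raise
            # Wait longer after each failure before retrying.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1) + random.random())

# Usage: await with_backoff(lambda: some_async_call())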

pip install pollux-ai