Private beta · v0.4.1

The orchestration
runtime for AI workflows

PlexRun is the infrastructure layer for building, deploying, and scaling AI-powered automation pipelines. Run multi-agent systems, manage workflow state, and observe execution — without managing infrastructure.

Request early access Read the docs

Serverless by default · Open source SDKs · SOC 2 planned

Sub-second

Workflow cold starts

Per-step

Execution tracing

Git-native

Version & rollback

Works with the tools and models your team already uses

OpenAI

Anthropic

LangChain

Kubernetes

Docker

PostgreSQL

Redis

Next.js

Python

TypeScript

Terraform

OpenTelemetry

Early access — accepting design partners

⬡

Built for production

Not a prototype. Every design decision — retries, idempotency, state isolation — is made for production workloads, not demos.

◈

Developer-native

Python SDK, CLI, and YAML-first. Define workflows in code, version them in Git, deploy with one command.

◎

Full observability

Per-step execution traces, token usage, latency, and cost attribution. Every failure is debuggable in seconds, not hours.

Who it's built for

ML Engineers

Infrastructure

Tired of writing retry logic and queue management from scratch for every new pipeline.

Backend Engineers

Observability

Deploying AI agents to production means debugging across five different tools with no end-to-end execution trace.

Engineering Leaders

Reliability

Agent pipelines that work in demos fail in production. The gap is always orchestration, not the model.

The problem

AI in production is
an infrastructure problem.

Most AI workflows die between the demo and production. The model works. The orchestration doesn't. These are the four problems every team hits.

Fragmented orchestration

Engineering debt

Teams stitch together LLMs, queues, schedulers, and APIs into brittle pipelines that fail in production. Every new model or tool means more glue code.

Scaling agents is hard

Infrastructure ceiling

Multi-agent systems hit infrastructure limits fast — concurrency, shared state, retries, and backpressure are unsolved. You either over-provision or things break.

Zero observability

Operational pain

When an AI workflow fails at 2am, debugging means grepping scattered logs across five tools with no execution trace. Mean time to fix is hours, not minutes.

Reliability is an afterthought

Production incidents

LLM calls fail. APIs time out. Queues back up. Most teams have no retry policy, no idempotency, and no recovery path — until they're paged at midnight.

How it works

From idea to production
in three steps.

PlexRun abstracts infrastructure complexity so your team can focus on building workflows, not managing servers.

Step 01

Define your workflow

Describe your AI pipeline in Python or YAML. Declare steps, models, branching logic, and retry policies.

steps:
  - name: extract
    model: claude-3-sonnet
    retry: 3
  - name: classify
    fan_out: true

Step 02

Deploy in seconds

One command. PlexRun provisions compute, queues, and state stores automatically.

$ plexrun deploy
Packaging workflow...
Provisioning workers...
✓ Deployed in 2.1s
https://app.plexrun.com/w/doc-pipeline

Step 03

Monitor and iterate

Every execution is fully traced. Debug failures, inspect token usage, replay steps.

extract 142ms
classify 98ms
index 61ms

Platform

Infrastructure for AI workflows — solved.

Everything you need to ship AI automation to production. No glue code, no hand-rolled retry logic, no custom orchestration layer.

Workflow Engine

Core

DAG-based execution with native support for branching, fan-out, fan-in, and conditional routing. Define complex pipelines in Python or YAML — no state management required.

workflow.py

@workflow("document-pipeline")
def process(doc):
    chunks = split(doc)          # fan-out
    results = embed.map(chunks)  # parallel
    return index(results)        # fan-in
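The fan-out and fan-in pattern in the snippet above can be illustrated with the standard library alone. This is a minimal sketch, not PlexRun's implementation: `split`, `embed`, and `process` here are hypothetical stand-ins, and `embed` simply returns the chunk length instead of calling a model.

```python
import asyncio


def split(doc: str, size: int = 4) -> list[str]:
    """Fan-out: cut the document into fixed-size chunks."""
    return [doc[i:i + size] for i in range(0, len(doc), size)]


async def embed(chunk: str) -> int:
    """Hypothetical stand-in for an embedding call; returns the chunk length."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return len(chunk)


async def process(doc: str) -> int:
    chunks = split(doc)
    # Run one embed task per chunk concurrently; gather preserves input order.
    results = await asyncio.gather(*(embed(c) for c in chunks))
    # Fan-in: combine the per-chunk results into a single value.
    return sum(results)


total = asyncio.run(process("hello world"))
```

A managed runtime would presumably distribute the per-chunk work across workers rather than across coroutines in one process, but the shape of the data flow is the same.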

Multi-Agent Orchestration

Run coordinated agent swarms with shared state, message passing, and supervised execution.

Distributed Execution

Auto-scaling worker pools across regions. Sub-second cold starts on serverless runtime.

Real-Time Observability

Per-step traces, token usage, latency, and cost. Replay any execution. Export to OpenTelemetry.
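To make per-step tracing and cost attribution concrete, here is a small standard-library sketch. The `Trace` and `Span` classes and the flat `price_per_1k_tokens` rate are assumptions for illustration only; they are not part of the PlexRun SDK, and real token pricing varies by model.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    step: str
    duration_ms: float = 0.0
    tokens: int = 0


@dataclass
class Trace:
    price_per_1k_tokens: float  # hypothetical flat rate for the whole run
    spans: list = field(default_factory=list)

    @contextmanager
    def step(self, name: str):
        """Time one step and record its span, even if the step raises."""
        span = Span(step=name)
        start = time.perf_counter()
        try:
            yield span
        finally:
            span.duration_ms = (time.perf_counter() - start) * 1000
            self.spans.append(span)

    def cost(self) -> dict:
        """Attribute cost to each step from its recorded token count."""
        return {s.step: s.tokens / 1000 * self.price_per_1k_tokens for s in self.spans}


trace = Trace(price_per_1k_tokens=0.5)
with trace.step("extract") as span:
    span.tokens = 2000  # in practice, read from the model response
with trace.step("classify") as span:
    span.tokens = 500

costs = trace.cost()
```

Spans recorded this way map naturally onto OpenTelemetry spans, with token counts and cost carried as span attributes.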

Retries & Idempotency

Exponential backoff, circuit breakers, dead-letter queues, and idempotent step keys built in.
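As a rough illustration of how exponential backoff and idempotent step keys fit together, consider the sketch below. `StepRunner` and the key format `run-42:extract` are hypothetical names, not PlexRun APIs; the sleep function is injected so the example runs instantly.

```python
import time


class StepRunner:
    """Retries a step with exponential backoff and caches results by idempotency key."""

    def __init__(self, max_attempts=3, base_delay=0.5, sleep=time.sleep):
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.sleep = sleep
        self.results = {}  # idempotency key -> stored result

    def run(self, key, fn):
        # A step that already succeeded under this key is never executed again.
        if key in self.results:
            return self.results[key]
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = fn()
            except Exception:
                if attempt == self.max_attempts:
                    raise  # out of attempts; surface the last error
                # The delay doubles after each failure: base, 2*base, 4*base, ...
                self.sleep(self.base_delay * 2 ** (attempt - 1))
            else:
                self.results[key] = result
                return result


calls = {"n": 0}


def flaky_llm_call():
    """Simulated model call that times out twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated upstream timeout")
    return "summary"


delays = []
runner = StepRunner(sleep=delays.append)  # record delays instead of sleeping
first = runner.run("run-42:extract", flaky_llm_call)
second = runner.run("run-42:extract", flaky_llm_call)  # served from the cache
```

The second call returns the stored result without invoking the model again, which is the property that makes retrying a whole workflow safe after a partial failure. A production version would persist results in a durable store and typically add jitter to the delays.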

Event-Driven Triggers

Schedule, webhook, queue, or file-based triggers. React to any upstream system in real time.

Versioning & Rollback

Immutable workflow deployments. Promote between environments. One-command rollback.
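One common way to get immutable deployments with cheap rollback is to address each workflow definition by a content hash and keep a per-environment history of hashes. The sketch below assumes that approach; `DeploymentRegistry` and its methods are hypothetical and say nothing about how PlexRun stores versions internally.

```python
import hashlib
import json


class DeploymentRegistry:
    def __init__(self):
        self.versions = {}  # digest -> frozen definition (JSON string)
        self.history = {}   # environment -> list of digests, newest last

    def deploy(self, env, definition):
        """Freeze the definition, store it under its hash, and activate it in env."""
        frozen = json.dumps(definition, sort_keys=True)
        digest = hashlib.sha256(frozen.encode()).hexdigest()[:12]
        self.versions.setdefault(digest, frozen)  # stored versions are never overwritten
        self.history.setdefault(env, []).append(digest)
        return digest

    def promote(self, source, target):
        """Activate the source environment's current version in the target."""
        digest = self.history[source][-1]
        self.history.setdefault(target, []).append(digest)
        return digest

    def rollback(self, env):
        """Drop the newest deployment in env and return the restored digest."""
        if len(self.history[env]) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        self.history[env].pop()
        return self.history[env][-1]

    def active(self, env):
        return json.loads(self.versions[self.history[env][-1]])


registry = DeploymentRegistry()
v1 = registry.deploy("staging", {"steps": ["extract"]})
v2 = registry.deploy("staging", {"steps": ["extract", "classify"]})
registry.promote("staging", "production")
restored = registry.rollback("staging")
```

Because versions are keyed by content, promoting or rolling back only moves a pointer; the definition that ran in staging is byte-for-byte the one that runs in production.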

Secure by Default

VPC isolation, encrypted state, audit logs, and least-privilege execution by default. RBAC and SSO on Enterprise.

API-First

Python and TypeScript SDKs. REST API. CLI. Webhooks. Integrate with anything.

Architecture

Production infrastructure, out of the box.

PlexRun runs on AWS-native infrastructure — purpose-built for AI workloads. Serverless. Auto-scaling. Multi-region. Observable.

PlexRun · AWS us-east-1 · infra.yaml

Ingestion Layer
Client SDK: Python · TS · CLI
API Gateway: AWS API Gateway
Lambda Orchestrator: AWS Lambda

Orchestration Layer
Step Functions: AWS Step Functions
Event Queue: Amazon SQS + SNS
State Manager: DynamoDB Streams

AI Execution Layer
Bedrock LLMs: Claude · Titan · Llama
Agent Runtime: Multi-agent pool
ECS Workers: AWS Fargate

Storage & Observability
DynamoDB: Workflow state
S3 Storage: Artifacts · Logs
CloudWatch: Metrics · Traces

Compute: Lambda · ECS · Fargate
AI / ML: Bedrock · SageMaker
Storage: S3 · DynamoDB
Observability: CloudWatch · X-Ray

Use cases

Built for teams shipping real AI.

All features

Support

Customer Support Automation

Triage tickets, route to specialists, generate first-pass replies, and escalate edge cases — all in a single observable workflow.

Documents

Document Processing

Extract, classify, validate, and route documents through multi-stage AI pipelines with full audit trails.

Research

Research & Enrichment

Crawl, synthesize, and structure unstructured data using coordinated agent workflows at scale.

Agents

Multi-Agent Systems

Run agent teams with shared memory, role specialization, and supervised execution across complex task graphs.

Data

Data Transformation

ETL pipelines with AI-native steps. Classify, summarize, and enrich millions of records without hand-rolled retry logic.

Automation

Internal Tool Automation

Replace fragile Zapier flows with reliable, observable AI-powered automations your on-call can actually debug.

How we compare

Built for AI workflows.
Not adapted from them.

Existing orchestration tools were built for data pipelines and microservices. PlexRun is the first orchestration runtime designed from the ground up for LLM-native, multi-agent workloads.

Feature	PlexRun (ours)	Step Functions	Prefect	Temporal

AI-native step model: Designed for LLM calls, not generic tasks
Per-step token tracking: Token usage + cost attributed per step, per run
Idempotent retries: Retry without re-calling the LLM on success
Zero infra to manage: Serverless by default, no fleet to operate
Python-native SDK: Define workflows with decorators, no DSL
Battle-tested at scale: Proven in large-scale production deployments
Built-in observability: Execution traces without extra instrumentation
Multi-model routing: Route steps to different LLMs independently

Based on publicly available documentation as of 2026. Feature parity may vary by version.

Accepting design partners

Build with us.
Shape the roadmap.

We're working with a small cohort of engineering teams who are actively building AI-powered products. Design partners get direct access to the team, free platform usage during beta, and a permanent seat at the table on product direction.

We're looking for teams building real AI products in production — not side projects. Ideal partners have an existing orchestration pain point they need solved in the next 30–90 days.

Apply as design partner Join waitlist instead

Direct engineering access

Weekly 1:1 with the founding team. Your use case shapes the product roadmap directly.

Free platform usage

Full platform access at no cost throughout private beta. No credit card, no commitments.

Custom integrations

We'll build the connectors, triggers, or SDK methods your stack needs — on request.

Early adopter pricing

Lock in founding-customer rates that never increase as you scale to production.

Developer experience

Code-first. Infrastructure invisible.

Define workflows in Python, TypeScript, or YAML. Deploy with one command. PlexRun handles the rest — provisioning, scaling, retries, observability.

Type-safe SDKs for Python & TypeScript
Declarative YAML for GitOps workflows
Local dev mode with hot reload
Built-in tracing and replay debugging
Native integrations: OpenAI, Bedrock, Anthropic, Cohere

from plexrun import Workflow, step, agent
from plexrun.aws import bedrock

workflow = Workflow("document-pipeline", runtime="aws-lambda")

@step(retry=3, timeout=60, queue="docs-input")
async def extract_content(document: bytes) -> dict:
    """Extract structured content via Bedrock."""
    return await bedrock.invoke(
        model="anthropic.claude-3-sonnet",
        prompt=f"Extract key fields: {document}",
    )

@step(depends_on=["extract_content"], timeout=30)
async def classify_intent(content: dict) -> str:
    classifier = agent("intent-classifier", model="claude-3-haiku")
    return await classifier.run(content)

@step(depends_on=["classify_intent"])
async def route(intent: str, content: dict) -> None:
    await queue.publish(f"workflow.{intent}", content)

workflow.deploy(region="us-east-1", concurrency=100)
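The `depends_on` declarations above describe a dependency graph, and a scheduler has to turn that graph into a valid execution order. One way to do this, shown here as a sketch rather than as PlexRun's actual scheduler, is Python's standard `graphlib` module. The `execution_order` helper is a hypothetical name; the step names are taken from the example above.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(depends_on: dict) -> list:
    """Return step names ordered so every step runs after its dependencies."""
    try:
        # TopologicalSorter expects a mapping of node -> its predecessors.
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"workflow has a dependency cycle: {exc.args[1]}") from exc


order = execution_order({
    "extract_content": [],
    "classify_intent": ["extract_content"],
    "route": ["classify_intent"],
})
```

Rejecting cycles when the workflow is defined, rather than at run time, means a misconfigured workflow fails before anything is deployed.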

Community

Stay in the loop.

We build in public. Follow along on GitHub or read the changelog — we ship every two weeks.

GitHub

github.com/Rahull-Mishra/plexrun-sdk

Source code, SDK, CLI, examples, and issue tracking. Star to follow updates.

View on GitHub

Changelog

plexrun.com/changelog

What shipped, what's next, and technical notes from the team. Updated every two weeks.

Read the changelog

We build in public · Shipping every 2 weeks

Pricing

Pay for execution. Not infrastructure.

Usage-based pricing. No seats. No infrastructure surprises.

Free

$0 forever

For solo builders and prototypes.

Start free

10,000 workflow runs / month
1 concurrent worker
Community support
7-day execution history
Open-source SDKs

Recommended

Pro

$49 per month

For production AI workloads.

Request early access

1M workflow runs / month
10 concurrent workers
Priority email support
90-day execution history
Custom domains & webhooks
Multi-region deployment
Advanced observability

Enterprise

Custom, annual

For scale, compliance, and SLAs.

Contact sales

Unlimited workflow runs
Dedicated infrastructure
VPC peering & PrivateLink
Custom retention & audit logs
Compliance roadmap (SOC 2 planned pre-GA)
Uptime SLA on request
Dedicated solutions engineer

All plans include open-source SDKs, public docs, and community support.

Early access

Get early access.
Shape the product.

Join the waitlist for private beta access, release notes, and a direct line to the team. No spam — ever.

Free to start · No credit card · Cancel anytime

Private beta, limited spots
GDPR-aligned data handling
Built in public · Shipping every 2 weeks

FAQ

Common questions

Can't find what you're looking for? Contact our team.

LangChain provides primitives for building LLM apps. PlexRun provides the runtime to execute them at production scale — distributed workers, retries, queuing, state management, and observability. We compose well together: many users build with LangChain and deploy on PlexRun.
