Open beta · v0.4.0

The orchestration
runtime for AI workflows

PlexRun is the infrastructure layer for building, deploying, and scaling AI-powered automation pipelines. Run multi-agent systems, manage workflow state, and observe execution — without managing infrastructure.

Serverless by default · Open source SDKs · SOC 2 in progress
plexrun · ~/customer-triage
<1ms
Orchestration overhead
99.95%
Execution reliability
10M+
Steps / day capacity

Works with the tools and models your team already uses

OpenAI
Anthropic
LangChain
Kubernetes
Docker
PostgreSQL
Redis
Next.js
Python
TypeScript
Terraform
OpenTelemetry
Early access — accepting design partners

Built for production

Not a prototype. Every design decision — retries, idempotency, state isolation — is made for production workloads, not demos.

Developer-native

Python SDK, CLI, and YAML-first. Define workflows in code, version them in Git, deploy with one command.

Full observability

Per-step execution traces, token usage, latency, and cost attribution. Every failure is debuggable in seconds, not hours.

Who it's built for

ML Engineers

Infrastructure

Tired of writing retry logic and queue management from scratch for every new pipeline.

Backend Engineers

Observability

Deploying AI agents to production means debugging across five different tools with no execution trace.

Engineering Leaders

Reliability

Agent pipelines that work in demos fail in production. The gap is always orchestration, not the model.

The problem

AI in production is
an infrastructure problem.

Most AI workflows die between the demo and production. The model works. The orchestration doesn't. These are the four problems every team hits.

01

Fragmented orchestration

Engineering debt

Teams stitch together LLMs, queues, schedulers, and APIs into brittle pipelines that fail in production. Every new model or tool means more glue code.

02

Scaling agents is hard

Infrastructure ceiling

Multi-agent systems hit infrastructure limits fast — concurrency, shared state, retries, and backpressure are unsolved. You either over-provision or things break.

03

Zero observability

Operational pain

When an AI workflow fails at 2am, debugging means grepping scattered logs across five tools with no execution trace. Mean time to fix is hours, not minutes.

04

Reliability is an afterthought

Production incidents

LLM calls fail. APIs time out. Queues back up. Most teams have no retry policy, no idempotency, and no recovery path — until they're paged at midnight.

How it works

From idea to production
in three steps.

PlexRun abstracts infrastructure complexity so your team can focus on building workflows, not managing servers.

Step 01

Define your workflow

Describe your AI pipeline in Python or YAML. Declare steps, models, branching logic, and retry policies.

steps:
  - name: extract
    model: claude-3-sonnet
    retry: 3
  - name: classify
    fan_out: true
Step 02

Deploy in seconds

One command. PlexRun provisions compute, queues, and state stores automatically.

$ plexrun deploy
Packaging workflow...
Provisioning workers...
✓ Deployed in 2.1s
https://app.plexrun.com/w/doc-pipeline
Step 03

Monitor and iterate

Every execution is fully traced. Debug failures, inspect token usage, replay steps.

extract · 142ms
classify · 98ms
index · 61ms

Platform

Infrastructure for AI workflows — solved.

Everything you need to ship AI automation to production. No glue code, no hand-rolled retry logic, no custom orchestration layer.

Workflow Engine

Core

DAG-based execution with native support for branching, fan-out, fan-in, and conditional routing. Define complex pipelines in Python or YAML — no state management required.

workflow.py
@workflow("document-pipeline")
def process(doc):
    chunks = split(doc)            # fan-out
    results = embed.map(chunks)    # parallel
    return index(results)          # fan-in
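To make the fan-out / fan-in shape concrete without the SDK, here is a minimal sketch in plain asyncio. It is not PlexRun's implementation: `split`, `embed`, and `process` only mirror the names in the example above, `embed` is a stand-in for a model call, and the `max_parallel` cap is an assumed way to bound concurrency.

```python
import asyncio


def split(doc: str, size: int = 4) -> list[str]:
    """Fan-out: break a document into fixed-size chunks."""
    return [doc[i:i + size] for i in range(0, len(doc), size)]


async def embed(chunk: str) -> int:
    """Stand-in for a model call; returns the chunk length as a fake result."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return len(chunk)


async def process(doc: str, max_parallel: int = 2) -> int:
    """Run embed() over all chunks in parallel, then combine the results."""
    limit = asyncio.Semaphore(max_parallel)  # cap the number of in-flight calls

    async def bounded(chunk: str) -> int:
        async with limit:
            return await embed(chunk)

    # gather() preserves input order, so results line up with the chunks
    results = await asyncio.gather(*(bounded(c) for c in split(doc)))
    return sum(results)  # fan-in: reduce the per-chunk results to one value
```

Calling `asyncio.run(process("some document text"))` runs the whole pipeline; the semaphore is what keeps a large document from launching every call at once.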

Multi-Agent Orchestration

Run coordinated agent swarms with shared state, message passing, and supervised execution.
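As a rough illustration of shared state and message passing, the sketch below runs a small pool of workers that pull tasks from one queue and write into one shared dictionary. Everything here is hypothetical (`worker`, `supervise`, the `None` shutdown signal); it shows the pattern, not the PlexRun agent runtime.

```python
import asyncio


async def worker(name: str, inbox: asyncio.Queue, shared: dict) -> None:
    """Consume tasks until a None sentinel arrives, recording results in shared state."""
    while True:
        task = await inbox.get()
        if task is None:  # sentinel: the supervisor is shutting this agent down
            break
        shared[task] = f"{name} handled {task}"


async def supervise(tasks: list[str], n_agents: int = 2) -> dict:
    """Start n_agents workers, feed them tasks, and wait for all to finish."""
    inbox: asyncio.Queue = asyncio.Queue()
    shared: dict = {}
    agents = [
        asyncio.create_task(worker(f"agent-{i}", inbox, shared))
        for i in range(n_agents)
    ]
    for task in tasks:
        inbox.put_nowait(task)
    for _ in agents:
        inbox.put_nowait(None)  # one sentinel per agent
    await asyncio.gather(*agents)
    return shared
```

A single queue keeps the supervisor in control of what each agent sees, and the sentinel gives it a clean way to stop the pool.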

Distributed Execution

Auto-scaling worker pools across regions. Sub-second cold starts on serverless runtime.

Real-Time Observability

Per-step traces, token usage, latency, and cost. Replay any execution. Export to OpenTelemetry.
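One way to picture per-step cost attribution is a trace record per step that carries the model, token count, and latency. The sketch below is an assumption-laden illustration: the class names, the `model-a` / `model-b` identifiers, and the prices are all made up, and it does not reflect PlexRun's trace format.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

# Hypothetical prices in USD per 1,000 tokens; not real vendor pricing.
PRICE_PER_1K = {"model-a": 0.003, "model-b": 0.00025}


@dataclass
class StepTrace:
    name: str
    model: str
    tokens: int = 0
    latency_ms: float = 0.0

    @property
    def cost_usd(self) -> float:
        """Cost attributed to this step alone."""
        return self.tokens / 1000 * PRICE_PER_1K[self.model]


@dataclass
class RunTrace:
    steps: list[StepTrace] = field(default_factory=list)

    @contextmanager
    def step(self, name: str, model: str):
        """Time the enclosed block and record it as one step of the run."""
        trace = StepTrace(name, model)
        start = time.perf_counter()
        try:
            yield trace
        finally:
            # Recorded even if the step raises, so failures are still traced
            trace.latency_ms = (time.perf_counter() - start) * 1000
            self.steps.append(trace)

    def total_cost(self) -> float:
        return sum(s.cost_usd for s in self.steps)
```

Usage is `with run.step("extract", "model-a") as t: t.tokens = ...`, after which `run.steps` holds one record per step and `run.total_cost()` sums them.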

Retries & Idempotency

Exponential backoff, circuit breakers, dead-letter queues, and idempotent step keys built in.
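For readers who want to see what exponential backoff with an idempotent step key looks like, here is a minimal sketch. It assumes an in-memory result store and hypothetical helpers (`step_key`, `run_with_retry`); a real runtime would persist results and route exhausted retries to a dead-letter queue, neither of which is shown.

```python
import hashlib
import json
import random
import time

# Hypothetical in-memory result store; a real runtime would persist this.
_results: dict[str, object] = {}


def step_key(run_id: str, step_name: str, payload: dict) -> str:
    """Derive a stable key from the run, the step, and its input."""
    body = json.dumps(payload, sort_keys=True)  # key order must not change the key
    return hashlib.sha256(f"{run_id}:{step_name}:{body}".encode()).hexdigest()


def run_with_retry(key, fn, *, attempts=3, base_delay=0.5, max_delay=8.0,
                   sleep=time.sleep):
    """Call fn() up to `attempts` times, backing off exponentially with jitter."""
    if key in _results:  # step already succeeded: return it without calling fn
        return _results[key]
    for attempt in range(attempts):
        try:
            result = fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads out retries
        else:
            _results[key] = result
            return result
```

Because the key is derived from the step's input, replaying a run that already succeeded returns the stored result instead of repeating the model call.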

Event-Driven Triggers

Schedule, webhook, queue, or file-based triggers. React to any upstream system in real time.

Versioning & Rollback

Immutable workflow deployments. Promote between environments. One-command rollback.
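A simple mental model for immutable deployments is content addressing: the version ID is a hash of the workflow definition, and rollback only moves a pointer. The sketch below assumes that model; `DeploymentHistory` and its methods are hypothetical and say nothing about how PlexRun stores versions.

```python
import hashlib
from typing import Optional


class DeploymentHistory:
    """Tracks immutable, content-addressed versions and which one is active."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}  # version id -> workflow definition
        self._history: list[str] = []        # activation order, newest last

    def deploy(self, definition: str) -> str:
        """Register a definition and make it active; same content, same id."""
        version = hashlib.sha256(definition.encode()).hexdigest()[:12]
        self._versions.setdefault(version, definition)  # never overwritten
        self._history.append(version)
        return version

    @property
    def active(self) -> Optional[str]:
        return self._history[-1] if self._history else None

    def rollback(self) -> str:
        """Reactivate the previously active version."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        self._history.pop()
        return self._history[-1]
```

Since stored definitions are never modified, rolling back is a cheap pointer change rather than a rebuild.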

Secure by Default

VPC isolation, encrypted state, audit logs, and least-privilege execution by default. RBAC and SSO on Enterprise.

API-First

Python and TypeScript SDKs. REST API. CLI. Webhooks. Integrate with anything.

Architecture

Production infrastructure, out of the box.

PlexRun runs on AWS-native infrastructure — purpose-built for AI workloads. Serverless. Auto-scaling. Multi-region. Observable.

PlexRun · AWS us-east-1
infra.yaml

Ingestion Layer
  • Client SDK: Python · TS · CLI
  • API Gateway: AWS API Gateway
  • Lambda Orchestrator: AWS Lambda
Orchestration Layer
  • Step Functions: AWS Step Functions
  • Event Queue: Amazon SQS + SNS
  • State Manager: DynamoDB Streams
AI Execution Layer
  • Bedrock LLMs: Claude · Titan · Llama
  • Agent Runtime: Multi-agent pool
  • ECS Workers: AWS Fargate
Storage & Observability
  • DynamoDB: Workflow state
  • S3 Storage: Artifacts · Logs
  • CloudWatch: Metrics · Traces

Compute: Lambda · ECS · Fargate
AI / ML: Bedrock · SageMaker
Storage: S3 · DynamoDB
Observability: CloudWatch · X-Ray

Use cases

Built for teams shipping real AI.

Support
01

Customer Support Automation

Triage tickets, route to specialists, generate first-pass replies, and escalate edge cases — all in a single observable workflow.

Documents
02

Document Processing

Extract, classify, validate, and route documents through multi-stage AI pipelines with full audit trails.

Research
03

Research & Enrichment

Crawl, synthesize, and structure unstructured data using coordinated agent workflows at scale.

Agents
04

Multi-Agent Systems

Run agent teams with shared memory, role specialization, and supervised execution across complex task graphs.

Data
05

Data Transformation

ETL pipelines with AI-native steps. Classify, summarize, and enrich millions of records without hand-rolled retry logic.

Automation
06

Internal Tool Automation

Replace fragile Zapier flows with reliable, observable AI-powered automations your on-call can actually debug.

How we compare

Built for AI workflows.
Not adapted from them.

Existing orchestration tools were built for data pipelines and microservices. PlexRun is the first orchestration runtime designed from the ground up for LLM-native, multi-agent workloads.

Feature · PlexRun (ours) · Step Functions · Prefect · Temporal
  • AI-native step model: Designed for LLM calls, not generic tasks
  • Per-step token tracking: Token usage + cost attributed per step, per run
  • Idempotent retries: Retry without re-calling the LLM on success
  • Zero infra to manage: Serverless by default, no fleet to operate
  • Python-native SDK: Define workflows with decorators, no DSL
  • YAML workflow definition: Declarative pipelines with DAG support
  • Built-in observability: Execution traces without extra instrumentation
  • Multi-model support: Route steps to different LLMs independently

Based on publicly available documentation as of 2025. Feature parity may vary by version.

Accepting design partners

Build with us.
Shape the roadmap.

We're working with a small cohort of engineering teams who are actively building AI-powered products. Design partners get direct access to the team, free platform usage during beta, and a permanent seat at the table on product direction.

We're looking for teams building real AI products in production — not side projects. Ideal partners have an existing orchestration pain point they need solved in the next 30–90 days.

01

Direct engineering access

Weekly 1:1 with the founding team. Your use case shapes the product roadmap directly.

02

Free platform usage

Full platform access at no cost throughout private beta. No credit card, no commitments.

03

Custom integrations

We'll build the connectors, triggers, or SDK methods your stack needs — on request.

04

Early adopter pricing

Lock in founding-customer rates that never increase as you scale to production.

Developer experience

Code-first. Infrastructure invisible.

Define workflows in Python, TypeScript, or YAML. Deploy with one command. PlexRun handles the rest — provisioning, scaling, retries, observability.

  • Type-safe SDKs for Python & TypeScript
  • Declarative YAML for GitOps workflows
  • Local dev mode with hot reload
  • Built-in tracing and replay debugging
  • Native integrations: OpenAI, Bedrock, Anthropic, Cohere
# `queue` is assumed to be importable from plexrun; the snippet uses it in route()
from plexrun import Workflow, step, agent, queue
from plexrun.aws import bedrock

workflow = Workflow("document-pipeline", runtime="aws-lambda")

@step(retry=3, timeout=60, queue="docs-input")
async def extract_content(document: bytes) -> dict:
    """Extract structured content via Bedrock."""
    return await bedrock.invoke(
        model="anthropic.claude-3-sonnet",
        prompt=f"Extract key fields: {document}",
    )

@step(depends_on=["extract_content"], timeout=30)
async def classify_intent(content: dict) -> str:
    # Hand the extracted content to a named classifier agent
    classifier = agent("intent-classifier", model="claude-3-haiku")
    return await classifier.run(content)

@step(depends_on=["classify_intent"])
async def route(intent: str, content: dict) -> None:
    # Publish to a topic named after the classified intent
    await queue.publish(f"workflow.{intent}", content)

workflow.deploy(region="us-east-1", concurrency=100)
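The `depends_on` declarations above describe a dependency graph, and a runtime has to turn that graph into an execution order. As a hedged sketch of that idea (not PlexRun's scheduler), Python's standard-library `graphlib` can resolve the order; the `execution_order` helper is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(depends_on: dict[str, list[str]]) -> list[str]:
    """Return step names so that every step comes after its dependencies.

    `depends_on` maps each step to the steps it waits for.
    Raises graphlib.CycleError if the declarations form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


# The three steps from the example, expressed as a plain mapping
pipeline = {
    "extract_content": [],
    "classify_intent": ["extract_content"],
    "route": ["classify_intent"],
}
```

`execution_order(pipeline)` yields the steps in dependency order, and a circular declaration fails fast with `CycleError` instead of hanging at run time.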

Pricing

Pay for execution. Not infrastructure.

Usage-based pricing. No seats. No infrastructure surprises.

Free

$0 forever

For solo builders and prototypes.

Start free
  • 10,000 workflow runs / month
  • 1 concurrent worker
  • Community Slack support
  • 7-day execution history
  • Open-source SDKs
Most popular

Pro

$49 per month

For production AI workloads.

Request early access
  • 1M workflow runs / month
  • 10 concurrent workers
  • Priority email support
  • 90-day execution history
  • Custom domains & webhooks
  • Multi-region deployment
  • Advanced observability

Enterprise

Custom · annual

For scale, compliance, and SLAs.

Contact sales
  • Unlimited workflow runs
  • Dedicated infrastructure
  • VPC peering & PrivateLink
  • Custom retention & audit logs
  • Compliance roadmap (SOC 2 planned pre-GA)
  • Uptime SLA on request
  • Dedicated solutions engineer

All plans include open-source SDKs, public docs, and community support.

Early access

Get early access.
Shape the product.

Join the waitlist for private beta access, release notes, and a direct line to the team. No spam — ever.

Free to start · No credit card · Cancel anytime

Private beta, limited spots · GDPR-aligned data handling · Built in public · Shipping every 2 weeks

FAQ

Common questions

Can't find what you're looking for? Contact our team.

LangChain provides primitives for building LLM apps. PlexRun provides the runtime to execute them at production scale — distributed workers, retries, queuing, state management, and observability. We compose well together: many users build with LangChain and deploy on PlexRun.