Features

Everything you need to run AI in production.

PlexRun is an opinionated runtime, not a collection of utilities. Its features are designed to work together, so your team can ship reliable AI workflows without rebuilding the infrastructure.


Workflow Engine

Core

DAG-based execution with native support for branching, fan-out, fan-in, and conditional routing. Define complex pipelines in Python or YAML without managing execution state manually.

  • Directed acyclic graph (DAG) model for pipeline definition
  • Conditional branching based on step outputs
  • Fan-out to N parallel branches, fan-in with aggregation
  • Nested sub-workflows for modular composition
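
To make the DAG model concrete, here is a minimal sketch of dependency-ordered execution with fan-out, fan-in, and a conditional routing step. It uses only the Python standard library and is not the PlexRun SDK; `run_dag`, the step names, and the `steps`/`deps` dictionaries are assumptions made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_dag(steps, deps):
    """Run each step after its dependencies; independent steps run in parallel.

    steps: name -> callable taking a dict of earlier results
    deps:  name -> set of step names that must finish first
    """
    results = {}
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorter.get_ready()   # every step whose dependencies are done
            snapshot = dict(results)     # read-only view handed to this wave
            futures = {name: pool.submit(steps[name], snapshot) for name in ready}
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# Hypothetical pipeline: one source, two parallel branches, an aggregation, a router.
steps = {
    "split": lambda r: [1, 2, 3],
    "double": lambda r: [x * 2 for x in r["split"]],        # fan-out branch 1
    "square": lambda r: [x * x for x in r["split"]],        # fan-out branch 2
    "merge": lambda r: sum(r["double"]) + sum(r["square"]),  # fan-in with aggregation
    "route": lambda r: "large" if r["merge"] > 20 else "small",  # conditional routing
}
deps = {
    "split": set(),
    "double": {"split"},
    "square": {"split"},
    "merge": {"double", "square"},
    "route": {"merge"},
}

results = run_dag(steps, deps)
print(results["merge"], results["route"])  # prints: 26 large
```

The sketch runs steps in waves, so a step waits for its whole wave to finish; a production scheduler would typically release each step as soon as its own dependencies complete.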

Multi-Agent Orchestration

Core

Run coordinated agent swarms with shared state, message passing, and supervised execution. PlexRun handles agent lifecycle, inter-agent communication, and failure isolation.

  • Shared state store across agents in a workflow
  • Typed message passing between agent steps
  • Supervisor patterns with automatic re-spawning
  • Human-in-the-loop step types for approval flows
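
As a rough illustration of typed message passing, shared state, and a supervisor that re-spawns a failed agent, consider the sketch below. It is plain Python, not PlexRun's API: `Message`, `supervise`, `flaky_researcher`, and the `max_restarts` parameter are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A typed, immutable message passed between agent steps."""
    sender: str
    payload: str


def supervise(agent, message, state, max_restarts=2):
    """Call agent(message, state); if it raises, re-run it up to max_restarts times.

    Returns (reply, restarts_used). Re-raises once restarts are exhausted,
    so the failure stays isolated to this agent's supervisor.
    """
    restarts = 0
    while True:
        try:
            return agent(message, state), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


def flaky_researcher(message, state):
    """Hypothetical agent that crashes on its first two runs, then replies."""
    state["runs"] = state.get("runs", 0) + 1  # shared state survives restarts
    if state["runs"] < 3:
        raise RuntimeError("agent crashed")
    return Message(sender="researcher", payload=message.payload.upper())


shared_state = {}
reply, restarts = supervise(
    flaky_researcher, Message("planner", "find sources"), shared_state
)
print(reply.payload, restarts)  # prints: FIND SOURCES 2
```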

Distributed Execution

Infrastructure

Worker pools auto-scale across regions, and the serverless runtime offers sub-second cold starts. Run thousands of concurrent workflows without managing infrastructure.

  • Serverless-first with sub-second cold starts
  • Dedicated worker pools for Enterprise
  • Multi-region deployment (us-east-1, eu-west-1, ap-southeast-1)
  • Auto-scaling based on queue depth
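
Scaling on queue depth usually reduces to a small target calculation. The sketch below shows one common form; the function name, the `jobs_per_worker` ratio, and the bounds are assumptions for illustration and do not describe PlexRun's actual scaling policy.

```python
import math


def desired_workers(queue_depth, jobs_per_worker=10, min_workers=0, max_workers=100):
    """Return a worker count proportional to backlog, clamped to [min, max]."""
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    target = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, target))


print(desired_workers(0))     # prints: 0   (scale to zero when idle)
print(desired_workers(25))    # prints: 3   (round up so no job waits for capacity)
print(desired_workers(5000))  # prints: 100 (capped at max_workers)
```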

Real-Time Observability

Observability

Per-step traces, token usage, latency, and cost. Replay any execution from any point. Export spans and metrics to your existing observability stack via OpenTelemetry.

  • Per-step execution traces with timestamps
  • Token usage and cost attribution per step
  • Full execution replay from any checkpoint
  • OpenTelemetry export for spans, metrics, and logs
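
Per-step cost attribution can be pictured as a roll-up over trace records. The following sketch assumes a simple record shape and flat per-1,000-token prices; `StepSpan`, `cost_per_step`, and the prices used are hypothetical and are not PlexRun's trace schema or pricing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSpan:
    """One execution of one step, as a trace might record it."""
    step: str
    input_tokens: int
    output_tokens: int
    latency_ms: float


def cost_per_step(spans, usd_per_1k_input, usd_per_1k_output):
    """Sum token cost in USD for each step name across all of its spans."""
    totals = defaultdict(float)
    for span in spans:
        totals[span.step] += (
            span.input_tokens * usd_per_1k_input
            + span.output_tokens * usd_per_1k_output
        ) / 1000
    return dict(totals)


spans = [
    StepSpan("summarize", input_tokens=1000, output_tokens=500, latency_ms=820.0),
    StepSpan("classify", input_tokens=200, output_tokens=10, latency_ms=95.0),
    StepSpan("summarize", input_tokens=2000, output_tokens=1000, latency_ms=1500.0),
]
costs = cost_per_step(spans, usd_per_1k_input=0.5, usd_per_1k_output=1.5)
```

With these made-up prices, `costs["summarize"]` comes to about 3.75 USD and `costs["classify"]` to about 0.115 USD.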

Retries & Idempotency

Reliability

Exponential backoff, circuit breakers, dead-letter queues, and idempotent step keys are built in, guarding your workflows against duplicate execution and cascading failures.

  • Configurable retry policies per step
  • Exponential backoff with jitter
  • Circuit breaker patterns for downstream services
  • Dead-letter queues for unrecoverable failures
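
Two of these mechanisms are easy to sketch in a few lines: retries with exponential backoff and jitter, and idempotency keys that prevent a step from running twice. This is a generic illustration, not PlexRun's implementation; `retry`, `run_once`, and their parameters are assumed names, and the in-memory dictionary stands in for whatever durable store a real runtime would use.

```python
import random
import time


def retry(fn, attempts=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call fn until it succeeds, sleeping a jittered, exponentially growing delay.

    The delay before retry n is uniform in [0, min(cap, base * 2**n)]
    (the "full jitter" variant). The last failure is re-raised.
    """
    for n in range(attempts):
        try:
            return fn()
        except Exception:
            if n == attempts - 1:
                raise  # out of attempts; a real system might dead-letter here
            sleep(random.uniform(0, min(cap, base * 2 ** n)))


_completed = {}  # idempotency key -> stored result (in-memory stand-in)


def run_once(key, fn):
    """Run fn at most once per key; later calls return the stored result."""
    if key not in _completed:
        _completed[key] = fn()
    return _completed[key]
```

Combining the two, for example `retry(lambda: run_once("order-42:charge", charge))` with a hypothetical `charge` function, means a retry that arrives after a success returns the stored result instead of repeating the side effect.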

Event-Driven Triggers

Integrations

Trigger workflows from webhooks, cron schedules, queue messages, or API calls. React to events from any upstream system in real time.

  • Webhook triggers with signature verification
  • Cron and interval-based scheduling
  • SQS / Kafka message queue integration
  • REST API trigger with sync or async response
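
Webhook signature verification is typically an HMAC over the raw request body. The sketch below shows the general technique with Python's standard `hmac` module; the choice of SHA-256 with a hex digest, and the function names, are assumptions, since the page does not specify PlexRun's signing scheme.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    # Compare as bytes so an unexpected non-ASCII header cannot raise TypeError.
    return hmac.compare_digest(expected.encode(), signature.encode())


secret = b"example-shared-secret"         # placeholder value for the example
body = b'{"event": "order.created"}'
signature = sign(secret, body)            # what the sender would attach

print(verify(secret, body, signature))                  # prints: True
print(verify(secret, body + b" tampered", signature))   # prints: False
```

Verification should run against the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the signature.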

Versioning & Rollback

Developer

Every deployed workflow is versioned. Compare revisions, deploy to staging before production, and roll back to any previous version in one command.

  • Immutable versioned workflow deployments
  • Side-by-side revision comparison
  • Staged rollouts (canary, blue/green)
  • One-command rollback via CLI or API
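
Immutable versions plus rollback can be modeled as an append-only list with a movable "active" pointer. The sketch below illustrates that idea only; `WorkflowRegistry` and its methods are hypothetical and are not the PlexRun CLI or API.

```python
import hashlib
import json


class WorkflowRegistry:
    """Append-only store of workflow definitions with an active-version pointer."""

    def __init__(self):
        self._versions = []  # list of (sha256 digest, serialized definition)
        self._active = None  # 1-based version number currently serving

    def deploy(self, definition: dict) -> int:
        """Store a new immutable version and make it active."""
        blob = json.dumps(definition, sort_keys=True)
        digest = hashlib.sha256(blob.encode()).hexdigest()
        self._versions.append((digest, blob))
        self._active = len(self._versions)
        return self._active

    def rollback(self, version: int) -> None:
        """Point 'active' at an earlier version; no stored version is modified."""
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        self._active = version

    def active(self) -> dict:
        """Return a fresh copy of the active definition."""
        if self._active is None:
            raise LookupError("nothing deployed yet")
        return json.loads(self._versions[self._active - 1][1])


registry = WorkflowRegistry()
v1 = registry.deploy({"steps": ["fetch", "summarize"]})
v2 = registry.deploy({"steps": ["fetch", "summarize", "translate"]})
registry.rollback(v1)
print(registry.active())  # prints: {'steps': ['fetch', 'summarize']}
```

Because rollback only moves a pointer, the newer version remains stored and can be re-activated later.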

Enterprise Security

Security

SOC 2 compliance is in progress. Data is encrypted with TLS in transit and AES-256 at rest. RBAC, VPC peering, and SSO for Enterprise. Your data never leaves your cloud account.

  • Role-based access control (RBAC)
  • VPC peering and PrivateLink for Enterprise
  • SSO / SAML 2.0 support
  • Audit logs with tamper-evident storage
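
One common way to make an audit log tamper-evident is a hash chain, where each entry commits to the hash of the one before it. The sketch below shows that general technique; the page does not say how PlexRun stores audit logs, so the entry layout and function names here are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev: str, event: dict) -> str:
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event whose hash covers both the event and the previous hash."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "event": event, "hash": _entry_hash(prev, event)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, {"actor": "alice", "action": "deploy"})
append_entry(log, {"actor": "bob", "action": "rollback"})
print(verify_chain(log))  # prints: True

log[0]["event"]["actor"] = "mallory"  # simulate tampering with an old entry
print(verify_chain(log))  # prints: False
```

A chain like this detects modification but does not prevent it; detection is only meaningful if the latest hash is also recorded somewhere the writer cannot alter.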

API-First Architecture

Developer

Every capability is exposed via the REST API and the Python SDK. Integrate PlexRun into your existing CI/CD, dashboards, and tooling without friction.

  • Fully documented REST API
  • Python SDK with async support
  • Terraform provider for infrastructure-as-code
  • Webhooks for all workflow lifecycle events