Safely deploy features with staged rollout plans, automatic safety checks, and instant rollback capabilities

Progressive Rollouts

Progressive rollouts let you deploy features gradually across your user base instead of flipping a flag for everyone at once. Define rollout plans with multiple stages, safety checks between each stage, and automatic rollback if things go wrong.

How Rollouts Work

A rollout plan controls the percentage of users who see a feature over time:

Draft → Start → 5% (canary) → 25% → 50% → 100% (complete)
                    ↓              ↓       ↓
              health check   health check  health check

Each stage runs for a configurable duration, and you can require health checks to pass before advancing to the next stage.

Rollout Plan Lifecycle

  ┌───────┐    start    ┌────────┐   advance   ┌───────────┐
  │ Draft │ ──────────> │ Active │ ──────────> │ Completed │
  └───────┘             └────────┘             └───────────┘
                           │  ▲
                     pause │  │ resume
                           ▼  │
                        ┌────────┐
                        │ Paused │
                        └────────┘
                           │
                  rollback │
                           ▼
                     ┌─────────────┐
                     │ Rolled Back │
                     └─────────────┘

Status	Description
`draft`	Plan created but not started. Stages can be edited.
`active`	Rollout in progress. One stage is active at a time.
`paused`	Temporarily halted. Can be resumed.
`completed`	All stages finished. Feature is at 100%.
`rolled_back`	Rollout was reversed. Feature returns to previous state.

Creating a Rollout Plan

curl -X POST /api/rollouts \
  -H "Content-Type: application/json" \
  -d '{
    "id": "rollout-checkout-v2",
    "flagKey": "checkout-v2",
    "serviceId": "svc-web",
    "projectId": "proj-123",
    "name": "Checkout V2 Progressive Rollout",
    "strategy": { "type": "staged" },
    "stages": [
      {
        "id": "canary",
        "name": "Canary",
        "percentage": 5,
        "durationMinutes": 30,
        "autoAdvance": false,
        "status": "pending"
      },
      {
        "id": "early-adopters",
        "name": "Early Adopters",
        "percentage": 25,
        "durationMinutes": 120,
        "autoAdvance": false,
        "status": "pending"
      },
      {
        "id": "half",
        "name": "50%",
        "percentage": 50,
        "durationMinutes": 240,
        "autoAdvance": false,
        "status": "pending"
      },
      {
        "id": "full",
        "name": "Full Rollout",
        "percentage": 100,
        "durationMinutes": 0,
        "autoAdvance": true,
        "status": "pending"
      }
    ],
    "targetValue": true,
    "createdBy": "alice@example.com",
    "safetyChecks": [
      { "type": "error_rate", "threshold": 0.05, "action": "rollback" },
      { "type": "latency_p99", "threshold": 500, "action": "pause" }
    ]
  }'

Stage Properties

Property	Type	Description
`id`	string	Unique stage identifier
`name`	string	Human-readable stage name
`percentage`	number	Percentage of users who see the feature (0–100)
`durationMinutes`	number	How long this stage runs before advancing (0 = indefinite)
`autoAdvance`	boolean	Automatically advance to next stage after duration
`status`	string	`pending`, `active`, `completed`, `failed`, `skipped`

Starting a Rollout

curl -X POST /api/rollouts/rollout-checkout-v2 \
  -H "Content-Type: application/json" \
  -d '{ "action": "start" }'

This activates the plan and starts the first stage. The flag's rollout percentage is set to the first stage's percentage (e.g., 5%).

Advancing Stages

When you're satisfied that the current stage is healthy, advance to the next:

curl -X POST /api/rollouts/rollout-checkout-v2 \
  -H "Content-Type: application/json" \
  -d '{ "action": "advance" }'

The response includes the updated plan and the newly active stage:

{
  "plan": { "status": "active", "..." : "..." },
  "stage": { "name": "Early Adopters", "percentage": 25, "status": "active" }
}

When the last stage is advanced, the plan status changes to completed.

Safety Checks

Safety checks automatically protect your rollout by monitoring key metrics:

{
  "safetyChecks": [
    {
      "type": "error_rate",
      "threshold": 0.05,
      "action": "rollback"
    },
    {
      "type": "latency_p99",
      "threshold": 500,
      "action": "pause"
    },
    {
      "type": "crash_rate",
      "threshold": 0.01,
      "action": "rollback"
    }
  ]
}

Property	Description
`type`	Metric name to check (custom — matches keys in health check payloads)
`threshold`	Maximum acceptable value. Exceeding this triggers the action.
`action`	`rollback` (revert to 0%) or `pause` (halt and wait for manual review)

Reporting Health Checks

Push health metrics to the rollout plan during each stage:

curl -X POST /api/rollouts/rollout-checkout-v2 \
  -H "Content-Type: application/json" \
  -d '{
    "action": "health_check",
    "passed": true,
    "metrics": {
      "error_rate": 0.002,
      "latency_p99": 180,
      "crash_rate": 0.0001
    }
  }'

If any metric exceeds its safety check threshold and passed is false, the configured action fires automatically.

Pausing and Resuming

If something looks concerning but doesn't trigger a safety check, pause manually:

# Pause
curl -X POST /api/rollouts/rollout-checkout-v2 \
  -d '{ "action": "pause" }'
 
# Resume
curl -X POST /api/rollouts/rollout-checkout-v2 \
  -d '{ "action": "resume" }'

Pausing preserves the current stage and percentage. Resuming continues from where you left off.

Rollback

Instantly revert a rollout:

curl -X POST /api/rollouts/rollout-checkout-v2 \
  -d '{
    "action": "rollback",
    "reason": "Elevated error rates in checkout flow"
  }'

Rollback:

Sets the plan status to rolled_back
Marks the active stage as failed
Marks all pending stages as skipped
The effective rollout percentage drops to 0%

Querying Rollout Status

Get Plan with Events

curl /api/rollouts/rollout-checkout-v2

Response includes the plan, current percentage, and the full event timeline:

{
  "id": "rollout-checkout-v2",
  "status": "active",
  "currentPercentage": 25,
  "stages": [ "..." ],
  "events": [
    { "type": "stage_started", "stageName": "Canary", "percentage": 5, "timestamp": "..." },
    { "type": "stage_completed", "stageName": "Canary", "percentage": 5, "timestamp": "..." },
    { "type": "stage_started", "stageName": "Early Adopters", "percentage": 25, "timestamp": "..." }
  ]
}

List Plans for a Flag

curl "/api/rollouts?flagKey=checkout-v2&serviceId=svc-web"

List Plans for a Project

curl "/api/rollouts?projectId=proj-123"

One Active Plan Per Flag

A flag can only have one active or paused rollout plan at a time. Attempting to create a second plan while one is in progress returns a 409 Conflict:

{
  "error": {
    "code": "CONFLICT",
    "message": "Flag already has an active rollout plan: rollout-checkout-v2"
  }
}

Complete or roll back the existing plan first, then create a new one.

Rollout Strategies

The strategy.type field identifies the rollout approach. Currently supported:

Strategy	Description
`staged`	Fixed percentage stages with manual or auto advancement
`linear`	Linear ramp from 0% to 100% over a duration (planned)
`canary`	Small canary group then instant full rollout (planned)
`ring`	Ring-based deployment (internal → early → GA) (planned)

API Reference

Endpoint	Method	Description
`/api/rollouts?projectId={id}`	GET	List rollout plans for a project
`/api/rollouts?flagKey={key}&serviceId={id}`	GET	List rollout plans for a flag
`/api/rollouts`	POST	Create a new rollout plan
`/api/rollouts/{id}`	GET	Get plan with events and current percentage
`/api/rollouts/{id}`	POST	Execute action (start, advance, pause, resume, rollback, health_check)
`/api/rollouts/{id}`	DELETE	Delete a plan (draft, completed, or rolled_back only)

Example: Full Rollout Workflow

# 1. Create the plan
curl -X POST /api/rollouts -d '{ ... }'
 
# 2. Start (activates first stage: 5%)
curl -X POST /api/rollouts/my-plan -d '{ "action": "start" }'
 
# 3. Wait, monitor, then report health
curl -X POST /api/rollouts/my-plan -d '{
  "action": "health_check",
  "passed": true,
  "metrics": { "error_rate": 0.001 }
}'
 
# 4. Advance to 25%
curl -X POST /api/rollouts/my-plan -d '{ "action": "advance" }'
 
# 5. Something breaks — rollback immediately
curl -X POST /api/rollouts/my-plan -d '{
  "action": "rollback",
  "reason": "Payment processing errors spiked"
}'

Best Practices

Always start with a canary stage (1–5%) to catch issues early
Set safety checks for error rate, latency, and business metrics
Use autoAdvance: false for production-critical features — require manual approval
Monitor between stages — don't rush through stages even if metrics look good
Keep rollback reasons descriptive — they're recorded in the event audit log
One flag, one rollout — don't run competing rollout plans on the same flag