Monitoring Flags

This guide covers the observability tools built into Flaggr: health scoring, project-level dashboards, impact analysis before changes, trace-based evaluation debugging, alerting, and real-time event streaming.

Health Scoring

Every flag with evaluation data receives a health score from 0 (critical) to 100 (healthy). Scores are computed from four weighted dimensions.

Scoring dimensions

| Dimension | Weight | What It Measures | Warning | Critical |
| --- | --- | --- | --- | --- |
| Error rate | 35% | Fraction of evaluations that return errors | > 1% | > 5% |
| Evaluation latency | 25% | P99 evaluation time in milliseconds | > 50ms | > 200ms |
| Volume anomaly | 25% | Deviation from baseline evaluations/min | > 30% drop | > 70% drop |
| Staleness | 15% | Days since last evaluation | > 7 days | > 30 days |

Health statuses

| Score | Status | Meaning |
| --- | --- | --- |
| 80-100 | healthy | All dimensions within normal range |
| 40-79 | warning | One or more dimensions approaching threshold |
| 0-39 | critical | Significant issues detected |
| — | unknown | No evaluation data available |
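
The tables above give the weights and status cutoffs but not the exact aggregation formula. As an illustrative sketch (a plain weighted average is one plausible reading — the names below are assumptions, not Flaggr internals):

```typescript
// Sketch: combine per-dimension scores (0-100) using the documented weights,
// then map the result onto the documented status bands.
type DimensionScores = {
  errorRate: number;         // weight 35%
  evaluationLatency: number; // weight 25%
  volumeAnomaly: number;     // weight 25%
  staleness: number;         // weight 15%
};

function overallScore(d: DimensionScores): number {
  return Math.round(
    d.errorRate * 0.35 +
      d.evaluationLatency * 0.25 +
      d.volumeAnomaly * 0.25 +
      d.staleness * 0.15
  );
}

function status(score: number): "healthy" | "warning" | "critical" {
  if (score >= 80) return "healthy";
  if (score >= 40) return "warning";
  return "critical";
}
```

A perfect flag scores 100; a flag that fails only the error-rate dimension can still lose up to 35 points, enough to drop it into the warning band.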

Viewing flag health

Via the API

# Single flag health
curl "https://flaggr.dev/api/flags/new-checkout-flow/health?serviceId=svc-abc123" \
  -H "Authorization: Bearer <token>"

Response:

{
  "flagKey": "new-checkout-flow",
  "serviceId": "svc-abc123",
  "overallScore": 92,
  "status": "healthy",
  "dimensions": {
    "errorRate": { "score": 100, "value": 0, "threshold": { "warning": 0.01, "critical": 0.05 } },
    "evaluationLatency": { "score": 85, "value": 12.3, "threshold": { "warning": 50, "critical": 200 } },
    "evaluationVolume": { "score": 90, "value": 45.2, "baseline": 42.0, "deviationPercent": 7.6 },
    "staleness": { "score": 100, "value": 0.001, "daysSinceLastEval": 0.001 }
  },
  "lastUpdated": "2026-03-01T12:00:00.000Z"
}

Each dimension exposes its raw value and thresholds so you can see exactly why a score is what it is.

Volume baselines

The health scorer automatically learns baselines from historical evaluation data. When a flag's evaluation volume drops significantly below its baseline, the volume anomaly score falls. This catches scenarios like:

  • A deployment removed a flag check
  • A service stopped sending evaluation requests
  • A routing change diverted traffic away

Project Health Dashboard

The project health dashboard aggregates health across all services and flags in a project.

Accessing the dashboard

Navigate to Project > Health in the console, or query the API:

curl "https://flaggr.dev/api/projects/proj-123/health" \
  -H "Authorization: Bearer <token>"

Dashboard response

{
  "overallScore": 87,
  "status": "healthy",
  "services": [
    {
      "serviceId": "svc-web",
      "serviceName": "Web Frontend",
      "overallScore": 92,
      "status": "healthy",
      "flagCount": 12,
      "healthyCount": 10,
      "warningCount": 2,
      "criticalCount": 0
    },
    {
      "serviceId": "svc-api",
      "serviceName": "API Gateway",
      "overallScore": 78,
      "status": "warning",
      "flagCount": 8,
      "healthyCount": 5,
      "warningCount": 2,
      "criticalCount": 1
    }
  ],
  "topIssues": [
    {
      "flagKey": "experimental-parser",
      "serviceId": "svc-api",
      "score": 35,
      "status": "critical",
      "worstDimension": "errorRate"
    }
  ],
  "totalFlags": 20,
  "totalEvaluations": 1456000,
  "projectId": "proj-123",
  "lastUpdated": "2026-03-01T12:00:00.000Z"
}

What to look for

  • Top issues — flags with the lowest health scores, sorted worst-first. The worstDimension field tells you which metric is causing the problem.
  • Service breakdown — quickly spot which service has the most warnings or critical flags.
  • Total evaluations — 24-hour evaluation volume across the project.

The dashboard auto-refreshes every 60 seconds.
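
A minimal polling client for the endpoint above, mirroring the console's 60-second refresh; `worstFirst` re-sorts `topIssues` defensively even though the response already returns them worst-first:

```typescript
// Surface the lowest-scoring flags from the project health response.
type Issue = { flagKey: string; serviceId: string; score: number; worstDimension: string };

function worstFirst(issues: Issue[], limit = 3): Issue[] {
  return [...issues].sort((a, b) => a.score - b.score).slice(0, limit);
}

async function pollProjectHealth(projectId: string, token: string): Promise<void> {
  const res = await fetch(`https://flaggr.dev/api/projects/${projectId}/health`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const health = await res.json();
  for (const issue of worstFirst(health.topIssues)) {
    console.log(`${issue.flagKey}@${issue.serviceId}: ${issue.score} (${issue.worstDimension})`);
  }
}

// Match the console's refresh cadence:
// setInterval(() => pollProjectHealth("proj-123", token), 60_000);
```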

Impact Analysis

Before toggling or changing a flag, use impact analysis to understand how many users and requests would be affected.

Querying impact

curl "https://flaggr.dev/api/flags/new-checkout-flow/impact?serviceId=svc-abc123&environment=production" \
  -H "Authorization: Bearer <token>"

Impact response

{
  "flagKey": "new-checkout-flow",
  "serviceId": "svc-abc123",
  "environment": "production",
  "totalEvaluations": 84200,
  "estimatedUniqueUsers": 1250,
  "currentValueBreakdown": { "true": 63150, "false": 21050 },
  "affectedPercentage": 75,
  "evaluationsPerMinute": 58.5,
  "peakEvaluationsPerMinute": 142,
  "riskLevel": "medium",
  "summary": "~84.2K evaluations in last 24h. 58.5/min. ~75% of evaluations would change value.",
  "windowMs": 86400000
}

Risk levels

| Level | Criteria | Recommendation |
| --- | --- | --- |
| low | < 10 evals/min and < 10K total | Safe to change directly |
| medium | 10-100 evals/min or 10K-100K total | Consider a gradual rollout |
| high | 100-1000 evals/min or > 100K total | Use staged rollout with monitoring |
| critical | > 1000 evals/min or > 5% error rate | Requires careful planning |
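
The bands above can be sketched as a classifier. The table is ambiguous about inclusive vs. exclusive band edges, so the boundary handling here is an assumption; `errorRate` is a fraction (0.05 = 5%):

```typescript
// Classify change risk from 24h evaluation traffic and error rate.
function riskLevel(
  evalsPerMin: number,
  total24h: number,
  errorRate: number
): "low" | "medium" | "high" | "critical" {
  if (evalsPerMin > 1000 || errorRate > 0.05) return "critical";
  if (evalsPerMin > 100 || total24h > 100_000) return "high";
  if (evalsPerMin >= 10 || total24h >= 10_000) return "medium";
  return "low";
}
```

Note that a quiet flag with a high error rate still classifies as critical — the error-rate clause escalates regardless of traffic.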

Using impact data

The impact preview appears in the flag detail panel above the toggle button. Before toggling a high-traffic flag:

  1. Check the evaluationsPerMinute — how actively is this flag being evaluated?
  2. Review currentValueBreakdown — what percentage of users are getting each value?
  3. Consider affectedPercentage — for boolean flags, this shows what fraction would see a change
  4. If riskLevel is high or critical, use a gradual rollout instead of an instant toggle
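
The steps above can be wired into a small pre-toggle gate. The decision helper is pure; the fetch wrapper assumes the impact endpoint shown earlier, and the gating policy itself is illustrative, not a Flaggr feature:

```typescript
// Refuse an instant toggle when impact analysis reports high/critical risk.
type Impact = {
  riskLevel: "low" | "medium" | "high" | "critical";
  evaluationsPerMinute: number;
  affectedPercentage: number;
};

function shouldUseGradualRollout(impact: Impact): boolean {
  return impact.riskLevel === "high" || impact.riskLevel === "critical";
}

async function checkBeforeToggle(flagKey: string, serviceId: string, token: string): Promise<boolean> {
  const res = await fetch(
    `https://flaggr.dev/api/flags/${flagKey}/impact?serviceId=${serviceId}&environment=production`,
    { headers: { Authorization: `Bearer ${token}` } }
  );
  const impact: Impact = await res.json();
  console.log(`${impact.evaluationsPerMinute}/min, ~${impact.affectedPercentage}% affected`);
  return !shouldUseGradualRollout(impact); // true → safe to toggle directly
}
```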

Evaluation Debugging

The evaluation debugger provides an "EXPLAIN ANALYZE" view of flag evaluation — every decision point the evaluator walks through, with timing and match details.

Debug endpoint

curl -X POST "https://flaggr.dev/api/flags/evaluate-debug" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "flagKey": "new-checkout-flow",
    "serviceId": "svc-abc123",
    "environment": "production",
    "defaultValue": false,
    "context": {
      "targetingKey": "user-12345",
      "plan": "pro",
      "country": "US"
    }
  }'

Trace response

{
  "flagKey": "new-checkout-flow",
  "value": true,
  "reason": "TARGETING_MATCH",
  "trace": {
    "flagKey": "new-checkout-flow",
    "totalMs": 3,
    "result": { "value": true, "reason": "TARGETING_MATCH" },
    "steps": [
      {
        "type": "disabled_check",
        "label": "Flag is enabled",
        "matched": false,
        "durationMs": 0,
        "details": {}
      },
      {
        "type": "override",
        "label": "Override \"QA testers\" (priority 10)",
        "matched": false,
        "durationMs": 0,
        "details": { "identifiers": ["user-qa-1", "user-qa-2"], "targetingKey": "user-12345" }
      },
      {
        "type": "targeting_rule",
        "label": "Rule \"beta-users\" matched",
        "matched": true,
        "durationMs": 0,
        "details": {
          "ruleId": "beta-users",
          "conditions": [
            { "property": "plan", "operator": "in", "expected": ["pro", "enterprise"], "actual": "pro", "matched": true }
          ],
          "value": true
        }
      }
    ]
  }
}

Trace step types

| Step Type | Description |
| --- | --- |
| disabled_check | Whether the flag is enabled or disabled |
| override | Per-user override check (one step per override) |
| prerequisite | Prerequisite flag evaluation result |
| mutual_exclusion | Mutual exclusion group constraint check |
| experiment | Running experiment variant assignment |
| schedule_check | Whether a rule's schedule window is active |
| targeting_rule | Targeting rule evaluation with condition details |
| rollout_check | Percentage rollout hash bucket check |
| variant | Weighted variant selection |
| default | Fell through to default value |
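
The rollout_check step hashes the targeting key into a stable bucket. Flaggr's actual hash function is not documented here; this FNV-1a sketch only illustrates the general technique — bucketing is deterministic per (flag, key), so a user stays in or out of a rollout across evaluations:

```typescript
// Map a targeting key to a stable 0-99 bucket (FNV-1a, illustrative).
function bucket(targetingKey: string, flagKey: string): number {
  let hash = 2166136261; // FNV-1a 32-bit offset basis
  for (const ch of `${flagKey}:${targetingKey}`) {
    hash ^= ch.charCodeAt(0);
    hash = Math.imul(hash, 16777619); // FNV-1a 32-bit prime
  }
  return (hash >>> 0) % 100;
}

// A key is in the rollout when its bucket falls below the percentage.
function inRollout(targetingKey: string, flagKey: string, percentage: number): boolean {
  return bucket(targetingKey, flagKey) < percentage;
}
```

Including the flag key in the hash input means the same user lands in different buckets for different flags, so rollouts don't correlate across flags.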

Using the trace in the evaluate endpoint

You can also get a trace from the regular evaluate endpoint by adding _trace: true:

curl -X POST "https://flaggr.dev/api/flags/evaluate" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "flagKey": "new-checkout-flow",
    "serviceId": "svc-abc123",
    "defaultValue": false,
    "context": { "targetingKey": "user-12345" },
    "_trace": true
  }'

The response includes a _debug field with the trace. The trace is only returned when explicitly requested — it adds no overhead to normal evaluations.

Debugging with the trace viewer

The dashboard includes a visual trace viewer component that renders the trace as a collapsible tree. Each step shows:

  • Green/red icon indicating matched/not matched
  • Step label describing the decision
  • Expandable details with raw condition values, hash buckets, and timing
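
In the same spirit, a trace (step shape from the response above) can be rendered as plain text, one line per step with a match marker:

```typescript
// Minimal text rendering of an evaluation trace.
type TraceStep = { type: string; label: string; matched: boolean; durationMs: number };

function renderTrace(steps: TraceStep[]): string {
  return steps
    .map((s) => `${s.matched ? "✓" : "✗"} [${s.type}] ${s.label} (${s.durationMs}ms)`)
    .join("\n");
}
```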

Alerting

Flaggr includes a configurable alerting system that monitors evaluation metrics and notifies you when thresholds are breached.

Alert channels

Alert channels define where notifications are sent. Supported channel types:

| Type | Description |
| --- | --- |
| slack | Slack webhook URL |
| email | Email address |
| webhook | Custom HTTP endpoint |
| pagerduty | PagerDuty integration key |

Creating a channel

curl -X POST "https://flaggr.dev/api/projects/proj-123/alert-channels" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Engineering Slack",
    "type": "slack",
    "config": {
      "webhookUrl": "https://hooks.slack.com/services/T.../B.../xxx"
    },
    "enabled": true
  }'

Testing a channel

curl -X POST "https://flaggr.dev/api/projects/proj-123/alert-channels/ch-123/test" \
  -H "Authorization: Bearer <token>"

Alert rules

Alert rules define what conditions trigger notifications.

Creating a rule

curl -X POST "https://flaggr.dev/api/projects/proj-123/alerts" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High Error Rate",
    "description": "Alert when evaluation error rate exceeds 1%",
    "metric": "error_rate",
    "operator": "above",
    "threshold": 1,
    "severity": "critical",
    "channelIds": ["ch-123"],
    "cooldownMinutes": 15,
    "enabled": true
  }'

Alert delivery pipeline

When a rule fires, the alert passes through several reliability layers:

  1. Deduplication — SHA-256 hash of ruleId + time-bucket prevents duplicate alerts within the cooldown window
  2. Retry — 3 attempts with exponential backoff (1s, 2s, 4s). Non-retryable 4xx errors fail immediately.
  3. Circuit breaker — 3 consecutive failures per channel open the circuit for 5 minutes, then half-open for a probe
  4. Dead letter — Failed deliveries are stored (capped at 200 per project) for later inspection
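
Layers 1 and 2 can be sketched as follows. The hash input and the pairing of the 1s/2s/4s delays onto the three attempts are not specified above, so both are assumptions; `send` is a stand-in for an actual channel delivery:

```typescript
import { createHash } from "node:crypto";

// Layer 1: at most one alert per rule per cooldown window, keyed by a
// SHA-256 hash of the rule ID and the current time bucket.
function dedupKey(ruleId: string, cooldownMinutes: number, now = Date.now()): string {
  const bucket = Math.floor(now / (cooldownMinutes * 60_000));
  return createHash("sha256").update(`${ruleId}:${bucket}`).digest("hex");
}

// Layer 2: up to 3 attempts with exponential backoff between them;
// non-retryable 4xx responses fail immediately. `send` returns an HTTP status.
async function deliverWithRetry(send: () => Promise<number>): Promise<boolean> {
  for (let attempt = 1; attempt <= 3; attempt++) {
    const status = await send();
    if (status < 300) return true;
    if (status >= 400 && status < 500) return false; // non-retryable
    if (attempt < 3) await new Promise((r) => setTimeout(r, 1000 * 2 ** (attempt - 1)));
  }
  return false;
}
```

Two firings of the same rule inside one cooldown window produce the same key and can be dropped; the window rolls over automatically when the bucket changes.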

Viewing alert firings

curl "https://flaggr.dev/api/projects/proj-123/alert-firings?limit=20" \
  -H "Authorization: Bearer <token>"

Remediation rules

For automated responses, configure remediation rules via the health scorer:

| Action | Description |
| --- | --- |
| disable_flag | Automatically disable a flag when health degrades |
| rollback_to_version | Roll back to a specific version snapshot |
| reduce_rollout | Reduce rollout percentage to limit blast radius |
| alert_only | Only send a notification (no automated action) |

Remediation rules support cooldownMinutes and requireApproval to prevent runaway automation.
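
The remediation rule API itself is not shown in this guide; as an illustration, a rule shape could type the documented action names together with the two safeguards (field names beyond those are assumptions):

```typescript
// Hypothetical remediation rule shape built from the documented pieces.
type RemediationAction = "disable_flag" | "rollback_to_version" | "reduce_rollout" | "alert_only";

interface RemediationRule {
  name: string;
  action: RemediationAction;
  cooldownMinutes: number;  // minimum gap between automated actions
  requireApproval: boolean; // hold the action until a human approves
}

const rule: RemediationRule = {
  name: "Reduce rollout on critical health",
  action: "reduce_rollout",
  cooldownMinutes: 30,
  requireApproval: true,
};
```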

Real-Time Streaming

Flag changes are broadcast in real time via Server-Sent Events (SSE) and Redis Pub/Sub.

SSE endpoint

curl -N "https://flaggr.dev/api/flags/stream?serviceId=svc-abc123" \
  -H "Authorization: Bearer <token>"

Events are sent as standard SSE:

event: FLAG_UPDATED
data: {"key":"new-checkout-flow","enabled":true,"defaultValue":true,...}

event: FLAG_TOGGLED
data: {"key":"new-checkout-flow","enabled":false,"defaultValue":false,...}

event: FLAG_CREATED
data: {"key":"new-feature","enabled":false,...}

event: FLAG_DELETED
data: {"key":"old-feature","serviceId":"svc-abc123"}

Event types

| Event | When |
| --- | --- |
| FLAG_CREATED | A new flag is created in this service |
| FLAG_UPDATED | A flag's configuration is changed |
| FLAG_TOGGLED | A flag is toggled on or off |
| FLAG_DELETED | A flag is deleted |

Delivery guarantees

  • Primary: Redis Pub/Sub delivers events to all connected SSE clients within 10-50ms
  • Fallback: When Redis is unavailable, clients fall back to 5-second polling
  • Keepalive: A ping comment (: keepalive) is sent every 30 seconds to keep connections alive

Using SSE in your application

// Note: the browser-native EventSource API cannot send custom headers, so
// the Authorization header from the curl examples is not available here.
// Authenticate with same-origin cookies, proxy the stream through your own
// backend, or use a Node SSE client that supports a headers option.
const eventSource = new EventSource("/api/flags/stream?serviceId=svc-abc123");

eventSource.addEventListener("FLAG_TOGGLED", (event) => {
  const flag = JSON.parse(event.data);
  console.log(`Flag ${flag.key} is now ${flag.enabled ? "on" : "off"}`);
  // Update the local flag cache here
});

Connect / gRPC-Web streaming

For binary-efficient streaming, Flaggr also supports the Connect protocol at /api/grpc-web/[...path]. The Connect provider resolves flags via gRPC-Web and subscribes to real-time updates. See the Real-Time Updates guide for Connect setup.

Audit Logging

Every flag mutation (create, update, toggle, delete) generates an audit log entry containing:

  • Who — user ID, name, email
  • What — the action type and changed fields
  • When — timestamp
  • Before/After — previous and new flag state

Audit logs are captured asynchronously (fire-and-forget) so they don't slow down API responses.

Version History

Each flag change creates a version snapshot. Versions enable:

  • Reviewing history — see what changed and when
  • Rollback — restore a previous configuration
  • Blame — who made each change

Versions record the full flag state, the action type (create, update, toggle), and the user who made the change. See the Versioning guide for rollback instructions.

Putting It All Together

A typical monitoring workflow:

  1. Before change — check impact analysis for the flag
  2. Make change — toggle or update the flag (use gradual rollout for high-risk changes)
  3. Monitor health — watch the flag's health score and the project dashboard
  4. Debug issues — if something goes wrong, use the evaluation debugger to trace a specific user's evaluation
  5. Get alerted — configure alert rules to notify you of degraded health
  6. Auto-remediate — set up remediation rules to automatically reduce blast radius