# Monitoring Flags
This guide covers the observability tools built into Flaggr: health scoring, project-level dashboards, impact analysis before changes, evaluation debugging with trace, alerting, and real-time event streaming.
## Health Scoring
Every flag with evaluation data receives a health score from 0 (critical) to 100 (healthy). Scores are computed from four weighted dimensions.
### Scoring dimensions
| Dimension | Weight | What It Measures | Warning | Critical |
|---|---|---|---|---|
| Error rate | 35% | Fraction of evaluations that return errors | > 1% | > 5% |
| Evaluation latency | 25% | P99 evaluation time in milliseconds | > 50ms | > 200ms |
| Volume anomaly | 25% | Deviation from baseline evaluations/min | > 30% drop | > 70% drop |
| Staleness | 15% | Days since last evaluation | > 7 days | > 30 days |
### Health statuses
| Score | Status | Meaning |
|---|---|---|
| 80-100 | healthy | All dimensions within normal range |
| 40-79 | warning | One or more dimensions approaching threshold |
| 0-39 | critical | Significant issues detected |
| — | unknown | No evaluation data available |
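The scorer's internals aren't exposed by the API, but the two tables above pin down the weights and the status bands. Here is a minimal sketch, assuming the overall score is a simple weighted sum of the four per-dimension scores (each 0-100); the real scorer may combine dimensions differently:

```ts
// Sketch only: assumes the overall score is a weighted sum of the four
// per-dimension scores (each 0-100), using the weights from the table above.
interface DimensionScores {
  errorRate: number;         // weight 35%
  evaluationLatency: number; // weight 25%
  evaluationVolume: number;  // weight 25%
  staleness: number;         // weight 15%
}

function overallScore(d: DimensionScores): number {
  return Math.round(
    d.errorRate * 0.35 +
      d.evaluationLatency * 0.25 +
      d.evaluationVolume * 0.25 +
      d.staleness * 0.15
  );
}

// Status bands from the "Health statuses" table; null means no data.
function statusFor(score: number | null): "healthy" | "warning" | "critical" | "unknown" {
  if (score === null) return "unknown";
  if (score >= 80) return "healthy";
  if (score >= 40) return "warning";
  return "critical";
}
```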
### Viewing flag health
#### Via the API
```bash
# Single flag health
curl "https://flaggr.dev/api/flags/new-checkout-flow/health?serviceId=svc-abc123" \
  -H "Authorization: Bearer <token>"
```

Response:

```json
{
  "flagKey": "new-checkout-flow",
  "serviceId": "svc-abc123",
  "overallScore": 92,
  "status": "healthy",
  "dimensions": {
    "errorRate": { "score": 100, "value": 0, "threshold": { "warning": 0.01, "critical": 0.05 } },
    "evaluationLatency": { "score": 85, "value": 12.3, "threshold": { "warning": 50, "critical": 200 } },
    "evaluationVolume": { "score": 90, "value": 45.2, "baseline": 42.0, "deviationPercent": 7.6 },
    "staleness": { "score": 100, "value": 0.001, "daysSinceLastEval": 0.001 }
  },
  "lastUpdated": "2026-03-01T12:00:00.000Z"
}
```

Each dimension exposes its raw value and thresholds so you can see exactly why a score is what it is.
### Volume baselines
The health scorer automatically learns baselines from historical evaluation data. When a flag's evaluation volume drops significantly below its baseline, the volume anomaly score decreases (see the sketch after this list). This catches scenarios like:
- A deployment removed a flag check
- A service stopped sending evaluation requests
- A routing change diverted traffic away
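The exact anomaly model isn't documented, but the drop thresholds in the scoring-dimensions table are. A sketch of baseline-relative drop detection, assuming a simple percentage comparison:

```ts
// Sketch only: compares current evaluations/min against the learned
// baseline; drop thresholds come from the scoring-dimensions table
// (> 30% drop = warning, > 70% drop = critical).
function volumeAnomaly(currentPerMin: number, baselinePerMin: number) {
  if (baselinePerMin <= 0) return { dropPercent: 0, level: "ok" as const };
  const dropPercent = Math.max(0, (1 - currentPerMin / baselinePerMin) * 100);
  const level =
    dropPercent > 70 ? ("critical" as const) :
    dropPercent > 30 ? ("warning" as const) : ("ok" as const);
  return { dropPercent, level };
}

// Example: baseline 42/min, current 10/min -> ~76% drop, "critical".
console.log(volumeAnomaly(10, 42));
```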
## Project Health Dashboard
The project health dashboard aggregates health across all services and flags in a project.
### Accessing the dashboard
Navigate to Project > Health in the console, or query the API:
curl "https://flaggr.dev/api/projects/proj-123/health" \
-H "Authorization: Bearer <token>"Dashboard response
{
"overallScore": 87,
"status": "healthy",
"services": [
{
"serviceId": "svc-web",
"serviceName": "Web Frontend",
"overallScore": 92,
"status": "healthy",
"flagCount": 12,
"healthyCount": 10,
"warningCount": 2,
"criticalCount": 0
},
{
"serviceId": "svc-api",
"serviceName": "API Gateway",
"overallScore": 78,
"status": "warning",
"flagCount": 8,
"healthyCount": 5,
"warningCount": 2,
"criticalCount": 1
}
],
"topIssues": [
{
"flagKey": "experimental-parser",
"serviceId": "svc-api",
"score": 35,
"status": "critical",
"worstDimension": "errorRate"
}
],
"totalFlags": 20,
"totalEvaluations": 1456000,
"projectId": "proj-123",
"lastUpdated": "2026-03-01T12:00:00.000Z"
}What to look for
- **Top issues** — flags with the lowest health scores, sorted worst-first. The `worstDimension` field tells you which metric is causing the problem.
- **Service breakdown** — quickly spot which service has the most warnings or critical flags.
- **Total evaluations** — 24-hour evaluation volume across the project.
The dashboard auto-refreshes every 60 seconds.
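If you want the same view outside the console, the endpoint above can be polled on the same cadence. A sketch, assuming a long-lived Node.js or browser process with `fetch` available:

```ts
// Sketch only: polls the project health endpoint on the console's 60-second
// cadence and logs any flags that appear in topIssues.
async function watchProjectHealth(projectId: string, token: string) {
  const poll = async () => {
    const res = await fetch(`https://flaggr.dev/api/projects/${projectId}/health`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    const health = await res.json();
    for (const issue of health.topIssues ?? []) {
      console.warn(
        `[${issue.status}] ${issue.flagKey} on ${issue.serviceId}: ` +
          `score ${issue.score}, worst dimension ${issue.worstDimension}`
      );
    }
  };
  await poll();
  setInterval(poll, 60_000); // matches the dashboard's refresh interval
}
```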
## Impact Analysis
Before toggling or changing a flag, use impact analysis to understand how many users and requests would be affected.
### Querying impact
curl "https://flaggr.dev/api/flags/new-checkout-flow/impact?serviceId=svc-abc123&environment=production" \
-H "Authorization: Bearer <token>"Impact response
{
"flagKey": "new-checkout-flow",
"serviceId": "svc-abc123",
"environment": "production",
"totalEvaluations": 84200,
"estimatedUniqueUsers": 1250,
"currentValueBreakdown": { "true": 63150, "false": 21050 },
"affectedPercentage": 75,
"evaluationsPerMinute": 58.5,
"peakEvaluationsPerMinute": 142,
"riskLevel": "medium",
"summary": "~84.2K evaluations in last 24h. 58.5/min. ~75% of evaluations would change value.",
"windowMs": 86400000
}Risk levels
| Level | Criteria | Recommendation |
|---|---|---|
| `low` | < 10 evals/min and < 10K total | Safe to change directly |
| `medium` | 10-100 evals/min or 10K-100K total | Consider a gradual rollout |
| `high` | 100-1000 evals/min or > 100K total | Use staged rollout with monitoring |
| `critical` | > 1000 evals/min or > 5% error rate | Requires careful planning |
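The classification can be reproduced from the table. A sketch, with the tie-breaking order (critical checked first) as an assumption, since the API computes `riskLevel` server-side:

```ts
// Sketch only: reproduces the risk-level table above. Thresholds come from
// the table; checking the most severe level first is an assumption.
type RiskLevel = "low" | "medium" | "high" | "critical";

function riskLevel(evalsPerMin: number, total24h: number, errorRate: number): RiskLevel {
  if (evalsPerMin > 1000 || errorRate > 0.05) return "critical";
  if (evalsPerMin > 100 || total24h > 100_000) return "high";
  if (evalsPerMin >= 10 || total24h >= 10_000) return "medium";
  return "low";
}

// Example from the impact response above: 58.5/min, 84,200 total -> "medium".
console.log(riskLevel(58.5, 84_200, 0));
```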
### Using impact data
The impact preview appears in the flag detail panel above the toggle button. Before toggling a high-traffic flag:
- Check `evaluationsPerMinute` — how actively is this flag being evaluated?
- Review `currentValueBreakdown` — what percentage of users are getting each value?
- Consider `affectedPercentage` — for boolean flags, this shows what fraction would see a change
- If `riskLevel` is `high` or `critical`, use a gradual rollout instead of an instant toggle (see the guard sketch below)
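Wired together, those checks make a simple pre-change guard. A sketch using the impact endpoint shown above; the abort threshold is this guide's own recommendation, not an enforced API behavior:

```ts
// Sketch only: queries impact analysis before a toggle and refuses the
// instant toggle when the risk level calls for a gradual rollout.
async function safeToggleCheck(flagKey: string, serviceId: string, token: string) {
  const url =
    `https://flaggr.dev/api/flags/${flagKey}/impact` +
    `?serviceId=${serviceId}&environment=production`;
  const impact = await (
    await fetch(url, { headers: { Authorization: `Bearer ${token}` } })
  ).json();

  console.log(impact.summary); // e.g. "~84.2K evaluations in last 24h. ..."
  if (impact.riskLevel === "high" || impact.riskLevel === "critical") {
    throw new Error(
      `Refusing instant toggle: risk is ${impact.riskLevel} ` +
        `(${impact.evaluationsPerMinute} evals/min). Use a gradual rollout.`
    );
  }
}
```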
## Evaluation Debugging
The evaluation debugger provides an "EXPLAIN ANALYZE" view of flag evaluation — every decision point the evaluator walks through, with timing and match details.
### Debug endpoint
curl -X POST "https://flaggr.dev/api/flags/evaluate-debug" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"flagKey": "new-checkout-flow",
"serviceId": "svc-abc123",
"environment": "production",
"defaultValue": false,
"context": {
"targetingKey": "user-12345",
"plan": "pro",
"country": "US"
}
}'Trace response
{
"flagKey": "new-checkout-flow",
"value": true,
"reason": "TARGETING_MATCH",
"trace": {
"flagKey": "new-checkout-flow",
"totalMs": 3,
"result": { "value": true, "reason": "TARGETING_MATCH" },
"steps": [
{
"type": "disabled_check",
"label": "Flag is enabled",
"matched": false,
"durationMs": 0,
"details": {}
},
{
"type": "override",
"label": "Override \"QA testers\" (priority 10)",
"matched": false,
"durationMs": 0,
"details": { "identifiers": ["user-qa-1", "user-qa-2"], "targetingKey": "user-12345" }
},
{
"type": "targeting_rule",
"label": "Rule \"beta-users\" matched",
"matched": true,
"durationMs": 0,
"details": {
"ruleId": "beta-users",
"conditions": [
{ "property": "plan", "operator": "in", "expected": ["pro", "enterprise"], "actual": "pro", "matched": true }
],
"value": true
}
}
]
}
}Trace step types
| Step Type | Description |
|---|---|
| `disabled_check` | Whether the flag is enabled or disabled |
| `override` | Per-user override check (one step per override) |
| `prerequisite` | Prerequisite flag evaluation result |
| `mutual_exclusion` | Mutual exclusion group constraint check |
| `experiment` | Running experiment variant assignment |
| `schedule_check` | Whether a rule's schedule window is active |
| `targeting_rule` | Targeting rule evaluation with condition details |
| `rollout_check` | Percentage rollout hash bucket check |
| `variant` | Weighted variant selection |
| `default` | Fell through to default value |
### Using the trace in the evaluate endpoint
You can also get a trace from the regular evaluate endpoint by adding `"_trace": true` to the request body:
curl -X POST "https://flaggr.dev/api/flags/evaluate" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"flagKey": "new-checkout-flow",
"serviceId": "svc-abc123",
"defaultValue": false,
"context": { "targetingKey": "user-12345" },
"_trace": true
}'The response includes a _debug field with the trace. The trace is only returned when explicitly requested — it adds no overhead to normal evaluations.
### Debugging with the trace viewer
The dashboard includes a visual trace viewer component that renders the trace as a collapsible tree. Each step shows:
- Green/red icon indicating matched/not matched
- Step label describing the decision
- Expandable details with raw condition values, hash buckets, and timing
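Outside the dashboard, the same trace is easy to render in a terminal. A minimal console analogue of the viewer, using the trace shape from the response above:

```ts
// Sketch only: a console analogue of the dashboard's trace viewer, using
// the trace shape shown in the debug response above.
interface TraceStep {
  type: string;
  label: string;
  matched: boolean;
  durationMs: number;
  details: Record<string, unknown>;
}

function printTrace(trace: { flagKey: string; totalMs: number; steps: TraceStep[] }) {
  console.log(`${trace.flagKey} evaluated in ${trace.totalMs}ms`);
  for (const step of trace.steps) {
    const icon = step.matched ? "✔" : "✘"; // green/red icon in the dashboard
    console.log(`  ${icon} [${step.type}] ${step.label} (${step.durationMs}ms)`);
  }
}
```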
## Alerting
Flaggr includes a configurable alerting system that monitors evaluation metrics and notifies you when thresholds are breached.
### Alert channels
Alert channels define where notifications are sent. Supported channel types:
| Type | Description |
|---|---|
| `slack` | Slack webhook URL |
| `email` | Email address |
| `webhook` | Custom HTTP endpoint |
| `pagerduty` | PagerDuty integration key |
#### Creating a channel
curl -X POST "https://flaggr.dev/api/projects/proj-123/alert-channels" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"name": "Engineering Slack",
"type": "slack",
"config": {
"webhookUrl": "https://hooks.slack.com/services/T.../B.../xxx"
},
"enabled": true
}'Testing a channel
curl -X POST "https://flaggr.dev/api/projects/proj-123/alert-channels/ch-123/test" \
-H "Authorization: Bearer <token>"Alert rules
Alert rules define what conditions trigger notifications.
#### Creating a rule
curl -X POST "https://flaggr.dev/api/projects/proj-123/alerts" \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"name": "High Error Rate",
"description": "Alert when evaluation error rate exceeds 1%",
"metric": "error_rate",
"operator": "above",
"threshold": 1,
"severity": "critical",
"channelIds": ["ch-123"],
"cooldownMinutes": 15,
"enabled": true
}'Alert delivery pipeline
When a rule fires, the alert passes through several reliability layers:
- **Deduplication** — a SHA-256 hash of `ruleId + time-bucket` prevents duplicate alerts within the cooldown window (sketched below)
- **Retry** — 3 attempts with exponential backoff (1s, 2s, 4s). Non-retryable 4xx errors fail immediately.
- **Circuit breaker** — 3 consecutive failures per channel open the circuit for 5 minutes, then half-open for a probe
- **Dead letter** — failed deliveries are stored (capped at 200 per project) for later inspection
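The deduplication step can be illustrated directly. A sketch of the dedup key, assuming the time bucket is sized to the rule's cooldown window; the exact bucket format is not documented:

```ts
// Sketch only: derives a deduplication key as described above, hashing the
// rule ID plus a time bucket. Sizing the bucket to the cooldown window is
// an assumption.
import { createHash } from "node:crypto";

function dedupKey(ruleId: string, cooldownMinutes: number, now = Date.now()): string {
  const bucket = Math.floor(now / (cooldownMinutes * 60_000));
  return createHash("sha256").update(`${ruleId}:${bucket}`).digest("hex");
}

// Two firings of the same rule inside one cooldown window hash identically,
// so the second is dropped as a duplicate.
```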
### Viewing alert firings
curl "https://flaggr.dev/api/projects/proj-123/alert-firings?limit=20" \
-H "Authorization: Bearer <token>"Remediation rules
For automated responses, configure remediation rules via the health scorer:
| Action | Description |
|---|---|
| `disable_flag` | Automatically disable a flag when health degrades |
| `rollback_to_version` | Roll back to a specific version snapshot |
| `reduce_rollout` | Reduce rollout percentage to limit blast radius |
| `alert_only` | Only send a notification (no automated action) |
Remediation rules support `cooldownMinutes` and `requireApproval` to prevent runaway automation.
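The remediation API itself isn't shown in this guide; purely as illustration, a rule combining the documented pieces might look like this (field names other than the action values and the two safety valves are hypothetical):

```ts
// Hypothetical remediation rule; only the action values (from the table)
// and the cooldownMinutes / requireApproval safety valves are documented.
const remediationRule = {
  name: "Auto-disable on critical health", // assumed field
  action: "disable_flag",                  // from the actions table above
  cooldownMinutes: 30,                     // documented safety valve
  requireApproval: true,                   // documented safety valve
};
```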
## Real-Time Streaming
Flag changes are broadcast in real time via Server-Sent Events (SSE) and Redis Pub/Sub.
### SSE endpoint
```bash
curl -N "https://flaggr.dev/api/flags/stream?serviceId=svc-abc123" \
  -H "Authorization: Bearer <token>"
```

Events are sent as standard SSE:

```
event: FLAG_UPDATED
data: {"key":"new-checkout-flow","enabled":true,"defaultValue":true,...}

event: FLAG_TOGGLED
data: {"key":"new-checkout-flow","enabled":false,"defaultValue":false,...}

event: FLAG_CREATED
data: {"key":"new-feature","enabled":false,...}

event: FLAG_DELETED
data: {"key":"old-feature","serviceId":"svc-abc123"}
```
### Event types
| Event | When |
|---|---|
| `FLAG_CREATED` | A new flag is created in this service |
| `FLAG_UPDATED` | A flag's configuration is changed |
| `FLAG_TOGGLED` | A flag is toggled on or off |
| `FLAG_DELETED` | A flag is deleted |
### Delivery guarantees
- **Primary**: Redis Pub/Sub delivers events to all connected SSE clients within 10-50ms
- **Fallback**: When Redis is unavailable, clients fall back to 5-second polling (see the sketch below)
- **Keepalive**: A ping comment (`: keepalive`) is sent every 30 seconds to keep connections alive
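The fallback path can be sketched as a plain polling loop. Note that the flag-listing endpoint used here is assumed for illustration; this guide only documents the stream endpoint:

```ts
// Sketch only: the 5-second polling fallback described above. The listing
// endpoint (GET /api/flags?serviceId=...) is an assumption, not a
// documented route.
async function pollFlags(
  serviceId: string,
  token: string,
  onFlags: (flags: unknown[]) => void
) {
  const poll = async () => {
    try {
      const res = await fetch(`https://flaggr.dev/api/flags?serviceId=${serviceId}`, {
        headers: { Authorization: `Bearer ${token}` },
      });
      onFlags(await res.json());
    } catch {
      // Keep polling; the stream resumes once Redis recovers.
    }
  };
  setInterval(poll, 5_000); // matches the documented 5-second fallback
}
```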
### Using SSE in your application
```js
// NOTE: the browser-native EventSource API does not accept custom headers.
// This example assumes the "eventsource" npm package (v1) in Node.js, or a
// browser polyfill that supports a headers option for authentication.
import EventSource from "eventsource";

const eventSource = new EventSource(
  "/api/flags/stream?serviceId=svc-abc123",
  { headers: { Authorization: `Bearer ${token}` } }
);

eventSource.addEventListener("FLAG_TOGGLED", (event) => {
  const flag = JSON.parse(event.data);
  console.log(`Flag ${flag.key} is now ${flag.enabled ? "on" : "off"}`);
  // Update local flag cache
});
```

### Connect / gRPC-Web streaming
For binary-efficient streaming, Flaggr also supports the Connect protocol at `/api/grpc-web/[...path]`. The Connect provider resolves flags via gRPC-Web and subscribes to real-time updates. See the Real-Time Updates guide for Connect setup.
## Audit Logging
Every flag mutation (create, update, toggle, delete) generates an audit log entry containing:
- Who — user ID, name, email
- What — the action type and changed fields
- When — timestamp
- Before/After — previous and new flag state
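The entry schema isn't reproduced in this guide; a hypothetical entry covering those four facets might look like:

```ts
// Hypothetical audit entry; field names are illustrative, not the actual
// Flaggr schema. It covers the four documented facets.
const auditEntry = {
  user: { id: "user-42", name: "Ada Lovelace", email: "ada@example.com" }, // who
  action: "toggle",                       // what: the action type
  changedFields: ["enabled"],             // what: changed fields
  timestamp: "2026-03-01T12:00:00.000Z",  // when
  before: { enabled: true },              // previous flag state
  after: { enabled: false },              // new flag state
};
```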
Audit logs are captured asynchronously (fire-and-forget) so they don't slow down API responses.
## Version History
Each flag change creates a version snapshot. Versions enable:
- Reviewing history — see what changed and when
- Rollback — restore a previous configuration
- Blame — who made each change
Versions record the full flag state, the action type (create, update, toggle), and the user who made the change. See the Versioning guide for rollback instructions.
## Putting It All Together
A typical monitoring workflow (see the sketch after this list):
- Before change — check impact analysis for the flag
- Make change — toggle or update the flag (use gradual rollout for high-risk changes)
- Monitor health — watch the flag's health score and the project dashboard
- Debug issues — if something goes wrong, use the evaluation debugger to trace a specific user's evaluation
- Get alerted — configure alert rules to notify you of degraded health
- Auto-remediate — set up remediation rules to automatically reduce blast radius
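As a sketch, the first three steps can be strung together with the endpoints documented above; the toggle call is a placeholder, since the mutation API lives in the Managing Flags guide:

```ts
// Sketch only: chains impact analysis, a flag change, and a health check.
async function guardedToggle(flagKey: string, serviceId: string, token: string) {
  const headers = { Authorization: `Bearer ${token}` };
  const base = "https://flaggr.dev/api/flags";

  // 1. Before change: check impact.
  const impact = await (
    await fetch(
      `${base}/${flagKey}/impact?serviceId=${serviceId}&environment=production`,
      { headers }
    )
  ).json();
  if (impact.riskLevel === "high" || impact.riskLevel === "critical") {
    throw new Error(`Use a gradual rollout: risk is ${impact.riskLevel}`);
  }

  // 2. Make change (placeholder; see the Managing Flags guide).
  await toggleFlag(flagKey, serviceId, token);

  // 3. Monitor health after the change.
  const health = await (
    await fetch(`${base}/${flagKey}/health?serviceId=${serviceId}`, { headers })
  ).json();
  console.log(`Post-toggle health: ${health.overallScore} (${health.status})`);
}

// Placeholder for the real mutation call; not part of this guide.
declare function toggleFlag(flagKey: string, serviceId: string, token: string): Promise<void>;
```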
## Related
- Creating Flags — initial flag setup
- Managing Flags — day-to-day operations
- Flag Health — deep dive into health scoring configuration
- Real-Time Updates — SSE and Connect streaming setup
- Troubleshooting — common issues and solutions