Common issues and solutions for Flaggr — auth errors, stale data, evaluation problems, and performance

Troubleshooting

This guide covers common issues and their solutions. If you don't find your answer here, check the GitHub Issues.

Authentication Errors

"Unauthorized" (401) on API Requests

Symptoms: API returns 401 Unauthorized even with a valid-looking token.

Checklist:

  1. Token format: Opaque tokens start with fgr_ or flg_. JWT tokens are three dot-separated base64 segments.
  2. Header format: Must be Authorization: Bearer <token> (note the space after "Bearer").
  3. Token expiration: JWT tokens have an expiresAt field. Check if the token has expired.
  4. Project access: The token must belong to the project you're querying. Tokens are project-scoped.

# Test your token (the URL is quoted so the shell does not treat "?" as a glob character)
curl -s -H "Authorization: Bearer flg_your_token" \
  "https://flaggr.dev/api/flags?projectId=proj-123"
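
The token-format and header-format checks can also be run locally before a request is sent. The following is a minimal TypeScript sketch that assumes only what the checklist states (the fgr_ / flg_ prefixes and the three-segment JWT shape). classifyToken and authorizationHeader are hypothetical helpers, not part of any Flaggr SDK.

```typescript
// Hypothetical helper: guesses the token kind from its shape alone.
type TokenKind = "opaque" | "jwt" | "unknown";

function classifyToken(raw: string): TokenKind {
  const token = raw.trim();
  // Opaque tokens start with fgr_ or flg_ (per the checklist).
  if (token.startsWith("fgr_") || token.startsWith("flg_")) {
    return "opaque";
  }
  // JWTs are three non-empty, dot-separated base64url segments.
  const base64url = /^[A-Za-z0-9_-]+$/;
  const segments = token.split(".");
  if (segments.length === 3 && segments.every((s) => base64url.test(s))) {
    return "jwt";
  }
  return "unknown";
}

// Builds the header value in the exact "Bearer <token>" form,
// dropping stray whitespace copied along with the token.
function authorizationHeader(token: string): string {
  return `Bearer ${token.trim()}`;
}
```

A result of "unknown" usually means the value was truncated or pasted together with the "Bearer " prefix. A shape check cannot tell a Flaggr JWT from a Firebase ID token, since both have three segments.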

Firebase ID Token Errors

Symptoms: Firebase-authenticated users get 401 errors on Connect/OFREP endpoints.

Cause: Firebase ID tokens look like JWT tokens (three dot-separated segments), so Flaggr first tries to verify them as Flaggr JWTs. When that verification fails, the request should fall through to Firebase auth instead of being rejected.

Solution: This was fixed in the auth middleware — Firebase tokens now correctly fall through to the opaque/session auth path. If you're running an older version, update to the latest.

Token Refresh Failures

Symptoms: JWT token refresh returns 401.

Checklist:

  • Refresh tokens expire independently from access tokens. Check refreshTokenExpiresAt.
  • Refresh tokens are single-use — each refresh returns a new refresh token.
  • The original access token must belong to the same token record.
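
Because refresh tokens are single-use, two concurrent refresh calls that send the same refresh token will leave one of them with a 401. One way to avoid this on the client is to share a single in-flight refresh and always store the newest pair. The sketch below is illustrative: createTokenStore, TokenPair, and the RefreshCall signature are assumptions, and refreshCall stands in for whatever request your application uses to refresh.

```typescript
interface TokenPair {
  accessToken: string;
  refreshToken: string;
}

// Hypothetical signature: wraps whatever call your app uses to refresh.
type RefreshCall = (refreshToken: string) => Promise<TokenPair>;

function createTokenStore(initial: TokenPair, refreshCall: RefreshCall) {
  let current: TokenPair = initial;
  let inFlight: Promise<TokenPair> | null = null;

  return {
    get(): TokenPair {
      return current;
    },
    // Concurrent callers share one refresh, so the single-use
    // refresh token is spent exactly once.
    refresh(): Promise<TokenPair> {
      if (inFlight === null) {
        inFlight = refreshCall(current.refreshToken).then(
          (next) => {
            current = next; // always keep the newest refresh token
            inFlight = null;
            return next;
          },
          (err) => {
            inFlight = null; // allow a later retry after a failure
            throw err;
          },
        );
      }
      return inFlight;
    },
  };
}
```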

Stale Data After Mutations

Flag Changes Not Reflected

Symptoms: You toggle/update a flag but the old value persists.

Cause: Flaggr has multiple caching layers that can serve stale data.

Solutions by layer:

| Layer | Fix |
|---|---|
| React Query (client) | Use queryClient.removeQueries() instead of invalidateQueries() before navigation |
| HTTP Cache (browser/CDN) | Mutation-sensitive endpoints use no-cache, must-revalidate headers |
| Next.js RSC (server) | API routes call revalidatePath() after data changes |
| Redis/KV (server) | Mutation handlers call cacheDel() for affected keys |

Quick fix: Hard refresh the page (Cmd+Shift+R / Ctrl+Shift+R) to bypass browser cache.

Stale Data After Delete

Symptoms: After deleting a project/service/flag, the UI redirects back to the deleted entity.

Cause: Cache still contains the deleted entity. Navigation happens before cache is cleared.

Solution: Follow the delete pattern:

// 1. Delete
await deleteMutation.mutateAsync(id);
// 2. Clear state
clearSelectedProject();
// 3. REMOVE cached data (not just invalidate)
queryClient.removeQueries({ queryKey: queryKeys.projects });
// 4. Navigate
router.replace("/admin");

Flag Evaluation Issues

Flag Returns Default Value When It Shouldn't

Checklist:

  1. Is the flag enabled? Disabled flags always return the default value (reason: DISABLED).
  2. Correct environment? Flags are enabled per environment: a flag enabled only in staging still returns the default value in production.
  3. Correct service? Flags are scoped to a service ID. Make sure your SDK uses the right one.
  4. Targeting rules match? Check that the evaluation context contains the properties your rules reference.
  5. Rollout percentage? If set to less than 100%, consistent hashing determines which users are included.
  6. Prerequisites? A prerequisite flag must return its expected value, or the dependent flag returns default.
  7. Mutual exclusion? If the flag is in a group and another flag won, it returns default with reason MUTUAL_EXCLUSION.

Debugging:

# Evaluate with full details
curl -X POST https://flaggr.dev/api/flags/evaluate \
  -H "Authorization: Bearer flg_your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "flagKey": "checkout-v2",
    "serviceId": "web-app",
    "environment": "production",
    "defaultValue": false,
    "context": { "targetingKey": "user-123", "plan": "enterprise" }
  }'

Check the reason field in the response:

| Reason | Meaning |
|---|---|
| TARGETING_MATCH | A rule matched; the value is the rule's variant/value |
| DEFAULT | No rules matched |
| DISABLED | Flag is disabled |
| SPLIT | Percentage rollout assigned this variant |
| OVERRIDE | Identity override applied |
| PREREQUISITE_FAILED | Prerequisite check failed |
| MUTUAL_EXCLUSION | Another flag in the exclusion group won |
| ERROR | Evaluation error; check errorMessage |
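
When debugging from a script, it can help to turn the reason field into a next step. The sketch below pairs each reason listed above with the matching checklist item. The EvaluationResult shape is an assumption (this guide names only the reason and errorMessage fields), and explainReason is a hypothetical helper.

```typescript
// Reasons are the ones listed in this guide; the hints restate its checklist.
const REASON_HINTS: Record<string, string> = {
  TARGETING_MATCH: "A rule matched; inspect which rule and variant applied.",
  DEFAULT: "No rules matched; check the evaluation context properties.",
  DISABLED: "Flag is disabled; enable it for this environment.",
  SPLIT: "Percentage rollout assigned this variant; check the rollout percentage.",
  OVERRIDE: "An identity override applied; review overrides for this targetingKey.",
  PREREQUISITE_FAILED: "A prerequisite flag did not return its expected value.",
  MUTUAL_EXCLUSION: "Another flag in the exclusion group won.",
  ERROR: "Evaluation error; read errorMessage.",
};

// Assumed response shape: only `reason` and `errorMessage` come from the guide.
interface EvaluationResult {
  reason: string;
  errorMessage?: string;
}

function explainReason(result: EvaluationResult): string {
  const hint = REASON_HINTS[result.reason] ?? `Unrecognized reason "${result.reason}".`;
  // Surface the server's message alongside the hint when there is one.
  return result.reason === "ERROR" && result.errorMessage
    ? `${hint} (${result.errorMessage})`
    : hint;
}
```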

Inconsistent Evaluation Results

Symptoms: Same user gets different flag values across requests.

Cause: Usually missing or inconsistent targetingKey.

Solution: Always pass a stable targetingKey (user ID) in the evaluation context. Rollout percentages use consistent hashing on the targeting key — without it, results are random.
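
To see why a stable targetingKey matters, consider how a hash-based rollout typically assigns buckets. The sketch below uses a 32-bit FNV-1a hash purely for illustration; Flaggr's actual hash function and key format are not documented here, so fnv1a32, rolloutBucket, and isInRollout are all hypothetical.

```typescript
// FNV-1a (32-bit) over UTF-16 code units. Chosen for brevity;
// NOT necessarily the hash Flaggr uses.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force unsigned 32-bit
}

// Maps (flagKey, targetingKey) to a stable bucket in [0, 100).
function rolloutBucket(flagKey: string, targetingKey: string): number {
  return fnv1a32(`${flagKey}:${targetingKey}`) % 100;
}

// A user is included when their bucket falls below the rollout percentage.
function isInRollout(flagKey: string, targetingKey: string, percentage: number): boolean {
  return rolloutBucket(flagKey, targetingKey) < percentage;
}
```

The same (flagKey, targetingKey) pair always lands in the same bucket, so a user stays in or out of the rollout across requests. If the key changes between requests, for example a fresh random session ID each time, the bucket changes with it.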

Import/Export Issues

Import Validation Errors

Common errors:

| Error | Cause | Fix |
|---|---|---|
| "Flag key must start with a letter" | Key starts with number/symbol | Rename to start with a letter |
| "Service not found" | Service ID doesn't exist | Create the service first, or fix the service ID |
| "Invalid flag type" | Type not one of boolean/string/number/object | Fix the type field |
| "Default value must be a boolean" | Type mismatch | Ensure default value matches declared type |
| "Maximum 100 flags per import" | Too many flags in one request | Split into multiple batches |

Pro tip: Always run with dryRun: true first to catch all validation errors without modifying data.
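
Several of these errors can also be caught locally before the file is sent. The following sketch mirrors the documented rules (key starts with a letter, known type, default value matches type, at most 100 flags per request). The ImportedFlag field names are assumptions for illustration and may not match Flaggr's import schema; validateImport and toBatches are hypothetical helpers.

```typescript
// Assumed import shape: field names are illustrative, not Flaggr's schema.
interface ImportedFlag {
  key: string;
  type: string;
  defaultValue: unknown;
}

const FLAG_TYPES: readonly string[] = ["boolean", "string", "number", "object"];
const MAX_FLAGS_PER_IMPORT = 100;

function validateImport(flags: ImportedFlag[]): string[] {
  const errors: string[] = [];
  if (flags.length > MAX_FLAGS_PER_IMPORT) {
    errors.push(`Maximum ${MAX_FLAGS_PER_IMPORT} flags per import`);
  }
  for (const flag of flags) {
    if (!/^[A-Za-z]/.test(flag.key)) {
      errors.push(`${flag.key}: Flag key must start with a letter`);
    }
    if (!FLAG_TYPES.includes(flag.type)) {
      errors.push(`${flag.key}: Invalid flag type`);
      continue; // cannot compare the default against an unknown type
    }
    // typeof null is "object", so null is treated as a mismatch here.
    const actual = flag.defaultValue === null ? "null" : typeof flag.defaultValue;
    if (actual !== flag.type) {
      errors.push(`${flag.key}: Default value must be a ${flag.type}`);
    }
  }
  return errors;
}

// Splits a large import into request-sized batches.
function toBatches<T>(items: T[], size: number = MAX_FLAGS_PER_IMPORT): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

A local check like this does not replace dryRun: true, because only the server can verify things such as whether the service ID exists.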

"Flag already exists" Errors

Cause: Using conflictResolution: "error" and the flag already exists.

Solutions:

  • Use "skip" to leave existing flags unchanged
  • Use "overwrite" to update existing flags
  • Delete existing flags first if you want a clean import
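
The three conflictResolution modes can be pictured with a small in-memory model. This sketch is a model of the behavior described above, not Flaggr's implementation; in particular, whether "error" aborts the whole import or only the conflicting flag is an assumption made here (the sketch aborts). applyImport is a hypothetical name.

```typescript
type ConflictResolution = "error" | "skip" | "overwrite";

// Models the three modes on an in-memory map; illustrative only.
function applyImport(
  existing: Map<string, unknown>,
  incoming: Array<[string, unknown]>,
  mode: ConflictResolution,
): Map<string, unknown> {
  const result = new Map(existing); // leave the caller's map untouched
  for (const [key, value] of incoming) {
    if (result.has(key)) {
      if (mode === "error") {
        throw new Error(`Flag already exists: ${key}`);
      }
      if (mode === "skip") {
        continue; // keep the existing flag unchanged
      }
    }
    result.set(key, value); // new key, or "overwrite" of an existing one
  }
  return result;
}
```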

Real-Time Updates Not Working

SSE Stream Not Connecting

Checklist:

  1. API token: SSE streams require authentication. Pass the token in the request.
  2. CORS: If connecting from a browser, ensure your domain is in the project's allowed domains.
  3. Proxy/CDN: Some proxies buffer SSE responses. Ensure your infrastructure supports streaming.
  4. Connection limits: Browsers limit concurrent connections per domain (typically 6).

Updates Delayed

Symptoms: Flag changes take 5+ seconds to appear in the stream.

Cause: When Redis Pub/Sub is unavailable, Flaggr falls back to 5-second polling.

Solution: Configure Redis for real-time updates (~10-50ms latency):

UPSTASH_REDIS_REST_URL=https://your-db.upstash.io
UPSTASH_REDIS_REST_TOKEN=AXxx...

Rate Limiting

"Too Many Requests" (429)

Symptoms: API returns 429 Too Many Requests.

Cause: Rate limit exceeded for your IP, token, or service.

Solutions:

  1. Check the Retry-After header for when you can retry
  2. Reduce request frequency — use caching or batch evaluation
  3. If legitimate traffic, review your rate limiting configuration
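
For the first step, note that the HTTP Retry-After header may carry either a number of seconds or an HTTP date. The helper below is a small sketch that handles both forms; retryDelayMs is a hypothetical name, and how Flaggr formats the header is not specified in this guide.

```typescript
// Retry-After is either delay-seconds ("30") or an HTTP-date
// ("Wed, 21 Oct 2015 07:28:00 GMT"). Returns milliseconds to wait,
// or null when the header is missing or unparseable.
function retryDelayMs(retryAfter: string | null, now: number = Date.now()): number | null {
  if (retryAfter === null) {
    return null;
  }
  const value = retryAfter.trim();
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000;
  }
  const date = Date.parse(value);
  if (Number.isNaN(date)) {
    return null;
  }
  // A date in the past means the client may retry immediately.
  return Math.max(0, date - now);
}
```

A caller would typically read the header with response.headers.get("Retry-After"), wait for the returned delay, and fall back to its own backoff when the result is null.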

Rate Limits Not Shared Across Instances

Cause: Without Upstash Redis, rate limiting uses in-memory storage that isn't shared across serverless instances.

Solution: Configure Upstash Redis for production.

Environment Variable Issues

"Invalid URL" Errors

Symptoms: Cryptic errors like Upstash Redis client was passed an invalid URL.

Cause: Trailing whitespace or newlines in environment variables, common when copying from dashboards.

Solution: Trim your environment variables. Flaggr trims critical variables internally, but verify:

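# Check for trailing whitespace or newlines.
# The brackets mark where the value starts and ends; cat -vet works on both Linux and macOS
# and prints "$" at each line end, so a "$" before the closing "]" means an embedded newline.
printf '[%s]\n' "$UPSTASH_REDIS_REST_URL" | cat -vet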
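
The same check can run at application startup. This is a minimal sketch with a hypothetical helper, findUntrimmed, that reports which of the named variables carry leading or trailing whitespace; it takes the environment as a plain record so it can be tested without touching the real process environment.

```typescript
// Returns the names whose raw value differs from its trimmed form.
// Unset variables are ignored; they are a different problem.
function findUntrimmed(
  env: Record<string, string | undefined>,
  names: string[],
): string[] {
  return names.filter((name) => {
    const value = env[name];
    return value !== undefined && value !== value.trim();
  });
}

// Typical call in a Node.js app (shown as a comment to keep the sketch runtime-neutral):
// findUntrimmed(process.env, ["UPSTASH_REDIS_REST_URL", "UPSTASH_REDIS_REST_TOKEN"]);
```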

Firebase Private Key Errors

Symptoms: Error: Invalid PEM formatted message or similar.

Cause: The FIREBASE_PRIVATE_KEY must preserve newline characters (\n). Different platforms handle this differently.

Solutions by platform:

| Platform | Format |
|---|---|
| .env.local | Wrap in double quotes: FIREBASE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\nMIIE...\n-----END PRIVATE KEY-----\n" |
| Vercel | Paste the key as-is (Vercel handles multiline values) |
| Docker | Use --env-file or escape newlines in -e |
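
If the key may arrive in either form, a common defensive pattern is to normalize it before handing it to the Firebase Admin SDK. The sketch below converts literal \n sequences (a backslash followed by n) into real newlines and strips wrapping double quotes. normalizePrivateKey is a hypothetical helper; whether Flaggr already does this internally is not stated in this guide.

```typescript
// Converts literal "\n" sequences (backslash + n) into real newlines and
// removes wrapping double quotes that some platforms leave in the value.
function normalizePrivateKey(raw: string): string {
  const unquoted = raw.replace(/^"([\s\S]*)"$/, "$1");
  return unquoted.replace(/\\n/g, "\n");
}
```

The function is idempotent: a key that already contains real newlines passes through unchanged.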

Performance

Slow API Responses (6-24 seconds)

Cause: Serial database queries instead of parallel.

Pattern to avoid:

// SLOW — serial queries
for (const m of memberships) {
  const project = await getProject(m.projectId);  // Blocks on each one
}

Fix: Use parallel queries:

// FAST — parallel queries
const projects = await Promise.all(
  memberships.map(m => getProject(m.projectId))
);

High Latency on First Request

Cause: Cold start in serverless environments. The first request after idle triggers function initialization.

Mitigation:

  • Vercel: Enable "Always On" for production functions
  • Use caching (Vercel KV) to reduce database calls
  • Keep functions warm with periodic health checks

Build Errors

TypeScript Errors After Updating

# Check for type errors
npx tsc --noEmit
 
# If types are stale, clean and rebuild
rm -rf .next
npm run build

Proto Generation Issues

If protobuf types don't match:

cd proto
buf generate

Getting Help