Why feature flags replace the feature branch
The traditional model — develop on a branch, merge when done, deploy when approved — couples code deployment to feature release. A single problematic commit delays the entire release. A rollback reverts every change shipped with it. Feature flags break that coupling. You deploy code continuously but control which users see which behaviour through runtime configuration. The new checkout flow ships to production on Monday but only activates for 1% of users. If error rates spike, you flip a switch. No redeployment. No rollback. Recovery in seconds.
At scale, feature flags enable trunk-based development: every engineer merges to main multiple times a day, and long-running branches disappear entirely. They also separate the concerns of QA, product, and operations — QA tests a feature in production for internal users before it goes public; product managers control a gradual percentage rollout; operations can kill a flag the moment an SLO is breached. Tools like LaunchDarkly and the open standard OpenFeature give engineering teams the infrastructure to do this safely across microservices and polyglot stacks.
Core Concepts — Flag Types, Targeting, and Evaluation
Boolean flag
The simplest flag type: on or off. Used for kill switches, enabling new API endpoints, or toggling entire subsystems. Evaluated at runtime against a targeting rule set. Default state is always off; activation is explicit.
Multivariate flag
Returns one of N string, number, or JSON values instead of a boolean. Used for A/B/n tests (three checkout button colours), configuration delivery (algorithm version, cache TTL, rate limit threshold), and progressive UI experiments where the variation is more than on/off.
Targeting rule
A condition evaluated against the evaluation context (user ID, email, plan, country, request metadata). Rules are evaluated top-down; the first match wins. If no rule matches, the flag falls back to the default variation. Targeting rules enable employee-only betas, company-specific early access, and geo-restricted features.
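Top-down, first-match-wins evaluation is easy to state precisely in code. A minimal sketch (the rule shape here is illustrative, not any vendor's format):

```python
from typing import Any, Callable

# A rule pairs a predicate over the evaluation context with a variation.
Rule = tuple[Callable[[dict[str, Any]], bool], str]

def evaluate(rules: list[Rule], context: dict[str, Any], default: str) -> str:
    """Rules are checked top-down; the first predicate that matches wins."""
    for predicate, variation in rules:
        if predicate(context):
            return variation
    return default  # no rule matched; fall back to the default variation

rules: list[Rule] = [
    (lambda ctx: ctx.get("email", "").endswith("@example.com"), "on"),  # employee beta
    (lambda ctx: ctx.get("country") == "CA", "on"),                     # geo early access
]

evaluate(rules, {"email": "dev@example.com"}, "off")  # → "on" (first rule matches)
evaluate(rules, {"country": "US"}, "off")             # → "off" (default)
```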
Percentage rollout
A targeting rule that assigns users deterministically to a variation bucket using a hash of their user ID and the flag key. The same user always gets the same variation — no flickering between requests. You increase the percentage over time: 1% → 5% → 20% → 50% → 100%.
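Deterministic bucketing can be sketched in a few lines; the exact hash scheme below (SHA-256 of flag key plus user ID) is illustrative, since each SDK uses its own algorithm:

```python
import hashlib

def bucket(user_id: str, flag_key: str) -> int:
    """Deterministically map a user to a bucket in [0, 100) for this flag.
    Hashing the user ID together with the flag key decorrelates rollouts:
    the same user lands in different buckets for different flags."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100

def in_rollout(user_id: str, flag_key: str, percentage: int) -> bool:
    """True if this user is inside the current rollout percentage."""
    return bucket(user_id, flag_key) < percentage
```

Because the bucket is a pure function of user ID and flag key, raising the percentage only ever adds users to the rollout; nobody who already saw the new behaviour is switched back.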
Evaluation context
The structured data passed to the flag evaluation SDK on every call: user ID, anonymous flag, custom attributes (plan, region, beta_enrolled). Rich context enables precise targeting. Anonymous users can be bucketed by session ID or device ID for consistent experiences before login.
LaunchDarkly SDK Integration — Server-Side and Client-Side
LaunchDarkly provides SDKs for over 20 languages and frameworks. Server-side SDKs connect to LaunchDarkly's streaming API, maintain a local in-memory flag cache, and evaluate flags without a network round-trip on every call. The SDK streams flag updates in real time — flag changes propagate to all SDK instances within milliseconds. Client-side SDKs operate differently: they receive a pre-evaluated flag payload for the current user from LaunchDarkly's edge CDN, eliminating server-side latency for frontend applications.
# Python server-side SDK — LaunchDarkly 9.x
# pip install launchdarkly-server-sdk
import ldclient
from ldclient.config import Config
from ldclient import Context
# Initialise once at application startup (not per-request)
ldclient.set_config(Config("sdk-YOUR-SDK-KEY"))
client = ldclient.get()
# Wait for SDK to connect to streaming API and populate local flag cache
if not client.is_initialized():
    raise RuntimeError("LaunchDarkly SDK failed to initialize")

def evaluate_checkout_flag(user_id: str, plan: str, country: str) -> str:
    """
    Returns the checkout flow variation for the given user.
    Evaluation is in-process — no network call, <1ms.
    """
    context = (
        Context.builder(user_id)
        .kind("user")
        .set("plan", plan)
        .set("country", country)
        .build()
    )
    # Boolean flag: is the new checkout enabled for this user?
    new_checkout_enabled: bool = client.variation(
        "new-checkout-flow",  # flag key
        context,
        False,  # default value if flag is unreachable
    )
    if new_checkout_enabled:
        return "new"
    # Multivariate string flag: which payment gateway variant?
    gateway_variant: str = client.variation(
        "payment-gateway-variant",
        context,
        "stripe-v1",  # default: existing Stripe integration
    )
    return gateway_variant

def get_rate_limit(service_id: str, tier: str) -> int:
    """Deliver rate limit configuration via a multivariate number flag."""
    context = (
        Context.builder(service_id)
        .kind("service")
        .set("tier", tier)
        .build()
    )
    # Default 1000 req/min; flag overrides per tier without redeployment
    return client.variation("api-rate-limit-rps", context, 1000)
# Register a listener to log flag change events; the listener receives a
# FlagChange event whose .key names the updated flag
def on_flag_change(change) -> None:
    print(f"[LD] Flag updated: {change.key}")

client.flag_tracker.add_flag_change_listener(on_flag_change)
// TypeScript — LaunchDarkly server-side SDK in an Express service
// npm install @launchdarkly/node-server-sdk
import * as ld from "@launchdarkly/node-server-sdk";
import express from "express";
const app = express();
// Initialise SDK once; the client maintains a persistent streaming connection
const ldClient = ld.init("sdk-YOUR-SDK-KEY");
async function bootstrap() {
  await ldClient.waitForInitialization({ timeout: 5 });
  console.log("LaunchDarkly SDK initialized");
}

bootstrap().catch((err) => {
  console.error("LD SDK failed to initialize:", err);
  process.exit(1);
});

app.get("/api/feature-config", async (req, res) => {
  const userId = req.headers["x-user-id"] as string;
  const plan = (req.headers["x-user-plan"] as string) ?? "free";
  const context: ld.LDContext = {
    kind: "user",
    key: userId,
    plan,
    anonymous: !userId,
  };
  // Boolean: is the new recommendations engine active?
  const recsEnabled = await ldClient.variation(
    "new-recommendations-engine",
    context,
    false
  );
  // JSON flag: deliver a full config object without redeployment
  const searchConfig = await ldClient.variation(
    "search-config",
    context,
    { maxResults: 10, fuzzy: false, boostRecent: false }
  );
  res.json({ recsEnabled, searchConfig });
});

// Graceful shutdown — flush pending analytics events
process.on("SIGTERM", async () => {
  await ldClient.flush();
  ldClient.close();
});

OpenFeature — The Vendor-Neutral Standard
OpenFeature is a CNCF incubating project that defines a standard API for feature flag evaluation, independent of any specific vendor. You write application code against the OpenFeature SDK; a provider adapts the OpenFeature interface to a specific backend — LaunchDarkly, Flagsmith, Unleash, CloudBees, or your own in-house flag service. Switching providers requires changing one line of configuration, not rewriting all your flag evaluation calls.
OpenFeature also defines hooks — middleware that runs before and after flag evaluation — enabling cross-cutting concerns like telemetry, logging, and validation without modifying flag call sites. The standard is gaining rapid adoption as the industry moves toward portability between flag backends.
# Python — OpenFeature with LaunchDarkly provider
# pip install openfeature-sdk openfeature-provider-launchdarkly
from openfeature import api
from openfeature.evaluation_context import EvaluationContext
from openfeature_provider_launchdarkly import LaunchDarklyProvider
# Register the provider once at startup
# Swap this line to change vendors — all evaluation code stays unchanged
api.set_provider(LaunchDarklyProvider(sdk_key="sdk-YOUR-SDK-KEY"))
client = api.get_client()
def is_new_dashboard_enabled(user_id: str, company: str) -> bool:
    ctx = EvaluationContext(
        targeting_key=user_id,
        attributes={"company": company, "beta_enrolled": True},
    )
    # OpenFeature API — identical regardless of which provider is registered
    return client.get_boolean_value(
        flag_key="new-dashboard",
        default_value=False,
        evaluation_context=ctx,
    )

def get_algorithm_version(user_id: str) -> str:
    ctx = EvaluationContext(targeting_key=user_id)
    return client.get_string_value(
        flag_key="ranking-algorithm",
        default_value="v1",
        evaluation_context=ctx,
    )

def get_cache_config(service: str) -> dict:
    ctx = EvaluationContext(targeting_key=service)
    return client.get_object_value(
        flag_key="cache-config",
        default_value={"ttl_seconds": 300, "max_size_mb": 512},
        evaluation_context=ctx,
    )

// TypeScript — OpenFeature with in-process Flagd provider
// Flagd is a CNCF open-source flag daemon you self-host
// npm install @openfeature/server-sdk @openfeature/flagd-provider
import { OpenFeature } from "@openfeature/server-sdk";
import { FlagdProvider } from "@openfeature/flagd-provider";
// Switch between providers without touching evaluation code
OpenFeature.setProvider(
  new FlagdProvider({
    host: "flagd.internal",
    port: 8013,
    tls: false,
  })
);
const featureClient = OpenFeature.getClient();

// Hook: log every flag evaluation for audit trail
OpenFeature.addHooks({
  after(hookContext, evaluationDetails) {
    console.log({
      flag: hookContext.flagKey,
      user: hookContext.context.targetingKey,
      value: evaluationDetails.value,
      reason: evaluationDetails.reason,
    });
  },
  error(hookContext, err) {
    console.error(`Flag evaluation error [${hookContext.flagKey}]:`, err);
  },
});

export async function resolveUserFlags(userId: string, plan: string) {
  const ctx = { targetingKey: userId, plan };
  const [darkModeEnabled, maxUploadMb, exportFormat] = await Promise.all([
    featureClient.getBooleanValue("dark-mode", false, ctx),
    featureClient.getNumberValue("max-upload-size-mb", 25, ctx),
    featureClient.getStringValue("export-format", "csv", ctx),
  ]);
  return { darkModeEnabled, maxUploadMb, exportFormat };
}

Self-Hosted Flags with Flagd — Zero-Dependency Flag Evaluation
Flagd is the CNCF reference implementation of a flag evaluation daemon — a lightweight Go binary that serves the OpenFeature remote evaluation protocol. It reads flag configuration from a JSON or YAML file (which can be stored in a ConfigMap and synced from Git), exposes a gRPC and HTTP API, and supports in-process evaluation with no external network dependency. For teams that cannot send evaluation data to a SaaS vendor, flagd provides a fully self-contained flag infrastructure.
# flagd/flags.json — flag configuration file (stored in Git, mounted as ConfigMap)
# flagd reads this file and hot-reloads on change
{
  "$schema": "https://flagd.dev/schema/v0/flags.json",
  "flags": {
    "new-checkout-flow": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off",
      "targeting": {
        "if": [
          { "in": [{ "var": "company" }, ["acme", "globex", "initech"]] },
          "on",
          { "<=": [{ "var": "percentileHash" }, 10] },
          "on",
          "off"
        ]
      }
    },
    "payment-gateway-variant": {
      "state": "ENABLED",
      "variants": {
        "stripe-v1": "stripe-v1",
        "stripe-v2": "stripe-v2",
        "adyen": "adyen"
      },
      "defaultVariant": "stripe-v1",
      "targeting": {
        "if": [
          { "==": [{ "var": "plan" }, "enterprise"] },
          "adyen",
          { "<=": [{ "var": "percentileHash" }, 25] },
          "stripe-v2",
          "stripe-v1"
        ]
      }
    },
    "maintenance-mode": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off"
    }
  }
}

# kubernetes/flagd-deployment.yaml — run flagd as a sidecar or standalone service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flagd
  namespace: platform
spec:
  replicas: 2
  selector:
    matchLabels: { app: flagd }
  template:
    metadata:
      labels: { app: flagd }
    spec:
      containers:
        - name: flagd
          image: ghcr.io/open-feature/flagd:v0.10.2
          args:
            - start
            - --uri
            - file:/etc/flagd/flags.json
            - --metrics-exporter
            - otel
          ports:
            - { name: grpc, containerPort: 8013 }
            - { name: http, containerPort: 8016 }
            - { name: metrics, containerPort: 8014 }
          volumeMounts:
            - name: flags-config
              mountPath: /etc/flagd
          readinessProbe:
            httpGet: { path: /healthz, port: 8016 }
            initialDelaySeconds: 3
          resources:
            requests: { cpu: "100m", memory: "64Mi" }
            limits: { cpu: "500m", memory: "256Mi" }
      volumes:
        - name: flags-config
          configMap:
            name: flagd-flags
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: flagd-flags
  namespace: platform
data:
  flags.json: |
    { "$schema": "https://flagd.dev/schema/v0/flags.json", "flags": {} }
Safe Rollout Patterns — Canary, Percentage, and Kill Switch
Feature flags enable a layered rollout strategy that is far safer than deploying code to all users simultaneously. The three core patterns are the internal canary (enable for employees first), percentage rollout (expand the ring of users over time while monitoring SLOs), and the kill switch (instant off, available at all times regardless of rollout progress). Used together, they give you continuous deployment with the risk profile of a manual approval gate.
# Python — wrapping a flag evaluation in a service layer with SLO-aware rollout logic
#
# Pattern: flags are evaluated centrally; business logic never calls the LD SDK directly.
# This makes mocking trivial in tests and keeps flag keys in one place.
from dataclasses import dataclass
from openfeature import api
from openfeature.evaluation_context import EvaluationContext
import prometheus_client as prom
# Metrics for SLO monitoring — alert on error_rate > threshold before expanding rollout
FLAG_EVALUATIONS = prom.Counter(
    "feature_flag_evaluations_total", "Flag evaluations", ["flag", "variant", "reason"]
)
FLAG_ERRORS = prom.Counter(
    "feature_flag_errors_total", "Flag evaluation errors", ["flag"]
)

@dataclass(frozen=True)
class UserContext:
    user_id: str
    plan: str
    country: str
    beta_enrolled: bool = False
    internal: bool = False

class FeatureService:
    def __init__(self):
        self._client = api.get_client()

    def _ctx(self, user: UserContext) -> EvaluationContext:
        return EvaluationContext(
            targeting_key=user.user_id,
            attributes={
                "plan": user.plan,
                "country": user.country,
                "beta_enrolled": user.beta_enrolled,
                "internal": user.internal,
            },
        )

    def is_new_checkout_enabled(self, user: UserContext) -> bool:
        try:
            details = self._client.get_boolean_details(
                "new-checkout-flow", False, self._ctx(user)
            )
            FLAG_EVALUATIONS.labels(
                flag="new-checkout-flow",
                variant=str(details.value),
                reason=details.reason or "UNKNOWN",
            ).inc()
            return details.value
        except Exception:
            FLAG_ERRORS.labels(flag="new-checkout-flow").inc()
            return False  # fail closed — return safe default

    def is_maintenance_mode(self) -> bool:
        """Kill switch: evaluated without user context — applies globally."""
        try:
            return self._client.get_boolean_value(
                "maintenance-mode", False, EvaluationContext(targeting_key="system")
            )
        except Exception:
            return False

# Rollout strategy implementation: orchestrated via CI/CD + LaunchDarkly API
# Run this script from your deployment pipeline to advance the rollout ring
import os
import httpx
LD_API_KEY = os.environ["LD_API_KEY"]
PROJECT_KEY = "my-project"
ENVIRONMENT = "production"
FLAG_KEY = "new-checkout-flow"
BASE_URL = f"https://app.launchdarkly.com/api/v2/flags/{PROJECT_KEY}/{FLAG_KEY}"
HEADERS = {"Authorization": LD_API_KEY, "Content-Type": "application/json"}
def set_percentage_rollout(percentage: int) -> None:
    """
    Advance the percentage rollout for new-checkout-flow.
    Called from CI/CD after each healthy canary window.
    """
    patch = [
        {
            "op": "replace",
            "path": f"/environments/{ENVIRONMENT}/rules/0/rollout",
            "value": {
                # LaunchDarkly weights are in thousandths of a percent (sum = 100000)
                "variations": [
                    {"variation": 0, "weight": percentage * 1000},  # true
                    {"variation": 1, "weight": (100 - percentage) * 1000},  # false
                ],
                "bucketBy": "key",  # deterministic by user ID
            },
        }
    ]
    resp = httpx.patch(BASE_URL, json=patch, headers=HEADERS)
    resp.raise_for_status()
    print(f"Rollout advanced to {percentage}%")

def kill_switch_off() -> None:
    """Immediately disable the flag for all users — used by incident response."""
    patch = [
        {
            "op": "replace",
            "path": f"/environments/{ENVIRONMENT}/on",
            "value": False,
        }
    ]
    resp = httpx.patch(BASE_URL, json=patch, headers=HEADERS)
    resp.raise_for_status()
    print("Kill switch activated — flag is OFF for all users")

# Typical rollout progression:
# set_percentage_rollout(1)    # Day 0: 1% canary — monitor for 1h
# set_percentage_rollout(5)    # Day 1: 5% — monitor for 4h
# set_percentage_rollout(25)   # Day 2: 25%
# set_percentage_rollout(50)   # Day 3: 50%
# set_percentage_rollout(100)  # Day 5: full rollout

Testing Strategies — Mocking Flags in Unit and Integration Tests
The cardinal rule of testing with feature flags is: never hit a real flag backend in unit tests. Tests that depend on LaunchDarkly's streaming API are slow, flaky, and consume quota. Instead, inject a mock provider or override flag values in the test environment. Both LaunchDarkly and OpenFeature have first-class support for this pattern.
# Python — testing with OpenFeature's in-memory provider
# pip install openfeature-sdk
import pytest
from openfeature import api
from openfeature.provider.in_memory_provider import InMemoryProvider, InMemoryFlag
from openfeature.evaluation_context import EvaluationContext
from myapp.features import FeatureService
from myapp.models import UserContext
@pytest.fixture(autouse=True)
def patch_feature_flags():
    """Replace the real LaunchDarkly provider with an in-memory stub for all tests."""
    provider = InMemoryProvider(
        flags={
            "new-checkout-flow": InMemoryFlag(
                default_variant="off",
                variants={"on": True, "off": False},
                enabled=True,
            ),
            "payment-gateway-variant": InMemoryFlag(
                default_variant="stripe-v1",
                variants={
                    "stripe-v1": "stripe-v1",
                    "stripe-v2": "stripe-v2",
                    "adyen": "adyen",
                },
                enabled=True,
            ),
            "maintenance-mode": InMemoryFlag(
                default_variant="off",
                variants={"on": True, "off": False},
                enabled=True,
            ),
        }
    )
    api.set_provider(provider)
    yield provider

def test_new_checkout_disabled_by_default():
    svc = FeatureService()
    user = UserContext(user_id="u1", plan="free", country="US")
    assert svc.is_new_checkout_enabled(user) is False

def test_new_checkout_enabled_for_beta_user(patch_feature_flags):
    # Override a single flag to the "on" variant for this specific test
    patch_feature_flags.flags["new-checkout-flow"].default_variant = "on"
    svc = FeatureService()
    user = UserContext(user_id="u2", plan="pro", country="GB", beta_enrolled=True)
    assert svc.is_new_checkout_enabled(user) is True

def test_maintenance_mode_kill_switch(patch_feature_flags):
    patch_feature_flags.flags["maintenance-mode"].default_variant = "on"
    svc = FeatureService()
    assert svc.is_maintenance_mode() is True

def test_gateway_variant_default():
    svc = FeatureService()
    ctx = EvaluationContext(targeting_key="u3", attributes={"plan": "free"})
    result = svc._client.get_string_value("payment-gateway-variant", "stripe-v1", ctx)
    assert result == "stripe-v1"
Keep every flag key in one central module (e.g. flags.py or feature-flags.ts) as string constants — never scatter them as string literals across the codebase. When a flag is removed, a grep for the constant finds every call site. String literals are invisible to static analysis and create silent bugs when a flag key is renamed in the dashboard but not in the code.
Flag Lifecycle — Permanent Flags vs Temporary Experiment Flags
Feature flag debt is a real maintenance burden. Flags accumulate in codebases and dashboards, most of them long-since fully rolled out and never cleaned up. The code still evaluates them on every request, the tests still mock them, and new engineers have no idea which flags are still meaningful. The discipline of flag lifecycle management requires treating temporary flags as technical debt with a defined expiry.
Release flags (temporary)
Enable or disable a new feature during rollout. Target lifespan: days to weeks. Once the rollout reaches 100% and the feature is stable, remove the flag from the codebase and the flag dashboard. Leaving a release flag in place after full rollout adds dead evaluation code and dashboard noise.
Experiment flags (temporary)
A/B or multivariate test for a specific hypothesis with a defined end date. Once the winning variant is determined, ship it as the permanent default and remove the flag. Experiment flags should always have a scheduled end date set in the flag dashboard.
Ops/kill-switch flags (permanent)
Emergency circuit breakers for expensive third-party integrations, background jobs, or risky data migrations. These live permanently in the codebase and the dashboard. They are not removed after a rollout — they are there for the next incident. Document them clearly and test them regularly.
Config flags (permanent)
Deliver runtime configuration without redeployment: rate limits, cache TTLs, algorithm parameters, ML model versions. These replace environment variables for values that need to change without a deploy. They are permanent infrastructure, not temporary toggles.
# Tooling to surface stale flags — integrate into your weekly tech-debt rotation
# Queries LaunchDarkly API for flags not modified in 30+ days that are fully rolled out
import os
import httpx
from datetime import datetime, timezone, timedelta
LD_API_KEY = os.environ["LD_API_KEY"]
PROJECT_KEY = "my-project"
ENVIRONMENT = "production"
STALE_DAYS = 30
resp = httpx.get(
    f"https://app.launchdarkly.com/api/v2/flags/{PROJECT_KEY}",
    headers={"Authorization": LD_API_KEY},
    params={"limit": 200, "env": ENVIRONMENT},
)
resp.raise_for_status()
flags = resp.json()["items"]

now = datetime.now(timezone.utc)
stale_threshold = now - timedelta(days=STALE_DAYS)
stale_flags = []

for flag in flags:
    env_data = flag.get("environments", {}).get(ENVIRONMENT, {})
    last_modified_ms = flag.get("_updatedDate", 0)
    last_modified = datetime.fromtimestamp(last_modified_ms / 1000, tz=timezone.utc)
    # A flag is stale if: fully on (100%), not modified in STALE_DAYS, and temporary type
    is_fully_on = env_data.get("on", False) and not env_data.get("rules")
    is_old = last_modified < stale_threshold
    is_temporary = flag.get("temporary", True)
    if is_fully_on and is_old and is_temporary:
        stale_flags.append({
            "key": flag["key"],
            "name": flag["name"],
            "last_modified": last_modified.date().isoformat(),
            "days_stale": (now - last_modified).days,
        })

if stale_flags:
    print(f"Found {len(stale_flags)} stale flags to clean up:")
    for f in sorted(stale_flags, key=lambda x: -x["days_stale"]):
        print(f"  [{f['days_stale']}d] {f['key']} — {f['name']}")

Performance Considerations — Evaluation Latency and SDK Caching
Server-side SDK evaluation is designed to be sub-millisecond — the SDK maintains an in-memory cache populated by a persistent streaming connection to the flag backend. The cost is the streaming connection (one per SDK instance) and memory (proportional to the number of flags and targeting rules). In practice, a typical LaunchDarkly account with 100 flags uses less than 10 MB of heap for the flag cache.
The three places where flag evaluation performance breaks down are: (1) SDK initialization latency — the first request arrives before the SDK has connected to the streaming API; (2) context serialization overhead — building a rich evaluation context on every hot path; and (3) flag count — evaluating 50 flags per request multiplies the context-building cost.
# Go — production-grade LaunchDarkly integration with connection pooling and fast path
# go get github.com/launchdarkly/go-server-sdk/v7
package features
import (
    "log"
    "os"
    "time"

    ld "github.com/launchdarkly/go-server-sdk/v7"
    "github.com/launchdarkly/go-server-sdk/v7/ldcomponents"

    "github.com/launchdarkly/go-sdk-common/v3/ldcontext"
)

var client *ld.LDClient

func Init() {
    config := ld.Config{
        // Stream flag updates from LaunchDarkly
        DataSource: ldcomponents.StreamingDataSource(),
        // Flush analytics events every 5s, not on every evaluation
        Events: ldcomponents.SendEvents().FlushInterval(5 * time.Second),
    }
    var err error
    client, err = ld.MakeCustomClient(os.Getenv("LD_SDK_KEY"), config, 5*time.Second)
    if err != nil {
        log.Fatalf("LD client failed to initialize: %v", err)
    }
    log.Println("LaunchDarkly SDK initialized")
}

// EvalContext builds an LDContext for the given request — called once per request,
// then reused for all flag evaluations in that request's handler.
func EvalContext(userID, plan, country string) ldcontext.Context {
    return ldcontext.NewBuilder(userID).
        SetString("plan", plan).
        SetString("country", country).
        Build()
}

func IsNewSearchEnabled(ctx ldcontext.Context) bool {
    val, err := client.BoolVariation("new-search-engine", ctx, false)
    if err != nil {
        return false // fail closed
    }
    return val
}

func GetRankingAlgorithm(ctx ldcontext.Context) string {
    val, err := client.StringVariation("ranking-algorithm", ctx, "bm25")
    if err != nil {
        return "bm25"
    }
    return val
}

func IsMaintenanceMode() bool {
    // Kill switch: no user context required, evaluated against system context
    sysCtx := ldcontext.New("system")
    val, _ := client.BoolVariation("maintenance-mode", sysCtx, false)
    return val
}

Feature Flag Production Checklist
SDK is initialised once at startup, not per-request
Initialising the LaunchDarkly or OpenFeature SDK on every HTTP request creates a new streaming connection per request — this will exhaust file descriptors, saturate the flag backend's connection limits, and add 200–500ms latency to every request. Initialise the SDK once during application bootstrap and inject it as a singleton or dependency.
All evaluations have a safe default value
Flag evaluation can fail if the SDK is not yet initialised, the streaming connection is interrupted, or the flag key is misspelled. Every client.variation() call must specify a default value that represents the safe, pre-feature-flag behaviour — typically the existing code path. Never use null as the default for a boolean flag, or an empty string as the default for a string flag.
Flag keys are centralised string constants
Scatter flag key strings across the codebase and you will eventually have a typo that silently evaluates to the default value with no error. Keep all flag keys in a single constants file and use those constants everywhere. Static analysis will then catch references to removed flags; a mistyped string literal at a call site is caught by nothing.
Temporary flags have a defined removal date in the ticket system
Create a cleanup ticket when you create a release or experiment flag — not after rollout. Schedule it for two weeks after the expected full rollout date. Without a ticket, flags accumulate. Flag debt compounds: each stale flag adds evaluation overhead, test mock complexity, and cognitive load for new engineers.
Kill switches are tested in production regularly
A kill switch that has never been flipped in production is a kill switch that may not work when you need it. Schedule a quarterly chaos exercise: flip each kill switch in production during low-traffic hours and verify that the system degrades gracefully. This also validates that the SDK streaming connection is healthy.
Flag evaluation is instrumented with metrics and traces
Record flag evaluation counts by flag key, variant, and reason (targeting rule, percentage rollout, default). Alert on sudden changes in variant distribution — a flag that was 10% on and jumps to 100% without a deliberate rollout change indicates misconfiguration. Include flag key and variant in distributed trace attributes for correlation during incident response.
Percentage rollout uses user ID bucketing, not session ID
Bucketing by session ID means the same user can get different flag variants across sessions — they see the new checkout flow, then the old one, then the new one. This produces inconsistent experiences and corrupts A/B test results. Always bucket by stable user ID. For anonymous users, use a stable device or anonymous ID persisted in a cookie or local storage.
CI/CD pipeline has a flag-gated deployment step
Use the flag evaluation API in your deployment pipeline: after a canary deploy, query flag metrics and error rate from your monitoring stack; if the error rate for the flagged variation exceeds the threshold, call the kill switch API automatically and page the on-call engineer. Do not rely on manual observation during rollouts.
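The gate can be sketched as a small pipeline step. Everything here is illustrative wiring: fetch_error_rate stands in for a monitoring query (e.g. Prometheus), and the advance/kill callbacks stand in for flag-API calls such as the set_percentage_rollout and kill_switch_off helpers shown earlier:

```python
import time
from typing import Callable

STAGES = [1, 5, 25, 50, 100]  # rollout rings, in percent
ERROR_BUDGET = 0.01           # abort if the flagged variant's error rate exceeds 1%

def gated_rollout(
    fetch_error_rate: Callable[[], float],  # monitoring query for the new variant
    advance: Callable[[int], None],         # sets the rollout percentage
    kill: Callable[[], None],               # flips the kill switch
    soak_seconds: int = 3600,
) -> bool:
    """Advance through the rollout rings; kill the flag on the first SLO breach."""
    for pct in STAGES:
        advance(pct)
        time.sleep(soak_seconds)  # let the ring soak before judging it
        if fetch_error_rate() > ERROR_BUDGET:
            kill()
            return False  # breached; the usual alerting pages on-call
    return True  # fully rolled out
```

Run it with soak windows matched to your traffic patterns; a one-hour soak at 1% may see too few requests to judge an SLO on a low-traffic service.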
Work with us
Rolling out features gradually or implementing feature flags across a microservices stack?
We design and implement feature flag infrastructure — from LaunchDarkly and OpenFeature SDK integration and targeting rule design to flagd self-hosted deployments, CI/CD-integrated rollout automation, kill-switch monitoring, and flag lifecycle governance. Let’s talk.
Get in touch