Why feature flags replace the feature branch
The traditional model — develop on a branch, merge when done, deploy when approved — couples code deployment to feature release. A single problematic commit delays the entire release. A rollback reverts every change shipped with it. Feature flags break that coupling. You deploy code continuously but control which users see which behaviour through runtime configuration. The new checkout flow ships to production on Monday but only activates for 1% of users. If error rates spike, you flip a switch. No redeployment. No rollback. Recovery in seconds.
At scale, feature flags enable trunk-based development: every engineer merges to main multiple times a day, and long-running branches disappear entirely. They also separate the concerns of QA, product, and operations — QA tests a feature in production for internal users before it goes public; product managers control a gradual percentage rollout; operations can kill a flag the moment an SLO is breached. Tools like LaunchDarkly and the open standard OpenFeature give engineering teams the infrastructure to do this safely across microservices and polyglot stacks.
Core Concepts — Flag Types, Targeting, and Evaluation
Boolean flag
The simplest flag type: on or off. Used for kill switches, enabling new API endpoints, or toggling entire subsystems. Evaluated at runtime against a targeting rule set. Default state is always off; activation is explicit.
Multivariate flag
Returns one of N string, number, or JSON values instead of a boolean. Used for A/B/n tests (three checkout button colours), configuration delivery (algorithm version, cache TTL, rate limit threshold), and progressive UI experiments where the variation is more than on/off.
Targeting rule
A condition evaluated against the evaluation context (user ID, email, plan, country, request metadata). Rules are evaluated top-down; the first match wins. If no rule matches, the flag falls back to the default variation. Targeting rules enable employee-only betas, company-specific early access, and geo-restricted features.
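Top-down, first-match-wins evaluation is easy to state precisely in code. A minimal sketch (the rule shape here is illustrative, not any vendor's format):

```python
from typing import Any, Callable

# A rule pairs a predicate over the evaluation context with a variation.
Rule = tuple[Callable[[dict[str, Any]], bool], str]

def evaluate(rules: list[Rule], context: dict[str, Any], default: str) -> str:
    """Rules are checked top-down; the first predicate that matches wins."""
    for predicate, variation in rules:
        if predicate(context):
            return variation
    return default  # no rule matched; fall back to the default variation

rules: list[Rule] = [
    (lambda ctx: ctx.get("email", "").endswith("@example.com"), "on"),  # employee beta
    (lambda ctx: ctx.get("country") == "CA", "on"),                     # geo early access
]

evaluate(rules, {"email": "dev@example.com"}, "off")  # → "on" (first rule matches)
evaluate(rules, {"country": "US"}, "off")             # → "off" (default)
```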
Percentage rollout
A targeting rule that assigns users deterministically to a variation bucket using a hash of their user ID and the flag key. The same user always gets the same variation — no flickering between requests. You increase the percentage over time: 1% → 5% → 20% → 50% → 100%.
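Deterministic bucketing can be sketched in a few lines; the exact hash scheme below (SHA-256 of flag key plus user ID) is illustrative, since each SDK uses its own algorithm:

```python
import hashlib

def bucket(user_id: str, flag_key: str) -> int:
    """Deterministically map a user to a bucket in [0, 100) for this flag.
    Hashing the user ID together with the flag key decorrelates rollouts:
    the same user lands in different buckets for different flags."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest[:8], 16) % 100

def in_rollout(user_id: str, flag_key: str, percentage: int) -> bool:
    """True if this user is inside the current rollout percentage."""
    return bucket(user_id, flag_key) < percentage
```

Because the bucket is a pure function of user ID and flag key, raising the percentage only ever adds users to the rollout; nobody who already saw the new behaviour is switched back.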
Evaluation context
The structured data passed to the flag evaluation SDK on every call: user ID, anonymous flag, custom attributes (plan, region, beta_enrolled). Rich context enables precise targeting. Anonymous users can be bucketed by session ID or device ID for consistent experiences before login.
LaunchDarkly SDK Integration — Server-Side and Client-Side
LaunchDarkly provides SDKs for over 20 languages and frameworks. Server-side SDKs connect to LaunchDarkly's streaming API, maintain a local in-memory flag cache, and evaluate flags without a network round-trip on every call. The SDK streams flag updates in real time — flag changes propagate to all SDK instances within milliseconds. Client-side SDKs operate differently: they receive a pre-evaluated flag payload for the current user from LaunchDarkly's edge CDN, eliminating server-side latency for frontend applications.
# Python server-side SDK — LaunchDarkly 9.x
# pip install launchdarkly-server-sdk
import ldclient
from ldclient.config import Config
from ldclient import Context
# Initialise once at application startup (not per-request)
ldclient.set_config(Config("sdk-YOUR-SDK-KEY"))
client = ldclient.get()
# Wait for SDK to connect to streaming API and populate local flag cache
if not client.is_initialized():
    raise RuntimeError("LaunchDarkly SDK failed to initialize")

def evaluate_checkout_flag(user_id: str, plan: str, country: str) -> str:
    """
    Returns the checkout flow variation for the given user.
    Evaluation is in-process — no network call, <1ms.
    """
    context = (
        Context.builder(user_id)
        .kind("user")
        .set("plan", plan)
        .set("country", country)
        .build()
    )
    # Boolean flag: is the new checkout enabled for this user?
    new_checkout_enabled: bool = client.variation(
        "new-checkout-flow",  # flag key
        context,
        False,  # default value if flag is unreachable
    )
    if new_checkout_enabled:
        return "new"
    # Multivariate string flag: which payment gateway variant?
    gateway_variant: str = client.variation(
        "payment-gateway-variant",
        context,
        "stripe-v1",  # default: existing Stripe integration
    )
    return gateway_variant

def get_rate_limit(service_id: str, tier: str) -> int:
    """Deliver rate limit configuration via a multivariate number flag."""
    context = (
        Context.builder(service_id)
        .kind("service")
        .set("tier", tier)
        .build()
    )
    # Default 1000 req/min; flag overrides per tier without redeployment
    return client.variation("api-rate-limit-rps", context, 1000)
# Register a listener to log flag change events; the listener receives a
# FlagChange event whose .key names the updated flag
def on_flag_change(change) -> None:
    print(f"[LD] Flag updated: {change.key}")

client.flag_tracker.add_flag_change_listener(on_flag_change)
// TypeScript — LaunchDarkly server-side SDK in an Express service
// npm install @launchdarkly/node-server-sdk
import * as ld from "@launchdarkly/node-server-sdk";
import express from "express";
const app = express();
// Initialise SDK once; the client maintains a persistent streaming connection
const ldClient = ld.init("sdk-YOUR-SDK-KEY");
async function bootstrap() {
  await ldClient.waitForInitialization({ timeout: 5 });
  console.log("LaunchDarkly SDK initialized");
}

bootstrap().catch((err) => {
  console.error("LD SDK failed to initialize:", err);
  process.exit(1);
});

app.get("/api/feature-config", async (req, res) => {
  const userId = req.headers["x-user-id"] as string;
  const plan = (req.headers["x-user-plan"] as string) ?? "free";
  const context: ld.LDContext = {
    kind: "user",
    key: userId,
    plan,
    anonymous: !userId,
  };
  // Boolean: is the new recommendations engine active?
  const recsEnabled = await ldClient.variation(
    "new-recommendations-engine",
    context,
    false
  );
  // JSON flag: deliver a full config object without redeployment
  const searchConfig = await ldClient.variation(
    "search-config",
    context,
    { maxResults: 10, fuzzy: false, boostRecent: false }
  );
  res.json({ recsEnabled, searchConfig });
});

// Graceful shutdown — flush pending analytics events
process.on("SIGTERM", async () => {
  await ldClient.flush();
  ldClient.close();
});

OpenFeature — The Vendor-Neutral Standard
OpenFeature is a CNCF incubating project that defines a standard API for feature flag evaluation, independent of any specific vendor. You write application code against the OpenFeature SDK; a provider adapts the OpenFeature interface to a specific backend — LaunchDarkly, Flagsmith, Unleash, CloudBees, or your own in-house flag service. Switching providers requires changing one line of configuration, not rewriting all your flag evaluation calls.
OpenFeature also defines hooks — middleware that runs before and after flag evaluation — enabling cross-cutting concerns like telemetry, logging, and validation without modifying flag call sites. The standard is gaining rapid adoption as the industry moves toward portability between flag backends.
# Python — OpenFeature with LaunchDarkly provider
# pip install openfeature-sdk openfeature-provider-launchdarkly
from openfeature import api
from openfeature.evaluation_context import EvaluationContext
from openfeature_provider_launchdarkly import LaunchDarklyProvider
# Register the provider once at startup
# Swap this line to change vendors — all evaluation code stays unchanged
api.set_provider(LaunchDarklyProvider(sdk_key="sdk-YOUR-SDK-KEY"))
client = api.get_client()
def is_new_dashboard_enabled(user_id: str, company: str) -> bool:
    ctx = EvaluationContext(
        targeting_key=user_id,
        attributes={"company": company, "beta_enrolled": True},
    )
    # OpenFeature API — identical regardless of which provider is registered
    return client.get_boolean_value(
        flag_key="new-dashboard",
        default_value=False,
        evaluation_context=ctx,
    )

def get_algorithm_version(user_id: str) -> str:
    ctx = EvaluationContext(targeting_key=user_id)
    return client.get_string_value(
        flag_key="ranking-algorithm",
        default_value="v1",
        evaluation_context=ctx,
    )

def get_cache_config(service: str) -> dict:
    ctx = EvaluationContext(targeting_key=service)
    return client.get_object_value(
        flag_key="cache-config",
        default_value={"ttl_seconds": 300, "max_size_mb": 512},
        evaluation_context=ctx,
    )

// TypeScript — OpenFeature with in-process Flagd provider
// Flagd is a CNCF open-source flag daemon you self-host
// npm install @openfeature/server-sdk @openfeature/flagd-provider
import { OpenFeature } from "@openfeature/server-sdk";
import { FlagdProvider } from "@openfeature/flagd-provider";
// Switch between providers without touching evaluation code
OpenFeature.setProvider(
  new FlagdProvider({
    host: "flagd.internal",
    port: 8013,
    tls: false,
  })
);
const featureClient = OpenFeature.getClient();

// Hook: log every flag evaluation for audit trail
OpenFeature.addHooks({
  after(hookContext, evaluationDetails) {
    console.log({
      flag: hookContext.flagKey,
      user: hookContext.context.targetingKey,
      value: evaluationDetails.value,
      reason: evaluationDetails.reason,
    });
  },
  error(hookContext, err) {
    console.error(`Flag evaluation error [${hookContext.flagKey}]:`, err);
  },
});

export async function resolveUserFlags(userId: string, plan: string) {
  const ctx = { targetingKey: userId, plan };
  const [darkModeEnabled, maxUploadMb, exportFormat] = await Promise.all([
    featureClient.getBooleanValue("dark-mode", false, ctx),
    featureClient.getNumberValue("max-upload-size-mb", 25, ctx),
    featureClient.getStringValue("export-format", "csv", ctx),
  ]);
  return { darkModeEnabled, maxUploadMb, exportFormat };
}

Self-Hosted Flags with Flagd — Zero-Dependency Flag Evaluation
Flagd is the CNCF reference implementation of a flag evaluation daemon — a lightweight Go binary that serves the OpenFeature remote evaluation protocol. It reads flag configuration from a JSON or YAML file (which can be stored in a ConfigMap and synced from Git), exposes a gRPC and HTTP API, and supports in-process evaluation with no external network dependency. For teams that cannot send evaluation data to a SaaS vendor, flagd provides a fully self-contained flag infrastructure.
# flagd/flags.json — flag configuration file (stored in Git, mounted as ConfigMap)
# flagd reads this file and hot-reloads on change
{
  "$schema": "https://flagd.dev/schema/v0/flags.json",
  "flags": {
    "new-checkout-flow": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off",
      "targeting": {
        "if": [
          { "in": [{ "var": "company" }, ["acme", "globex", "initech"]] },
          "on",
          { "<=": [{ "var": "percentileHash" }, 10] },
          "on",
          "off"
        ]
      }
    },
    "payment-gateway-variant": {
      "state": "ENABLED",
      "variants": {
        "stripe-v1": "stripe-v1",
        "stripe-v2": "stripe-v2",
        "adyen": "adyen"
      },
      "defaultVariant": "stripe-v1",
      "targeting": {
        "if": [
          { "==": [{ "var": "plan" }, "enterprise"] },
          "adyen",
          { "<=": [{ "var": "percentileHash" }, 25] },
          "stripe-v2",
          "stripe-v1"
        ]
      }
    },
    "maintenance-mode": {
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off"
    }
  }
}

# kubernetes/flagd-deployment.yaml — run flagd as a sidecar or standalone service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flagd
  namespace: platform
spec:
  replicas: 2
  selector:
    matchLabels: { app: flagd }
  template:
    metadata:
      labels: { app: flagd }
    spec:
      containers:
        - name: flagd
          image: ghcr.io/open-feature/flagd:v0.10.2
          args:
            - start
            - --uri
            - file:/etc/flagd/flags.json
            - --metrics-exporter
            - otel
          ports:
            - { name: grpc, containerPort: 8013 }
            - { name: http, containerPort: 8016 }
            - { name: metrics, containerPort: 8014 }
          volumeMounts:
            - name: flags-config
              mountPath: /etc/flagd
          readinessProbe:
            httpGet: { path: /healthz, port: 8016 }
            initialDelaySeconds: 3
          resources:
            requests: { cpu: "100m", memory: "64Mi" }
            limits: { cpu: "500m", memory: "256Mi" }
      volumes:
        - name: flags-config
          configMap:
            name: flagd-flags
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: flagd-flags
  namespace: platform
data:
  flags.json: |
    { "$schema": "https://flagd.dev/schema/v0/flags.json", "flags": {} }
Safe Rollout Patterns — Canary, Percentage, and Kill Switch
Feature flags enable a layered rollout strategy that is far safer than deploying code to all users simultaneously. The three core patterns are the internal canary (enable for employees first), percentage rollout (expand the ring of users over time while monitoring SLOs), and the kill switch (instant off, available at all times regardless of rollout progress). Used together, they give you continuous deployment with the risk profile of a manual approval gate.
# Python — wrapping a flag evaluation in a service layer with SLO-aware rollout logic
#
# Pattern: flags are evaluated centrally; business logic never calls the LD SDK directly.
# This makes mocking trivial in tests and keeps flag keys in one place.
from dataclasses import dataclass
from openfeature import api
from openfeature.evaluation_context import EvaluationContext
import prometheus_client as prom
# Metrics for SLO monitoring — alert on error_rate > threshold before expanding rollout
FLAG_EVALUATIONS = prom.Counter(
    "feature_flag_evaluations_total", "Flag evaluations", ["flag", "variant", "reason"]
)
FLAG_ERRORS = prom.Counter(
    "feature_flag_errors_total", "Flag evaluation errors", ["flag"]
)

@dataclass(frozen=True)
class UserContext:
    user_id: str
    plan: str
    country: str
    beta_enrolled: bool = False
    internal: bool = False

class FeatureService:
    def __init__(self):
        self._client = api.get_client()

    def _ctx(self, user: UserContext) -> EvaluationContext:
        return EvaluationContext(
            targeting_key=user.user_id,
            attributes={
                "plan": user.plan,
                "country": user.country,
                "beta_enrolled": user.beta_enrolled,
                "internal": user.internal,
            },
        )

    def is_new_checkout_enabled(self, user: UserContext) -> bool:
        try:
            details = self._client.get_boolean_details(
                "new-checkout-flow", False, self._ctx(user)
            )
            FLAG_EVALUATIONS.labels(
                flag="new-checkout-flow",
                variant=str(details.value),
                reason=details.reason or "UNKNOWN",
            ).inc()
            return details.value
        except Exception:
            FLAG_ERRORS.labels(flag="new-checkout-flow").inc()
            return False  # fail closed — return safe default

    def is_maintenance_mode(self) -> bool:
        """Kill switch: evaluated without user context — applies globally."""
        try:
            return self._client.get_boolean_value(
                "maintenance-mode", False, EvaluationContext(targeting_key="system")
            )
        except Exception:
            return False

# Rollout strategy implementation: orchestrated via CI/CD + LaunchDarkly API
# Run this script from your deployment pipeline to advance the rollout ring
import os
import httpx
LD_API_KEY = os.environ["LD_API_KEY"]
PROJECT_KEY = "my-project"
ENVIRONMENT = "production"
FLAG_KEY = "new-checkout-flow"
BASE_URL = f"https://app.launchdarkly.com/api/v2/flags/{PROJECT_KEY}/{FLAG_KEY}"
HEADERS = {"Authorization": LD_API_KEY, "Content-Type": "application/json"}
def set_percentage_rollout(percentage: int) -> None:
    """
    Advance the percentage rollout for new-checkout-flow.
    Called from CI/CD after each healthy canary window.
    """
    patch = [
        {
            "op": "replace",
            "path": f"/environments/{ENVIRONMENT}/rules/0/rollout",
            "value": {
                # LaunchDarkly weights are in thousandths of a percent (sum = 100000)
                "variations": [
                    {"variation": 0, "weight": percentage * 1000},  # true
                    {"variation": 1, "weight": (100 - percentage) * 1000},  # false
                ],
                "bucketBy": "key",  # deterministic by user ID
            },
        }
    ]
    resp = httpx.patch(BASE_URL, json=patch, headers=HEADERS)
    resp.raise_for_status()
    print(f"Rollout advanced to {percentage}%")

def kill_switch_off() -> None:
    """Immediately disable the flag for all users — used by incident response."""
    patch = [
        {
            "op": "replace",
            "path": f"/environments/{ENVIRONMENT}/on",
            "value": False,
        }
    ]
    resp = httpx.patch(BASE_URL, json=patch, headers=HEADERS)
    resp.raise_for_status()
    print("Kill switch activated — flag is OFF for all users")

# Typical rollout progression:
# set_percentage_rollout(1)    # Day 0: 1% canary — monitor for 1h
# set_percentage_rollout(5)    # Day 1: 5% — monitor for 4h
# set_percentage_rollout(25)   # Day 2: 25%
# set_percentage_rollout(50)   # Day 3: 50%
# set_percentage_rollout(100)  # Day 5: full rollout

Testing Strategies — Mocking Flags in Unit and Integration Tests
The cardinal rule of testing with feature flags is: never hit a real flag backend in unit tests. Tests that depend on LaunchDarkly's streaming API are slow, flaky, and consume quota. Instead, inject a mock provider or override flag values in the test environment. Both LaunchDarkly and OpenFeature have first-class support for this pattern.
# Python — testing with OpenFeature's in-memory provider
# pip install openfeature-sdk
import pytest
from openfeature import api
from openfeature.provider.in_memory_provider import InMemoryProvider, InMemoryFlag
from openfeature.evaluation_context import EvaluationContext
from myapp.features import FeatureService
from myapp.models import UserContext
@pytest.fixture(autouse=True)
def patch_feature_flags():
    """Replace the real LaunchDarkly provider with an in-memory stub for all tests."""
    provider = InMemoryProvider(
        flags={
            "new-checkout-flow": InMemoryFlag(
                default_variant="off",
                variants={"on": True, "off": False},
                enabled=True,
            ),
            "payment-gateway-variant": InMemoryFlag(
                default_variant="stripe-v1",
                variants={
                    "stripe-v1": "stripe-v1",
                    "stripe-v2": "stripe-v2",
                    "adyen": "adyen",
                },
                enabled=True,
            ),
            "maintenance-mode": InMemoryFlag(
                default_variant="off",
                variants={"on": True, "off": False},
                enabled=True,
            ),
        }
    )
    api.set_provider(provider)
    yield provider

def test_new_checkout_disabled_by_default():
    svc = FeatureService()
    user = UserContext(user_id="u1", plan="free", country="US")
    assert svc.is_new_checkout_enabled(user) is False

def test_new_checkout_enabled_for_beta_user(patch_feature_flags):
    # Override a single flag to the "on" variant for this specific test
    patch_feature_flags.flags["new-checkout-flow"].default_variant = "on"
    svc = FeatureService()
    user = UserContext(user_id="u2", plan="pro", country="GB", beta_enrolled=True)
    assert svc.is_new_checkout_enabled(user) is True

def test_maintenance_mode_kill_switch(patch_feature_flags):
    patch_feature_flags.flags["maintenance-mode"].default_variant = "on"
    svc = FeatureService()
    assert svc.is_maintenance_mode() is True

def test_gateway_variant_default():
    svc = FeatureService()
    ctx = EvaluationContext(targeting_key="u3", attributes={"plan": "free"})
    result = svc._client.get_string_value("payment-gateway-variant", "stripe-v1", ctx)
    assert result == "stripe-v1"
Keep every flag key in one central module (e.g. flags.py or feature-flags.ts) as string constants — never scatter them as string literals across the codebase. When a flag is removed, a grep for the constant finds every call site. String literals are invisible to static analysis and create silent bugs when a flag key is renamed in the dashboard but not in the code.
Flag Lifecycle — Permanent Flags vs Temporary Experiment Flags
Feature flag debt is a real maintenance burden. Flags accumulate in codebases and dashboards, most of them long-since fully rolled out and never cleaned up. The code still evaluates them on every request, the tests still mock them, and new engineers have no idea which flags are still meaningful. The discipline of flag lifecycle management requires treating temporary flags as technical debt with a defined expiry.
Release flags (temporary)
Enable or disable a new feature during rollout. Target lifespan: days to weeks. Once the rollout reaches 100% and the feature is stable, remove the flag from the codebase and the flag dashboard. Leaving a release flag in place after full rollout adds dead evaluation code and dashboard noise.
Experiment flags (temporary)
A/B or multivariate test for a specific hypothesis with a defined end date. Once the winning variant is determined, ship it as the permanent default and remove the flag. Experiment flags should always have a scheduled end date set in the flag dashboard.
Ops/kill-switch flags (permanent)
Emergency circuit breakers for expensive third-party integrations, background jobs, or risky data migrations. These live permanently in the codebase and the dashboard. They are not removed after a rollout — they are there for the next incident. Document them clearly and test them regularly.
Config flags (permanent)
Deliver runtime configuration without redeployment: rate limits, cache TTLs, algorithm parameters, ML model versions. These replace environment variables for values that need to change without a deploy. They are permanent infrastructure, not temporary toggles.
# Tooling to surface stale flags — integrate into your weekly tech-debt rotation
# Queries LaunchDarkly API for flags not modified in 30+ days that are fully rolled out
import os
import httpx
from datetime import datetime, timezone, timedelta
LD_API_KEY = os.environ["LD_API_KEY"]
PROJECT_KEY = "my-project"
ENVIRONMENT = "production"
STALE_DAYS = 30
resp = httpx.get(
    f"https://app.launchdarkly.com/api/v2/flags/{PROJECT_KEY}",
    headers={"Authorization": LD_API_KEY},
    params={"limit": 200, "env": ENVIRONMENT},
)
resp.raise_for_status()
flags = resp.json()["items"]

now = datetime.now(timezone.utc)
stale_threshold = now - timedelta(days=STALE_DAYS)
stale_flags = []

for flag in flags:
    env_data = flag.get("environments", {}).get(ENVIRONMENT, {})
    last_modified_ms = flag.get("_updatedDate", 0)
    last_modified = datetime.fromtimestamp(last_modified_ms / 1000, tz=timezone.utc)
    # A flag is stale if: fully on (100%), not modified in STALE_DAYS, and temporary type
    is_fully_on = env_data.get("on", False) and not env_data.get("rules")
    is_old = last_modified < stale_threshold
    is_temporary = flag.get("temporary", True)
    if is_fully_on and is_old and is_temporary:
        stale_flags.append({
            "key": flag["key"],
            "name": flag["name"],
            "last_modified": last_modified.date().isoformat(),
            "days_stale": (now - last_modified).days,
        })

if stale_flags:
    print(f"Found {len(stale_flags)} stale flags to clean up:")
    for f in sorted(stale_flags, key=lambda x: -x["days_stale"]):
        print(f"  [{f['days_stale']}d] {f['key']} — {f['name']}")

Performance Considerations — Evaluation Latency and SDK Caching
Server-side SDK evaluation is designed to be sub-millisecond — the SDK maintains an in-memory cache populated by a persistent streaming connection to the flag backend. The cost is the streaming connection (one per SDK instance) and memory (proportional to the number of flags and targeting rules). In practice, a typical LaunchDarkly account with 100 flags uses less than 10 MB of heap for the flag cache.
The three places where flag evaluation performance breaks down are: (1) SDK initialization latency — the first request arrives before the SDK has connected to the streaming API; (2) context serialization overhead — building a rich evaluation context on every hot path; and (3) flag count — evaluating 50 flags per request multiplies the context-building cost.
# Go — production-grade LaunchDarkly integration with connection pooling and fast path
# go get github.com/launchdarkly/go-server-sdk/v7
package features
import (
    "log"
    "os"
    "time"

    ld "github.com/launchdarkly/go-server-sdk/v7"
    "github.com/launchdarkly/go-server-sdk/v7/ldcomponents"

    "github.com/launchdarkly/go-sdk-common/v3/ldcontext"
)

var client *ld.LDClient

func Init() {
    config := ld.Config{
        // Stream flag updates from LaunchDarkly
        DataSource: ldcomponents.StreamingDataSource(),
        // Flush analytics events every 5s, not on every evaluation
        Events: ldcomponents.SendEvents().FlushInterval(5 * time.Second),
    }
    var err error
    client, err = ld.MakeCustomClient(os.Getenv("LD_SDK_KEY"), config, 5*time.Second)
    if err != nil {
        log.Fatalf("LD client failed to initialize: %v", err)
    }
    log.Println("LaunchDarkly SDK initialized")
}

// EvalContext builds an LDContext for the given request — called once per request,
// then reused for all flag evaluations in that request's handler.
func EvalContext(userID, plan, country string) ldcontext.Context {
    return ldcontext.NewBuilder(userID).
        SetString("plan", plan).
        SetString("country", country).
        Build()
}

func IsNewSearchEnabled(ctx ldcontext.Context) bool {
    val, err := client.BoolVariation("new-search-engine", ctx, false)
    if err != nil {
        return false // fail closed
    }
    return val
}

func GetRankingAlgorithm(ctx ldcontext.Context) string {
    val, err := client.StringVariation("ranking-algorithm", ctx, "bm25")
    if err != nil {
        return "bm25"
    }
    return val
}

func IsMaintenanceMode() bool {
    // Kill switch: no user context required, evaluated against system context
    sysCtx := ldcontext.New("system")
    val, _ := client.BoolVariation("maintenance-mode", sysCtx, false)
    return val
}

Feature Flag Production Checklist
SDK is initialised once at startup, not per-request
Initialising the LaunchDarkly or OpenFeature SDK on every HTTP request creates a new streaming connection per request — this will exhaust file descriptors, saturate the flag backend's connection limits, and add 200–500ms latency to every request. Initialise the SDK once during application bootstrap and inject it as a singleton or dependency.
All evaluations have a safe default value
Flag evaluation can fail if the SDK is not yet initialised, the streaming connection is interrupted, or the flag key is misspelled. Every client.variation() call must specify a default value that represents the safe, pre-feature-flag behaviour — typically the existing code path. Never use null as the default for a boolean flag, or an empty string as the default for a string flag.
Flag keys are centralised string constants
Scatter flag key strings across the codebase and you will eventually have a typo that silently evaluates to the default value with no error. Keep all flag keys in a single constants file and use those constants everywhere. Static analysis will then catch references to removed flags; a mistyped string literal at a call site is caught by nothing.
Temporary flags have a defined removal date in the ticket system
Create a cleanup ticket when you create a release or experiment flag — not after rollout. Schedule it for two weeks after the expected full rollout date. Without a ticket, flags accumulate. Flag debt compounds: each stale flag adds evaluation overhead, test mock complexity, and cognitive load for new engineers.
Kill switches are tested in production regularly
A kill switch that has never been flipped in production is a kill switch that may not work when you need it. Schedule a quarterly chaos exercise: flip each kill switch in production during low-traffic hours and verify that the system degrades gracefully. This also validates that the SDK streaming connection is healthy.
Flag evaluation is instrumented with metrics and traces
Record flag evaluation counts by flag key, variant, and reason (targeting rule, percentage rollout, default). Alert on sudden changes in variant distribution — a flag that was 10% on and jumps to 100% without a deliberate rollout change indicates misconfiguration. Include flag key and variant in distributed trace attributes for correlation during incident response.
Percentage rollout uses user ID bucketing, not session ID
Bucketing by session ID means the same user can get different flag variants across sessions — they see the new checkout flow, then the old one, then the new one. This produces inconsistent experiences and corrupts A/B test results. Always bucket by stable user ID. For anonymous users, use a stable device or anonymous ID persisted in a cookie or local storage.
CI/CD pipeline has a flag-gated deployment step
Use the flag evaluation API in your deployment pipeline: after a canary deploy, query flag metrics and error rate from your monitoring stack; if the error rate for the flagged variation exceeds the threshold, call the kill switch API automatically and page the on-call engineer. Do not rely on manual observation during rollouts.
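The gate can be sketched as a small pipeline step. Everything here is illustrative wiring: fetch_error_rate stands in for a monitoring query (e.g. Prometheus), and the advance/kill callbacks stand in for flag-API calls such as the set_percentage_rollout and kill_switch_off helpers shown earlier:

```python
import time
from typing import Callable

STAGES = [1, 5, 25, 50, 100]  # rollout rings, in percent
ERROR_BUDGET = 0.01           # abort if the flagged variant's error rate exceeds 1%

def gated_rollout(
    fetch_error_rate: Callable[[], float],  # monitoring query for the new variant
    advance: Callable[[int], None],         # sets the rollout percentage
    kill: Callable[[], None],               # flips the kill switch
    soak_seconds: int = 3600,
) -> bool:
    """Advance through the rollout rings; kill the flag on the first SLO breach."""
    for pct in STAGES:
        advance(pct)
        time.sleep(soak_seconds)  # let the ring soak before judging it
        if fetch_error_rate() > ERROR_BUDGET:
            kill()
            return False  # breached; the usual alerting pages on-call
    return True  # fully rolled out
```

Run it with soak windows matched to your traffic patterns; a one-hour soak at 1% may see too few requests to judge an SLO on a low-traffic service.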
Work with us
Rolling out features gradually or implementing feature flags across a microservices stack?
We design and implement feature flag infrastructure — from LaunchDarkly and OpenFeature SDK integration and targeting rule design to flagd self-hosted deployments, CI/CD-integrated rollout automation, kill-switch monitoring, and flag lifecycle governance. Let’s talk.
Get in touch