Feature Flags

TL;DR

Feature flags decouple deployment from release: you ship code to production disabled and enable it gradually. They enable trunk-based development, safe rollouts, and experimentation. They also add complexity, so plan for flag lifecycle management and cleanup.

Why Feature Flags?

Deployment vs. Release

Traditional:
Deploy = Release = Risk

With Feature Flags:
┌─────────────────────────────────────────────────────────────────┐
│  Code deployed to production (behind flag = OFF)                 │
│                                                                 │
│  Time T1: Flag OFF → Feature invisible to users                 │
│  Time T2: Flag ON for 1% → Test with small group               │
│  Time T3: Flag ON for 10% → Expand gradually                   │
│  Time T4: Flag ON for 100% → Full release                      │
│  Time T5: Remove flag → Clean up code                          │
│                                                                 │
│  At any point: Turn flag OFF → Instant rollback                 │
└─────────────────────────────────────────────────────────────────┘

Separation:
- Deploy: Code goes to production (low risk)
- Release: Feature enabled for users (controlled)
- Rollback: Toggle flag (instant, no redeploy)
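
The separation can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical in-process `InMemoryFlags` store (a real system would load flag state from a flag service): both checkout paths are deployed, and the flag alone decides which one runs.

```python
class InMemoryFlags:
    """Hypothetical in-process flag store, used only for illustration."""

    def __init__(self):
        self._enabled = {}

    def set(self, name: str, enabled: bool) -> None:
        self._enabled[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to OFF so deployed code stays hidden
        return self._enabled.get(name, False)


flags = InMemoryFlags()


def checkout(cart_total: float) -> str:
    # Both code paths are deployed; the flag decides which one users see
    if flags.is_enabled("new_checkout_flow"):
        return f"new flow: {cart_total:.2f}"
    return f"old flow: {cart_total:.2f}"


print(checkout(10))                    # Deployed, not yet released
flags.set("new_checkout_flow", True)
print(checkout(10))                    # Released
flags.set("new_checkout_flow", False)
print(checkout(10))                    # Rolled back without a redeploy
```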

Types of Feature Flags

Release Flags (Short-lived)

python

# Gradually roll out new feature
if feature_flags.is_enabled("new_checkout_flow", user_id):
    return new_checkout_flow(cart)
else:
    return old_checkout_flow(cart)

# Lifecycle:
# 1. Deploy with flag OFF
# 2. Enable for internal users
# 3. Enable for 1%, 10%, 50%, 100%
# 4. Remove flag, delete old code

Experiment Flags (Temporary)

python

# A/B test different variants
variant = feature_flags.get_variant("checkout_button_color", user_id)

if variant == "variant_a":
    button_color = "green"
elif variant == "variant_b":
    button_color = "red"
else:
    # "control" and any unexpected value keep the current color
    button_color = "blue"

# Track conversion
analytics.track("checkout_completed", {
    "experiment": "checkout_button_color",
    "variant": variant
})

# Lifecycle:
# 1. Run experiment for statistical significance
# 2. Analyze results
# 3. Pick winner, remove flag
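
One common way to do the "analyze results" step for a conversion metric is a two-proportion z-test. The sketch below is an illustration only: the function name and the conversion counts are invented, and a real experiment platform may use a different test.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # No variation at all, so no detectable difference
    z = (p_b - p_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # Two-sided normal tail
    return z, p_value


# Made-up numbers: 200/5000 conversions for control, 260/5000 for variant_a
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these invented counts the p-value falls below the conventional 0.05 threshold. Fix the sample size and threshold before the experiment starts rather than stopping as soon as the result looks significant.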

Ops Flags (Long-lived)

python

# Circuit breaker / kill switch
if feature_flags.is_enabled("enable_recommendations_service"):
    recommendations = recommendations_service.get(user_id)
else:
    recommendations = []  # Graceful degradation

# Lifecycle: Long-lived, used for operational control
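
A kill switch usually pairs with a fallback for the case where the dependency itself fails. The sketch below assumes flags are available as a plain dict and that `fetch` is whatever callable talks to the recommendations service; both are placeholders, not part of any particular SDK.

```python
from typing import Callable, Dict, List


def get_recommendations(
    user_id: str,
    flags: Dict[str, bool],
    fetch: Callable[[str], List[str]],
) -> List[str]:
    # Kill switch: operators flip this flag to shed load from the dependency.
    # Defaulting to True keeps the feature on if the flag is missing.
    if not flags.get("enable_recommendations_service", True):
        return []
    try:
        return fetch(user_id)
    except Exception:
        # Degrade the same way when the dependency itself fails
        return []


def fake_fetch(user_id: str) -> List[str]:
    return [f"item-for-{user_id}"]


print(get_recommendations("user_123", {}, fake_fetch))
print(get_recommendations("user_123", {"enable_recommendations_service": False}, fake_fetch))
```

Unlike a release flag, this one defaults to ON: a missing ops flag should not silently disable a feature that is already live.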

Permission Flags (Long-lived)

python

# Entitlements / premium features
if feature_flags.is_enabled("premium_analytics", user_id):
    show_advanced_analytics()
else:
    show_upgrade_prompt()

# Lifecycle: Long-lived, tied to business logic

Flag Evaluation

Simple Boolean

python

class SimpleFlag:
    def __init__(self, name: str, enabled: bool):
        self.name = name
        self.enabled = enabled
    
    def is_enabled(self) -> bool:
        return self.enabled

Percentage Rollout

python

import hashlib

class PercentageFlag:
    def __init__(self, name: str, percentage: int):
        self.name = name
        self.percentage = percentage  # 0-100
    
    def is_enabled(self, user_id: str) -> bool:
        # Consistent hashing: same user always gets same result
        hash_input = f"{self.name}:{user_id}"
        hash_value = int(hashlib.md5(hash_input.encode()).hexdigest(), 16)
        bucket = hash_value % 100
        return bucket < self.percentage

# 10% rollout
flag = PercentageFlag("new_feature", 10)
flag.is_enabled("user_123")  # True or False (consistent for this user)
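
Two properties of this hashing scheme are worth checking: raising the percentage only adds users (nobody who already has the feature loses it), and including the flag name in the hash gives each flag its own cohort. A small sketch, using a hypothetical helper `bucket_for` that mirrors the hashing above:

```python
import hashlib


def bucket_for(flag_name: str, user_id: str) -> int:
    digest = hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


users = [f"user_{i}" for i in range(10_000)]

# Cohorts for the same flag at 10% and 50%
at_10 = {u for u in users if bucket_for("new_feature", u) < 10}
at_50 = {u for u in users if bucket_for("new_feature", u) < 50}

# Cohort for a different flag at 10%
other_10 = {u for u in users if bucket_for("other_feature", u) < 10}

print(f"share at 10%: {len(at_10) / len(users):.3f}")
print(f"10% cohort is a subset of the 50% cohort: {at_10 <= at_50}")
print(f"overlap between the two flags: {len(at_10 & other_10) / len(users):.3f}")
```

If the flag name were left out of the hash input, the same users would land in the first 10% of every rollout.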

Targeting Rules

python

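import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    condition: Callable[[dict], bool]
    result: bool

    def matches(self, context: dict) -> bool:
        return self.condition(context)

class TargetedFlag:
    def __init__(self, name: str, rules: list):
        self.name = name
        self.rules = rules  # Evaluated in order
    
    def is_enabled(self, context: dict) -> bool:
        for rule in self.rules:
            if rule.matches(context):
                return rule.result
        return False  # Default

# Helpers assumed by the example rules (placeholder values)
beta_user_list = {"user_42", "user_99"}

def percentage_check(user_id: str, percentage: int) -> bool:
    digest = hashlib.md5(f"new_feature:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < percentage

# Example rules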
rules = [
    # Rule 1: Internal users always on
    Rule(
        condition=lambda ctx: ctx.get("email", "").endswith("@company.com"),
        result=True
    ),
    # Rule 2: Beta users
    Rule(
        condition=lambda ctx: ctx.get("user_id") in beta_user_list,
        result=True
    ),
    # Rule 3: 10% of remaining users
    Rule(
        condition=lambda ctx: percentage_check(ctx.get("user_id"), 10),
        result=True
    ),
    # Rule 4: Default off
    Rule(
        condition=lambda ctx: True,
        result=False
    )
]

Multivariate Flags

python

class MultivariateFlag:
    def __init__(self, name: str, variants: list):
        self.name = name
        self.variants = variants  # [("control", 50), ("variant_a", 25), ("variant_b", 25)]
    
    def get_variant(self, user_id: str) -> str:
        hash_input = f"{self.name}:{user_id}"
        hash_value = int(hashlib.md5(hash_input.encode()).hexdigest(), 16)
        bucket = hash_value % 100
        
        cumulative = 0
        for variant_name, percentage in self.variants:
            cumulative += percentage
            if bucket < cumulative:
                return variant_name
        
        return self.variants[0][0]  # Default to first
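
Note that the fallback on the last line sends every leftover bucket to the first variant when the weights add up to less than 100, which quietly inflates that group. One option is to reject such configurations up front. The sketch below assumes the same `(name, weight)` list; `assign_variant` is a hypothetical standalone helper.

```python
import hashlib
from collections import Counter


def assign_variant(flag_name: str, user_id: str, variants: list) -> str:
    total = sum(weight for _, weight in variants)
    if total != 100:
        raise ValueError(f"Variant weights must sum to 100, got {total}")

    digest = hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100

    cumulative = 0
    for name, weight in variants:
        cumulative += weight
        if bucket < cumulative:
            return name
    raise AssertionError("Unreachable when weights sum to 100")


variants = [("control", 50), ("variant_a", 25), ("variant_b", 25)]
counts = Counter(
    assign_variant("checkout_button_color", f"user_{i}", variants)
    for i in range(10_000)
)
print(counts)  # Shares should land close to the configured weights
```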

Implementation Architecture

Client-Side Evaluation

┌─────────────────────────────────────────────────────────────────┐
│                         Application                              │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    Feature Flag SDK                      │   │
│  │                                                          │   │
│  │  ┌──────────────┐    ┌──────────────────────────────┐   │   │
│  │  │    Cache     │    │    Evaluation Engine         │   │   │
│  │  │              │    │                              │   │   │
│  │  │  All flag    │───►│  1. Load flag config         │   │   │
│  │  │  configs     │    │  2. Evaluate rules           │   │   │
│  │  │              │    │  3. Return result            │   │   │
│  │  └──────────────┘    └──────────────────────────────┘   │   │
│  │         ▲                                                │   │
│  │         │ Sync (polling or streaming)                    │   │
│  └─────────┼────────────────────────────────────────────────┘   │
│            │                                                     │
└────────────┼─────────────────────────────────────────────────────┘
             │
             │
┌────────────▼─────────────────────────────────────────────────────┐
│                    Feature Flag Service                          │
│                                                                 │
│  ┌───────────────┐    ┌───────────────┐    ┌───────────────┐   │
│  │   Dashboard   │    │     API       │    │   Database    │   │
│  │               │    │               │    │               │   │
│  │  - Create     │───►│  - CRUD flags │───►│  - Flag       │   │
│  │  - Edit       │    │  - Stream     │    │    configs    │   │
│  │  - Toggle     │    │    updates    │    │  - Audit log  │   │
│  └───────────────┘    └───────────────┘    └───────────────┘   │
└─────────────────────────────────────────────────────────────────┘

Pros: Low latency, works offline
Cons: All flags sent to client (size), sync delay

Server-Side Evaluation

┌─────────────────────────────────────────────────────────────────┐
│                         Application                              │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    Feature Flag SDK                      │   │
│  │                                                          │   │
│  │  is_enabled("feature_x", user_context)                  │   │
│  │              │                                           │   │
│  │              │ HTTP/gRPC                                 │   │
│  │              ▼                                           │   │
│  └──────────────┼───────────────────────────────────────────┘   │
│                 │                                                │
└─────────────────┼────────────────────────────────────────────────┘
                  │
                  │
┌─────────────────▼────────────────────────────────────────────────┐
│                    Feature Flag Service                          │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                  Evaluation Engine                         │  │
│  │                                                            │  │
│  │  1. Receive context (user_id, attributes)                 │  │
│  │  2. Load flag config                                       │  │
│  │  3. Evaluate rules                                         │  │
│  │  4. Return result                                          │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Pros: Sensitive rules stay server-side, always fresh
Cons: Latency, network dependency
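
Because every evaluation crosses the network, the caller needs a defined answer when the flag service is slow or down. A minimal sketch, assuming the network call is injected as a `transport` callable (a hypothetical stand-in for the HTTP/gRPC request) so that the fallback logic can be shown without a real service:

```python
from typing import Callable, Dict


def evaluate_remote(
    flag_name: str,
    context: Dict[str, str],
    transport: Callable[[str, Dict[str, str]], bool],
    default: bool = False,
) -> bool:
    try:
        return bool(transport(flag_name, context))
    except OSError:
        # ConnectionError and TimeoutError are both subclasses of OSError
        return default


def healthy_transport(flag_name, context):
    return True


def broken_transport(flag_name, context):
    raise TimeoutError("flag service did not answer in time")


ctx = {"user_id": "user_123"}
print(evaluate_remote("feature_x", ctx, healthy_transport))
print(evaluate_remote("feature_x", ctx, broken_transport))
```

The real request should also carry a short timeout; otherwise the fallback only triggers after the caller has already waited on a hung connection.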

Hybrid Approach

python

class HybridFeatureFlags:
    def __init__(self):
        self.local_cache = {}
        self.evaluation_service = FeatureFlagService()
    
    def is_enabled(self, flag_name: str, context: dict) -> bool:
        # Check local cache first
        cached = self.local_cache.get(flag_name)
        if cached and not cached.requires_server_evaluation:
            return cached.evaluate(context)
        
        # Fall back to server for complex rules
        return self.evaluation_service.evaluate(flag_name, context)
    
    def sync_flags(self):
        """Background sync of flags that can be evaluated locally"""
        simple_flags = self.evaluation_service.get_all_simple_flags()
        self.local_cache.update(simple_flags)

SDK Implementation

Python SDK Example

python

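import hashlib
import threading
import time
from typing import Any, Dict, Optional

import requests

class FeatureFlagClient:
    def __init__(self, sdk_key: str, base_url: str = "https://flags.example.com"):
        self.sdk_key = sdk_key
        self.base_url = base_url
        self.flags: Dict[str, Any] = {}
        self._start_polling()
    
    def _start_polling(self):
        def poll():
            while True:
                try:
                    self._fetch_flags()
                except Exception as e:
                    # Keep serving the last known config if a poll fails
                    print(f"Failed to fetch flags: {e}")
                time.sleep(30)  # Poll every 30 seconds
        
        thread = threading.Thread(target=poll, daemon=True)
        thread.start()
    
    def _fetch_flags(self):
        response = requests.get(
            f"{self.base_url}/api/flags",
            headers={"Authorization": f"Bearer {self.sdk_key}"},
            timeout=5,  # Never let a hung request stall the polling thread
        )
        response.raise_for_status()
        # Rebinding the attribute swaps in the whole config at once
        self.flags = response.json()
    
    def is_enabled(
        self, 
        flag_key: str, 
        user_id: Optional[str] = None,
        attributes: Optional[Dict] = None,
        default: bool = False
    ) -> bool:
        # Until the first poll completes, self.flags is empty and
        # every lookup returns `default`
        flag = self.flags.get(flag_key)
        if not flag:
            return default
        
        return self._evaluate(flag_key, flag, user_id, attributes or {})
    
    def _evaluate(self, flag_key: str, flag: dict, user_id: Optional[str], attributes: dict) -> bool:
        if not flag.get("enabled"):
            return False
        
        # Check targeting rules
        for rule in flag.get("rules", []):
            if self._matches_rule(rule, user_id, attributes):
                return rule.get("result", False)
        
        # Percentage rollout (explicit None check so that 0% is honored)
        percentage = flag.get("percentage")
        if percentage is not None and user_id is not None:
            return self._percentage_check(flag_key, user_id, percentage)
        
        return flag.get("default", False)
    
    def _matches_rule(self, rule: dict, user_id: Optional[str], attributes: dict) -> bool:
        # Assumed rule shape: {"attribute": "plan", "values": ["premium"], "result": True}
        name = rule.get("attribute")
        candidate = user_id if name == "user_id" else attributes.get(name)
        return candidate in rule.get("values", [])
    
    def _percentage_check(self, flag_key: str, user_id: str, percentage: int) -> bool:
        # Same consistent hashing as the percentage rollout above
        digest = hashlib.md5(f"{flag_key}:{user_id}".encode()).hexdigest()
        return int(digest, 16) % 100 < percentage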

# Usage
flags = FeatureFlagClient(sdk_key="sdk-key-123")

if flags.is_enabled("new_checkout", user_id="user_123"):
    show_new_checkout()
else:
    show_old_checkout()

Best Practices

Flag Naming Conventions

python

# Good names - descriptive, consistent
"enable_new_checkout_flow"
"experiment_homepage_hero_variant"
"ops_circuit_breaker_recommendations"
"permission_premium_analytics"

# Bad names
"flag1"
"test"
"johns_feature"
"temporary_fix_delete_later"  # It won't be deleted

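A naming convention holds up better when it is checked automatically, for example at flag creation time or in CI. The sketch below assumes the convention implied by the good names above (a type prefix followed by lowercase words); the prefix list and the function name are illustrative choices.

```python
import re

# Assumed convention: <type prefix>_<lowercase words separated by underscores>
FLAG_NAME_PATTERN = re.compile(r"^(enable|experiment|ops|permission)(_[a-z0-9]+)+$")


def is_valid_flag_name(name: str) -> bool:
    return FLAG_NAME_PATTERN.fullmatch(name) is not None


for name in [
    "enable_new_checkout_flow",
    "ops_circuit_breaker_recommendations",
    "flag1",
    "johns_feature",
]:
    print(f"{name}: {is_valid_flag_name(name)}")
```
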
Flag Lifecycle Management

python

from datetime import datetime, timedelta

class FlagLifecycle:
    """Track flag status and enforce cleanup"""
    
    STATES = ["planning", "development", "testing", "rollout", "complete", "cleanup"]
    
    def __init__(self, flag_name: str):
        self.flag_name = flag_name
        self.state = "planning"
        self.created_at = datetime.now()
        self.owner = None
        self.cleanup_deadline = None
    
    def transition(self, new_state: str):
        if new_state not in self.STATES:
            raise ValueError(f"Unknown state: {new_state}")
        if new_state == "rollout":
            # Set cleanup deadline when rollout starts
            self.cleanup_deadline = datetime.now() + timedelta(days=30)
        self.state = new_state
    
    def is_overdue_for_cleanup(self) -> bool:
        # The deadline is None if the flag never passed through "rollout"
        if self.cleanup_deadline is None:
            return False
        if self.state in ["complete", "cleanup"]:
            return datetime.now() > self.cleanup_deadline
        return False

# Automated cleanup reminders
def send_cleanup_reminders():
    for flag in get_all_flags():
        if flag.is_overdue_for_cleanup():
            send_reminder(
                to=flag.owner,
                subject=f"Feature flag '{flag.flag_name}' needs cleanup",
                body="Flag is past its cleanup deadline (30 days after rollout started). Please remove it."
            )

Avoid Flag Debt

python

# BAD: Nested flags (hard to reason about)
if flags.is_enabled("feature_a"):
    if flags.is_enabled("feature_b"):
        if flags.is_enabled("feature_c"):
            do_something()

# BETTER: Single flag with clear intent
if flags.is_enabled("feature_abc_combined"):
    do_something()

# BAD: Flag in shared code (affects everything)
def get_price(product):
    price = product.base_price
    if flags.is_enabled("new_pricing"):  # Too broad!
        price = calculate_new_price(product)
    return price

# BETTER: Specific scope
def get_price(product, context):
    if context.feature == "checkout" and flags.is_enabled("new_pricing_checkout"):
        return calculate_new_price(product)
    return product.base_price

Testing with Flags

python

import pytest
from unittest.mock import patch

class TestCheckoutWithFlags:
    def test_new_checkout_enabled(self):
        cart = create_test_cart()
        with patch('app.flags.is_enabled', return_value=True):
            result = process_checkout(cart)
            assert result.used_new_flow is True
    
    def test_new_checkout_disabled(self):
        cart = create_test_cart()
        with patch('app.flags.is_enabled', return_value=False):
            result = process_checkout(cart)
            assert result.used_new_flow is False
    
    def test_both_flows_produce_same_result(self):
        """Ensure new and old flow are functionally equivalent"""
        cart = create_test_cart()
        
        with patch('app.flags.is_enabled', return_value=False):
            old_result = process_checkout(cart)
        
        with patch('app.flags.is_enabled', return_value=True):
            new_result = process_checkout(cart)
        
        assert old_result.total == new_result.total
        assert old_result.items == new_result.items
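
Patching `is_enabled` with a fixed return value switches every flag at once, which can hide interactions between flags. An alternative is a small test double that enables only the flags a test names. The sketch below is self-contained: `FakeFlags` and this simplified `process_checkout` are stand-ins for illustration, not the application code tested above.

```python
class FakeFlags:
    """Test double that enables only the flags named explicitly."""

    def __init__(self, enabled=()):
        self._enabled = set(enabled)

    def is_enabled(self, flag_key, user_id=None):
        return flag_key in self._enabled


def process_checkout(cart, flags):
    # Stand-in for the real checkout: both flows must agree on the total
    flow = "new" if flags.is_enabled("new_checkout") else "old"
    return {"flow": flow, "total": sum(cart)}


cart = [10, 20]
old_result = process_checkout(cart, FakeFlags())
new_result = process_checkout(cart, FakeFlags({"new_checkout"}))

assert old_result["flow"] == "old"
assert new_result["flow"] == "new"
assert old_result["total"] == new_result["total"]
```

Passing the flag client in as an argument, rather than importing a global, is what makes this substitution possible.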

Feature Flag Services

LaunchDarkly

python

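import ldclient
from ldclient import Context
from ldclient.config import Config

ldclient.set_config(Config("sdk-key-123"))
client = ldclient.get()

# Recent SDK versions (8.0+) describe the user with a Context;
# older versions took a plain dict with a "custom" section
user = (
    Context.builder("user-123")
    .set("email", "user@example.com")
    .set("plan", "premium")
    .set("country", "US")
    .build()
)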

# Boolean flag
show_feature = client.variation("new-feature", user, False)

# Multivariate flag
button_color = client.variation("button-color", user, "blue")

Unleash (Open Source)

python

from UnleashClient import UnleashClient

client = UnleashClient(
    url="https://unleash.example.com/api",
    app_name="my-app",
    custom_headers={"Authorization": "token"}
)
client.initialize_client()

# Check flag
if client.is_enabled("new-feature", context={"userId": "123"}):
    show_new_feature()

# With fallback
# The fallback is called with the feature name and context
enabled = client.is_enabled(
    "new-feature",
    fallback_function=lambda feature_name, context: False
)

Build vs. Buy

Build your own when:
- Simple use case (boolean flags only)
- Privacy/compliance requirements
- Tight budget
- Learning/control priority

Buy when:
- Need advanced targeting
- A/B testing built-in
- Multiple environments
- Audit/compliance features
- SDKs for many languages
- Don't want to maintain infrastructure

Popular options:
- LaunchDarkly (enterprise)
- Split.io (experimentation focus)
- Unleash (open source)
- Flagsmith (open source)
- ConfigCat (simple, affordable)

Anti-Patterns

1. Permanent "temporary" flags
   - Set cleanup deadlines
   - Alert on stale flags
   
2. Flags that affect data models
   - Hard to roll back
   - Consider data migration instead

3. Too many flags
   - Cognitive overhead
   - Interaction complexity
   - Set org-wide limits

4. Testing only happy path
   - Test both flag states
   - Test flag transitions

5. No monitoring
   - Track flag evaluations
   - Alert on unexpected states
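
For the monitoring point, even a thin wrapper that counts evaluations gives useful signals: flags that are looked up but not defined (often typos) and flags that are defined but never evaluated (candidates for cleanup). The sketch below assumes flags held in a plain dict; `MonitoredFlags` is a hypothetical name, and a real setup would export the counts to a metrics system.

```python
from collections import Counter


class MonitoredFlags:
    def __init__(self, flags: dict):
        self._flags = flags
        self.evaluations = Counter()

    def is_enabled(self, name: str, default: bool = False) -> bool:
        known = name in self._flags
        result = self._flags.get(name, default)
        outcome = "missing" if not known else ("on" if result else "off")
        self.evaluations[(name, outcome)] += 1
        return result

    def never_evaluated(self) -> list:
        # Defined flags that no code path has asked about
        seen = {name for name, _ in self.evaluations}
        return sorted(set(self._flags) - seen)


flags = MonitoredFlags({"new_checkout": True, "old_banner": False})
flags.is_enabled("new_checkout")
flags.is_enabled("new_checkout")
flags.is_enabled("typo_flag")

print(dict(flags.evaluations))
print(flags.never_evaluated())
```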
