OpenTelemetry Sampling [Python]

What is sampling?

Sampling limits the number of traces that a system records and exports. In high-volume applications, collecting 100% of traces is often expensive and unnecessary. Sampling lets you keep a representative subset of traces while reducing cost and performance overhead.

Python sampling

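The OpenTelemetry Python SDK provides head-based sampling: the sampling decision is made when a span is created, at the beginning of a trace, rather than after the trace has completed. By default, the tracer provider uses a ParentBased sampler with an always-on root sampler (ParentBased(ALWAYS_ON) in the SDK, parentbased_always_on in configuration). To use a different sampler, pass it to the tracer provider when you create it.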

Built-in samplers

AlwaysOnSampler

Samples every trace. Useful in development, but use it with care in production under significant traffic. The Python SDK exposes this sampler as the ALWAYS_ON constant rather than as an AlwaysOnSampler class:

python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ALWAYS_ON

tracer_provider = TracerProvider(sampler=ALWAYS_ON)

AlwaysOffSampler

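Samples no traces. Useful for disabling tracing completely. The Python SDK exposes this sampler as the ALWAYS_OFF constant:

python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ALWAYS_OFF

tracer_provider = TracerProvider(sampler=ALWAYS_OFF)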

TraceIdRatioBased

Samples a fraction of traces, chosen deterministically from the trace ID. The ratio must be between 0.0 and 1.0:

python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased

# Sample 10% of traces
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.1))

# Sample 50% of traces
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.5))
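To see why a ratio sampler is deterministic, the standard-library sketch below mimics the idea: compare the lower 64 bits of the trace ID against a bound derived from the ratio. The helper ratio_decision is hypothetical and only illustrates the technique; the SDK's internal implementation may differ in detail.

```python
import random

TRACE_ID_LIMIT = (1 << 64) - 1  # mask for the lower 64 bits of a 128-bit trace ID


def ratio_decision(trace_id: int, rate: float) -> bool:
    """Hypothetical helper: True if this trace ID falls inside the sampled fraction."""
    bound = round(rate * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


rng = random.Random(42)  # fixed seed so the run is repeatable
trace_ids = [rng.getrandbits(128) for _ in range(100_000)]

sampled_fraction = sum(ratio_decision(tid, 0.1) for tid in trace_ids) / len(trace_ids)
print(f"sampled {sampled_fraction:.1%} of traces")

# The decision depends only on the trace ID and the rate, so repeating it
# (in this process or in another service using the same rate) gives the same answer.
same_decision = ratio_decision(trace_ids[0], 0.1) == ratio_decision(trace_ids[0], 0.1)
print(same_decision)
```

Because the bound grows with the ratio, every trace kept at 10% is also kept at 50%, which is why services with different ratios still tend to agree on the traces sampled at the lowest rate.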

ParentBased

A sampler decorator that follows the sampling decision of the parent span. If the span has a parent, the parent's sampled flag decides by default; if the span has no parent, the decorated (root) sampler makes the sampling decision:

python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ALWAYS_ON, ParentBased, TraceIdRatioBased

# ParentBased with TraceIdRatioBased root sampler
tracer_provider = TracerProvider(
    sampler=ParentBased(TraceIdRatioBased(0.1))
)

# ParentBased with always-on root sampler (default behavior)
tracer_provider = TracerProvider(
    sampler=ParentBased(ALWAYS_ON)
)

Configuration in Python

Environment variables

You can configure sampling using environment variables:

bash
# TraceIdRatio sampler with 50% sampling
export OTEL_TRACES_SAMPLER="traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.5"

# ParentBased with TraceIdRatio
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1"

# Always sample
export OTEL_TRACES_SAMPLER="always_on"

# Never sample
export OTEL_TRACES_SAMPLER="always_off"

Programmatic configuration

python
import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import (
    ALWAYS_OFF,
    ALWAYS_ON,
    ParentBased,
    TraceIdRatioBased,
)
from opentelemetry.sdk.resources import Resource
from opentelemetry.semconv.resource import ResourceAttributes
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

def setup_tracing():
    # Create OTLP exporter
    exporter = OTLPSpanExporter(
        endpoint="https://api.uptrace.dev:4317",
        headers={"uptrace-dsn": os.getenv("UPTRACE_DSN")},
    )

    # Create resource
    resource = Resource.create({
        ResourceAttributes.SERVICE_NAME: "my-service",
        ResourceAttributes.SERVICE_VERSION: "1.0.0",
    })

    # Configure sampler based on environment
    env = os.getenv("APP_ENV", "development")

    if env == "development":
        sampler = ALWAYS_ON  # record every trace
    elif env == "production":
        sampler = ParentBased(TraceIdRatioBased(0.1))  # 10% sampling
    elif env == "testing":
        sampler = ALWAYS_OFF  # record nothing
    else:
        sampler = ParentBased(TraceIdRatioBased(0.25))  # 25% sampling

    # Create tracer provider
    tracer_provider = TracerProvider(
        sampler=sampler,
        resource=resource
    )

    # Add span processor
    tracer_provider.add_span_processor(
        BatchSpanProcessor(exporter)
    )

    # Set global tracer provider
    trace.set_tracer_provider(tracer_provider)

    return tracer_provider

# Usage
setup_tracing()
tracer = trace.get_tracer(__name__)

Custom sampler

You can create custom sampling logic by implementing the Sampler interface:

python
from typing import Optional, Sequence

from opentelemetry.context import Context
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import Decision, ParentBased, Sampler, SamplingResult
from opentelemetry.trace import Link, SpanKind, get_current_span
from opentelemetry.trace.span import TraceState
from opentelemetry.util.types import Attributes

class CustomSampler(Sampler):
    """Samples spans started with priority=high at a higher rate than the rest."""

    # Only the lower 64 bits of the trace ID take part in the comparison
    TRACE_ID_LIMIT = (1 << 64) - 1

    def __init__(self, high_priority_rate: float = 1.0, default_rate: float = 0.1):
        self.high_priority_rate = high_priority_rate
        self.default_rate = default_rate

    def should_sample(
        self,
        parent_context: Optional[Context],
        trace_id: int,
        name: str,
        kind: Optional[SpanKind] = None,
        attributes: Attributes = None,
        links: Optional[Sequence[Link]] = None,
        trace_state: Optional[TraceState] = None,
    ) -> SamplingResult:
        # Only attributes passed when the span is started are visible here
        if attributes and attributes.get("priority") == "high":
            rate = self.high_priority_rate
        else:
            rate = self.default_rate

        # Carry the parent's trace state forward so it is not lost
        parent_trace_state = get_current_span(parent_context).get_span_context().trace_state

        # Use trace_id for deterministic sampling: the same ID always gets the same decision
        if (trace_id & self.TRACE_ID_LIMIT) < round(rate * (self.TRACE_ID_LIMIT + 1)):
            return SamplingResult(Decision.RECORD_AND_SAMPLE, attributes, parent_trace_state)
        return SamplingResult(Decision.DROP, trace_state=parent_trace_state)

    def get_description(self) -> str:
        return f"CustomSampler(high_priority_rate={self.high_priority_rate}, default_rate={self.default_rate})"

# Use custom sampler for root spans; ParentBased makes child spans follow
# the root decision, so a trace is never recorded only in part
tracer_provider = TracerProvider(
    sampler=ParentBased(CustomSampler(high_priority_rate=1.0, default_rate=0.1))
)

Debugging sampling

Check sampling decisions

python
from opentelemetry import trace

# Assumes a TracerProvider with your sampler has already been set globally
tracer = trace.get_tracer(__name__)

def log_sampling_decision(span_name: str):
    with tracer.start_as_current_span(span_name) as span:
        # Unsampled spans are non-recording API spans, not SDK Span instances,
        # so read the span context directly instead of using an isinstance check
        context = span.get_span_context()
        print(f"Span '{span_name}' - Trace ID: {context.trace_id:032x}")
        print(f"Sampled: {context.trace_flags.sampled}")
        print(f"Valid: {context.is_valid}")

        # Your business logic here

# Test sampling
for i in range(10):
    log_sampling_decision(f"test-span-{i}")

Monitor sampling rates

python
import time
from opentelemetry import trace

class SamplingMonitor:
    def __init__(self):
        self.total_spans = 0
        self.sampled_spans = 0
        self.start_time = time.time()

    def record_span(self, span):
        self.total_spans += 1
        # Works for both SDK spans and non-recording (unsampled) spans
        if span.get_span_context().trace_flags.sampled:
            self.sampled_spans += 1

    def get_sampling_rate(self):
        if self.total_spans == 0:
            return 0.0
        return self.sampled_spans / self.total_spans

    def get_stats(self):
        elapsed = time.time() - self.start_time
        return {
            "total_spans": self.total_spans,
            "sampled_spans": self.sampled_spans,
            "sampling_rate": self.get_sampling_rate(),
            "elapsed_seconds": elapsed,
            "spans_per_second": self.total_spans / elapsed if elapsed > 0 else 0
        }

# Usage
monitor = SamplingMonitor()
tracer = trace.get_tracer(__name__)

for i in range(100):
    with tracer.start_as_current_span(f"test-span-{i}") as span:
        monitor.record_span(span)
        time.sleep(0.01)

print(monitor.get_stats())

Production considerations

Sampling in microservices

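In a microservices architecture, the sampling decision should be made once, at the root of the trace, and propagated to all downstream services together with the trace context. This keeps sampling consistent across the entire distributed trace, so a trace is either recorded in full or not at all.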

python
# Use the same ParentBased sampler in all services
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

# Spans without a parent (entry points) are decided by the root sampler,
# TraceIdRatioBased. Spans with a parent follow the parent's decision,
# so the ratio only applies to traces that start in this service.
sampler = ParentBased(TraceIdRatioBased(0.1))

tracer_provider = TracerProvider(sampler=sampler)
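Between services, the decision travels in the W3C traceparent header, whose last field holds the trace flags; bit 0 is the sampled flag. The standard-library sketch below shows how that flag can be read. The helper parse_traceparent is hypothetical; in practice the OpenTelemetry propagators parse the header for you.

```python
from typing import NamedTuple


class ParentInfo(NamedTuple):
    trace_id: str
    span_id: str
    sampled: bool


def parse_traceparent(header: str) -> ParentInfo:
    """Hypothetical helper: split a version-00 traceparent header into its fields."""
    version, trace_id, span_id, flags = header.strip().split("-")
    if version != "00" or len(trace_id) != 32 or len(span_id) != 16 or len(flags) != 2:
        raise ValueError(f"unsupported traceparent: {header!r}")
    # Bit 0 of the flags byte is the sampled flag.
    return ParentInfo(trace_id, span_id, bool(int(flags, 16) & 0x01))


sampled_parent = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01"
)
unsampled_parent = parse_traceparent(
    "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-00"
)
print(sampled_parent.sampled, unsampled_parent.sampled)
```

A downstream service using ParentBased records its spans only when this flag is set, which is how the root decision reaches every service in the trace.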

Performance impact

Sampling reduces the performance overhead of tracing:

  • CPU usage: Fewer spans to process and export
  • Memory usage: Smaller trace buffers
  • Network usage: Less data sent to backend
  • Storage costs: Reduced storage requirements

Sampling strategies by environment

python
from opentelemetry.sdk.trace.sampling import ALWAYS_ON, ParentBased, Sampler, TraceIdRatioBased

def get_sampler_for_environment(env: str) -> Sampler:
    if env == "production":
        # Conservative sampling for production
        return ParentBased(TraceIdRatioBased(0.01))  # 1%
    elif env == "staging":
        # Moderate sampling for staging
        return ParentBased(TraceIdRatioBased(0.1))   # 10%
    elif env == "development":
        # Full sampling for development
        return ALWAYS_ON
    else:
        # Safe default
        return ParentBased(TraceIdRatioBased(0.1))

What's next?