OpenTelemetry Sampling [Python]
What is sampling?
Sampling is a process that restricts the amount of traces that are generated by a system. In high-volume applications, collecting 100% of traces can be expensive and unnecessary. Sampling allows you to collect a representative subset of traces while reducing costs and performance overhead.
Python sampling
OpenTelemetry Python SDK provides head-based sampling capabilities where the sampling decision is made at the beginning of a trace. By default, the tracer provider uses a ParentBased sampler with the AlwaysOnSampler. A sampler can be set on the tracer provider when creating it.
Built-in samplers
AlwaysOnSampler
Samples every trace. Useful for development environments but be careful in production with significant traffic:
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import AlwaysOnSampler
tracer_provider = TracerProvider(sampler=AlwaysOnSampler())
AlwaysOffSampler
Samples no traces. Useful for completely disabling tracing:
from opentelemetry.sdk.trace.sampling import AlwaysOffSampler
tracer_provider = TracerProvider(sampler=AlwaysOffSampler())
TraceIdRatioBased
Samples a fraction of spans based on the trace ID. The fraction should be between 0.0 and 1.0:
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased
# Sample 10% of traces
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.1))
# Sample 50% of traces
tracer_provider = TracerProvider(sampler=TraceIdRatioBased(0.5))
ParentBased
A sampler decorator that behaves differently based on the parent of the span. If the span has no parent, the decorated sampler is used to make the sampling decision:
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased, AlwaysOnSampler
# ParentBased with TraceIdRatioBased root sampler
tracer_provider = TracerProvider(
sampler=ParentBased(TraceIdRatioBased(0.1))
)
# ParentBased with AlwaysOnSampler root sampler (default behavior)
tracer_provider = TracerProvider(
sampler=ParentBased(AlwaysOnSampler())
)
Configuration in Python
Environment variables
You can configure sampling using environment variables:
# TraceIdRatio sampler with 50% sampling
export OTEL_TRACES_SAMPLER="traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.5"
# ParentBased with TraceIdRatio
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1"
# Always sample
export OTEL_TRACES_SAMPLER="always_on"
# Never sample
export OTEL_TRACES_SAMPLER="always_off"
Programmatic configuration
import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.sampling import (
AlwaysOnSampler,
AlwaysOffSampler,
ParentBased,
TraceIdRatioBased,
)
from opentelemetry.sdk.resources import Resource
from opentelemetry.semconv.resource import ResourceAttributes
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
def setup_tracing():
# Create OTLP exporter
exporter = OTLPSpanExporter(
endpoint="https://api.uptrace.dev:4317",
headers={"uptrace-dsn": os.getenv("UPTRACE_DSN")},
)
# Create resource
resource = Resource.create({
ResourceAttributes.SERVICE_NAME: "my-service",
ResourceAttributes.SERVICE_VERSION: "1.0.0",
})
# Configure sampler based on environment
env = os.getenv("APP_ENV", "development")
if env == "development":
sampler = AlwaysOnSampler()
elif env == "production":
sampler = ParentBased(TraceIdRatioBased(0.1)) # 10% sampling
elif env == "testing":
sampler = AlwaysOffSampler()
else:
sampler = ParentBased(TraceIdRatioBased(0.25)) # 25% sampling
# Create tracer provider
tracer_provider = TracerProvider(
sampler=sampler,
resource=resource
)
# Add span processor
tracer_provider.add_span_processor(
BatchSpanProcessor(exporter)
)
# Set global tracer provider
trace.set_tracer_provider(tracer_provider)
return tracer_provider
# Usage
setup_tracing()
tracer = trace.get_tracer(__name__)
Custom sampler
You can create custom sampling logic by implementing the Sampler interface:
from typing import Optional, Sequence
from opentelemetry.context import Context
from opentelemetry.sdk.trace.sampling import Sampler, SamplingResult, Decision
from opentelemetry.trace import Link, SpanKind
from opentelemetry.trace.span import TraceState
from opentelemetry.util.types import Attributes
class CustomSampler(Sampler):
def __init__(self, high_priority_rate: float = 1.0, default_rate: float = 0.1):
self.high_priority_rate = high_priority_rate
self.default_rate = default_rate
def should_sample(
self,
parent_context: Optional[Context],
trace_id: int,
name: str,
kind: SpanKind = None,
attributes: Attributes = None,
links: Optional[Sequence[Link]] = None,
trace_state: Optional[TraceState] = None,
) -> SamplingResult:
# Sample high-priority operations at higher rate
if attributes and attributes.get("priority") == "high":
rate = self.high_priority_rate
else:
rate = self.default_rate
# Use trace_id for deterministic sampling
if (trace_id & 0xFFFFFFFFFFFFFFFF) < rate * 0xFFFFFFFFFFFFFFFF:
return SamplingResult(Decision.RECORD_AND_SAMPLE)
else:
return SamplingResult(Decision.DROP)
def get_description(self) -> str:
return f"CustomSampler(high_priority_rate={self.high_priority_rate}, default_rate={self.default_rate})"
# Use custom sampler
tracer_provider = TracerProvider(
sampler=CustomSampler(high_priority_rate=1.0, default_rate=0.1)
)
Debugging sampling
Check sampling decisions
from opentelemetry import trace
from opentelemetry.sdk.trace import Span
def log_sampling_decision(span_name: str):
with tracer.start_as_current_span(span_name) as span:
if isinstance(span, Span):
context = span.get_span_context()
print(f"Span '{span_name}' - Trace ID: {context.trace_id:032x}")
print(f"Sampled: {context.trace_flags.sampled}")
print(f"Valid: {context.is_valid}")
# Your business logic here
pass
# Test sampling
tracer = trace.get_tracer(__name__)
for i in range(10):
log_sampling_decision(f"test-span-{i}")
Monitor sampling rates
import time
from collections import defaultdict
from opentelemetry import trace
from opentelemetry.sdk.trace import Span
class SamplingMonitor:
def __init__(self):
self.total_spans = 0
self.sampled_spans = 0
self.start_time = time.time()
def record_span(self, span):
self.total_spans += 1
if isinstance(span, Span):
context = span.get_span_context()
if context.trace_flags.sampled:
self.sampled_spans += 1
def get_sampling_rate(self):
if self.total_spans == 0:
return 0.0
return self.sampled_spans / self.total_spans
def get_stats(self):
elapsed = time.time() - self.start_time
return {
"total_spans": self.total_spans,
"sampled_spans": self.sampled_spans,
"sampling_rate": self.get_sampling_rate(),
"elapsed_seconds": elapsed,
"spans_per_second": self.total_spans / elapsed if elapsed > 0 else 0
}
# Usage
monitor = SamplingMonitor()
tracer = trace.get_tracer(__name__)
for i in range(100):
with tracer.start_as_current_span(f"test-span-{i}") as span:
monitor.record_span(span)
time.sleep(0.01)
print(monitor.get_stats())
Production considerations
Sampling in microservices
In a microservices architecture, sampling decisions should be made at the root of the trace and propagated to all services. This ensures consistent sampling across the entire distributed trace.
# Use ParentBased sampler in all services
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased
# Root services (entry points) use TraceIdRatioBased
root_sampler = ParentBased(TraceIdRatioBased(0.1))
# Downstream services respect parent sampling decision
downstream_sampler = ParentBased(TraceIdRatioBased(0.1))
tracer_provider = TracerProvider(sampler=root_sampler)
Performance impact
Sampling reduces the performance overhead of tracing:
- CPU usage: Fewer spans to process and export
- Memory usage: Smaller trace buffers
- Network usage: Less data sent to backend
- Storage costs: Reduced storage requirements
Sampling strategies by environment
def get_sampler_for_environment(env: str):
if env == "production":
# Conservative sampling for production
return ParentBased(TraceIdRatioBased(0.01)) # 1%
elif env == "staging":
# Moderate sampling for staging
return ParentBased(TraceIdRatioBased(0.1)) # 10%
elif env == "development":
# Full sampling for development
return AlwaysOnSampler()
else:
# Safe default
return ParentBased(TraceIdRatioBased(0.1))