OpenTelemetry Go Sampling
What is sampling?
Sampling is a process that restricts the number of spans a system generates. In high-volume applications, collecting 100% of traces can be expensive and unnecessary. Sampling lets you collect a representative subset of traces while reducing cost and performance overhead.
Go sampling overview
The OpenTelemetry Go SDK provides head-based sampling, where the sampling decision is made when a span is started, at the beginning of a trace. By default, the tracer provider uses ParentBased(AlwaysSample()): root spans are always sampled, and child spans follow their parent's decision. You can configure a different sampler on the tracer provider with the WithSampler option.
Recommended: Consistent probability sampler
For production use, we recommend using the consistent probability based sampler from the official contrib package. This sampler implements the OpenTelemetry specification for consistent probability sampling and records sampling information in the tracestate, enabling accurate span-to-metrics pipelines and ensuring traces remain complete across service boundaries.
Why use consistent probability sampling?
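The consistent probability sampler propagates two values in the tracestate: the "p-value", which encodes the sampling probability as a power of two (a p-value of p means a probability of 2^-p), and the "r-value", the source of randomness shared by every span in the trace. This approach provides several benefits: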
- Consistent decisions: All services make the same sampling decision for a trace
- Accurate metrics: Spans can be accurately counted using span-to-metrics pipelines
- Complete traces: Traces tend to be complete even when spans make independent sampling decisions
- Cross-service compatibility: Works correctly across different services and vendors
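To make the p-value and r-value mechanics more concrete, here is a minimal sketch of the decision rule described in the OpenTelemetry consistent probability sampling specification: a span is kept when p <= r, and each kept span stands for 2^p spans. It uses only the standard library. The names drawRValue, shouldSample, adjustedCount, and simulate are illustrative assumptions and are not part of the contrib package.

```go
package main

import (
	"fmt"
	"math/bits"
	"math/rand"
)

// drawRValue picks the trace's shared randomness: an integer in [0, 62]
// with P(r >= k) = 2^-k. It counts the leading zeros of 62 random bits.
func drawRValue(rng *rand.Rand) int {
	x := rng.Uint64() >> 2 // 62 random bits; the top two bits are always zero
	return bits.LeadingZeros64(x) - 2
}

// shouldSample applies the consistent rule: a sampler whose probability
// is 2^-p keeps the span when p <= r.
func shouldSample(p, r int) bool {
	return p <= r
}

// adjustedCount is how many spans one sampled span stands for (2^p).
// A p-value of 63 is reserved for "zero probability".
func adjustedCount(p int) uint64 {
	if p < 0 || p >= 63 {
		return 0
	}
	return uint64(1) << uint(p)
}

// simulate counts how many of n traces a sampler with the given p-value keeps.
func simulate(seed int64, n, p int) int {
	rng := rand.New(rand.NewSource(seed))
	kept := 0
	for i := 0; i < n; i++ {
		if shouldSample(p, drawRValue(rng)) {
			kept++
		}
	}
	return kept
}

func main() {
	// One trace, one r-value, three services with different probabilities.
	r := 3
	for _, p := range []int{1, 3, 4} {
		fmt.Printf("p=%d (probability 1/%d): sampled=%v\n",
			p, adjustedCount(p), shouldSample(p, r))
	}
	// Roughly a quarter of all traces should survive p=2.
	fmt.Println("kept at p=2:", simulate(1, 100000, 2), "of 100000")
}
```

Because every service compares its own p-value against the same r-value, a trace kept by a low-probability sampler is also kept by every sampler with a higher probability, which is what keeps traces complete.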
Basic usage
import (
"go.opentelemetry.io/contrib/samplers/probability/consistent"
"go.opentelemetry.io/otel/sdk/trace"
)
// Recommended: Use consistent probability sampling
sampler := consistent.ProbabilityBased(0.1) // Sample 10% of traces
// Wrap it in the parent-based variant so child spans honor the parent's decision
provider := trace.NewTracerProvider(
trace.WithSampler(consistent.ParentProbabilityBased(sampler)),
)
Advanced consistent sampling
import (
"math/rand"
"go.opentelemetry.io/contrib/samplers/probability/consistent"
"go.opentelemetry.io/otel/sdk/trace"
)
// With a seeded math/rand source for deterministic testing
testSampler := consistent.ProbabilityBased(0.25,
consistent.WithRandomSource(rand.NewSource(42)))
// Parent-based consistent sampler with different child behaviors
parentSampler := consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.1), // Root sampler
trace.WithRemoteParentSampled(consistent.ProbabilityBased(0.05)),
trace.WithLocalParentSampled(consistent.ProbabilityBased(0.15)),
)
Built-in samplers
While the consistent probability sampler is recommended for production, the Go SDK also provides these built-in samplers:
AlwaysSample
Samples every trace. Useful in development environments, but use it with caution in production under significant traffic:
import (
"go.opentelemetry.io/otel/sdk/trace"
)
provider := trace.NewTracerProvider(
trace.WithSampler(trace.AlwaysSample()),
)
NeverSample
Samples no traces. Useful for completely disabling tracing:
provider := trace.NewTracerProvider(
trace.WithSampler(trace.NeverSample()),
)
TraceIDRatioBased
Samples a fraction of traces based on the trace ID. The fraction should be between 0.0 and 1.0; a value of 1 or more samples every trace, and a value of 0 or less samples none:
// Sample 10% of traces
provider := trace.NewTracerProvider(
trace.WithSampler(trace.TraceIDRatioBased(0.1)),
)
// Sample 50% of traces
provider := trace.NewTracerProvider(
trace.WithSampler(trace.TraceIDRatioBased(0.5)),
)
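The sketch below illustrates why a trace-ID ratio sampler gives the same answer in every service: the decision is a pure function of the trace ID and the fraction. It approximates the approach the Go SDK is understood to take (compare part of the trace ID against a threshold derived from the fraction) and is not copied from its source. The names ratioThreshold and sampledByRatio are assumptions made for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ratioThreshold converts a fraction into an upper bound on a 63-bit value.
func ratioThreshold(fraction float64) uint64 {
	if fraction <= 0 {
		return 0
	}
	if fraction >= 1 {
		return 1 << 63
	}
	return uint64(fraction * (1 << 63))
}

// sampledByRatio reads the low 8 bytes of the trace ID as a 63-bit number
// and samples the trace when that number falls below the threshold.
func sampledByRatio(traceID [16]byte, fraction float64) bool {
	x := binary.BigEndian.Uint64(traceID[8:16]) >> 1
	return x < ratioThreshold(fraction)
}

func main() {
	var id [16]byte
	id[8] = 0x40 // after the shift this value sits at 25% of the 63-bit range
	for _, f := range []float64{0.1, 0.5, 1.0} {
		fmt.Printf("ratio %.1f: sampled=%v\n", f, sampledByRatio(id, f))
	}
}
```

A useful property of this scheme is that it is monotonic: any trace sampled at 10% is also sampled at 50%, so services with different ratios still agree on the traces kept by the lowest ratio.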
ParentBased
A sampler decorator that behaves differently based on the parent of the span. If the span has no parent, the decorated sampler is used to make the sampling decision. If the span has a parent, the sampler follows the parent's sampling decision:
// ParentBased with TraceIDRatioBased root sampler
provider := trace.NewTracerProvider(
trace.WithSampler(trace.ParentBased(trace.TraceIDRatioBased(0.1))),
)
// ParentBased with AlwaysSample root sampler (default behavior)
provider := trace.NewTracerProvider(
trace.WithSampler(trace.ParentBased(trace.AlwaysSample())),
)
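The parent-based decision described above can be sketched without the SDK. This is a simplified model under stated assumptions: decider stands in for a sampler, parent stands in for the parent span context, and the four delegate fields mirror the cases the SDK lets you override (remote or local parent, sampled or not). All names are hypothetical.

```go
package main

import "fmt"

// decider stands in for a sampler: it returns true to sample.
type decider func() bool

func always() bool { return true }
func never() bool  { return false }

// parent describes the span's parent, if any.
type parent struct {
	valid   bool // false for a root span
	remote  bool // true if the parent came from another process
	sampled bool // the parent's sampled flag
}

// parentBased holds the root sampler plus one delegate per parent case.
type parentBased struct {
	root             decider
	remoteSampled    decider
	remoteNotSampled decider
	localSampled     decider
	localNotSampled  decider
}

// newParentBased uses "follow the parent" defaults for all four cases.
func newParentBased(root decider) parentBased {
	return parentBased{
		root:             root,
		remoteSampled:    always,
		remoteNotSampled: never,
		localSampled:     always,
		localNotSampled:  never,
	}
}

// shouldSample picks the delegate that matches the parent's state.
func (pb parentBased) shouldSample(p parent) bool {
	switch {
	case !p.valid:
		return pb.root()
	case p.remote && p.sampled:
		return pb.remoteSampled()
	case p.remote:
		return pb.remoteNotSampled()
	case p.sampled:
		return pb.localSampled()
	default:
		return pb.localNotSampled()
	}
}

func main() {
	pb := newParentBased(never)
	fmt.Println("root span:", pb.shouldSample(parent{}))
	fmt.Println("sampled remote parent:", pb.shouldSample(parent{valid: true, remote: true, sampled: true}))
	fmt.Println("unsampled local parent:", pb.shouldSample(parent{valid: true}))
}
```

The root sampler only ever sees spans without a parent, which is why a ratio passed as the root sampler controls the share of new traces rather than the share of all spans.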
Configuration options
Environment variables
The built-in samplers can be selected with the OTEL_TRACES_SAMPLER and OTEL_TRACES_SAMPLER_ARG environment variables:
# Built-in TraceIDRatioBased sampler with 50% sampling
export OTEL_TRACES_SAMPLER="traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.5"
# Built-in ParentBased with TraceIDRatioBased
export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1"
# Always sample
export OTEL_TRACES_SAMPLER="always_on"
# Never sample
export OTEL_TRACES_SAMPLER="always_off"
Note: Environment variable configuration does not support the consistent probability sampler. Use programmatic configuration for consistent sampling.
Programmatic configuration with consistent sampling
package main
import (
"context"
"log"
"os"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
"go.opentelemetry.io/otel/sdk/resource"
"go.opentelemetry.io/otel/sdk/trace"
"go.opentelemetry.io/contrib/samplers/probability/consistent"
semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)
func setupTracing(ctx context.Context) func() {
// Create OTLP exporter
exporter, err := otlptracegrpc.New(ctx,
otlptracegrpc.WithEndpoint("api.uptrace.dev:4317"), // host:port only, without a URL scheme
otlptracegrpc.WithHeaders(map[string]string{
"uptrace-dsn": os.Getenv("UPTRACE_DSN"),
}),
)
if err != nil {
log.Fatal(err)
}
// Create resource
res, err := resource.Merge(
resource.Default(),
resource.NewWithAttributes(
semconv.SchemaURL,
semconv.ServiceName("my-service"),
semconv.ServiceVersion("1.0.0"),
),
)
if err != nil {
log.Fatal(err)
}
// Configure consistent probability sampler based on environment
var sampler trace.Sampler
env := os.Getenv("GO_ENV")
switch env {
case "development":
sampler = trace.AlwaysSample()
case "production":
// Recommended: Use consistent probability sampling for production
sampler = consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.1), // 10% sampling
)
default:
sampler = consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.25), // 25% sampling
)
}
// Create tracer provider
tp := trace.NewTracerProvider(
trace.WithBatcher(exporter),
trace.WithResource(res),
trace.WithSampler(sampler),
)
otel.SetTracerProvider(tp)
return func() {
if err := tp.Shutdown(ctx); err != nil {
log.Printf("Error shutting down tracer provider: %v", err)
}
}
}
Custom samplers
Create custom sampling logic by implementing the Sampler interface:
package main
import (
"strings"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/sdk/trace"
)
// CustomSampler implements custom sampling logic
type CustomSampler struct {
defaultSampler trace.Sampler
}
func NewCustomSampler() trace.Sampler {
return &CustomSampler{
defaultSampler: trace.TraceIDRatioBased(0.1), // 10% default sampling
}
}
func (s *CustomSampler) ShouldSample(p trace.SamplingParameters) trace.SamplingResult {
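// Always sample spans started with an error attribute
// (only attributes passed when the span is started are visible to the sampler)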
for _, attr := range p.Attributes {
if attr.Key == "error" && attr.Value.AsBool() {
return trace.SamplingResult{
Decision: trace.RecordAndSample,
Tracestate: p.ParentContext.TraceState(),
}
}
}
// Always sample critical operations
spanName := p.Name
if strings.Contains(spanName, "critical") ||
strings.Contains(spanName, "payment") ||
strings.Contains(spanName, "auth") {
return trace.SamplingResult{
Decision: trace.RecordAndSample,
Tracestate: p.ParentContext.TraceState(),
}
}
// Don't sample health checks
if strings.Contains(spanName, "health") || strings.Contains(spanName, "ping") {
return trace.SamplingResult{
Decision: trace.Drop,
Tracestate: p.ParentContext.TraceState(),
}
}
// Use default sampler for other cases
return s.defaultSampler.ShouldSample(p)
}
func (s *CustomSampler) Description() string {
return "CustomSampler{errors=always,critical=always,health=never,default=10%}"
}
// Usage
func main() {
provider := trace.NewTracerProvider(
trace.WithSampler(NewCustomSampler()),
// other options...
)
otel.SetTracerProvider(provider)
}
Advanced sampling scenarios
Attribute-based sampling
import (
"strings"
"go.opentelemetry.io/otel/sdk/trace"
)
type AttributeBasedSampler struct {
highPrioritySampler trace.Sampler
defaultSampler trace.Sampler
}
func NewAttributeBasedSampler() trace.Sampler {
return &AttributeBasedSampler{
highPrioritySampler: trace.AlwaysSample(),
defaultSampler: trace.TraceIDRatioBased(0.05), // 5% default
}
}
func (s *AttributeBasedSampler) ShouldSample(p trace.SamplingParameters) trace.SamplingResult {
// Check for high priority attributes
for _, attr := range p.Attributes {
switch attr.Key {
case "http.route":
route := attr.Value.AsString()
// Always sample admin routes
if strings.HasPrefix(route, "/admin") {
return s.highPrioritySampler.ShouldSample(p)
}
// Sample 50% of API routes
if strings.HasPrefix(route, "/api") {
return trace.TraceIDRatioBased(0.5).ShouldSample(p)
}
case "user.tier":
// Always sample premium users
if attr.Value.AsString() == "premium" {
return s.highPrioritySampler.ShouldSample(p)
}
case "http.status_code":
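// Always sample error responses. This only works if the status code is
// passed at span start; it is usually unknown until the request completes.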
statusCode := attr.Value.AsInt64()
if statusCode >= 400 {
return s.highPrioritySampler.ShouldSample(p)
}
}
}
return s.defaultSampler.ShouldSample(p)
}
func (s *AttributeBasedSampler) Description() string {
return "AttributeBasedSampler{admin=always,api=50%,premium=always,errors=always,default=5%}"
}
Rate-limiting sampler
import (
"fmt"
"sync"
"time"
"go.opentelemetry.io/otel/sdk/trace"
)
type RateLimitingSampler struct {
maxSamplesPerSecond int
lastSecond int64
currentCount int
mutex sync.Mutex
}
func NewRateLimitingSampler(maxSamplesPerSecond int) trace.Sampler {
return &RateLimitingSampler{
maxSamplesPerSecond: maxSamplesPerSecond,
}
}
func (s *RateLimitingSampler) ShouldSample(p trace.SamplingParameters) trace.SamplingResult {
s.mutex.Lock()
defer s.mutex.Unlock()
now := time.Now().Unix()
if now != s.lastSecond {
s.lastSecond = now
s.currentCount = 0
}
if s.currentCount < s.maxSamplesPerSecond {
s.currentCount++
return trace.SamplingResult{
Decision: trace.RecordAndSample,
Tracestate: p.ParentContext.TraceState(),
}
}
return trace.SamplingResult{
Decision: trace.Drop,
Tracestate: p.ParentContext.TraceState(),
}
}
func (s *RateLimitingSampler) Description() string {
return fmt.Sprintf("RateLimitingSampler{%d samples/second}", s.maxSamplesPerSecond)
}
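The sampler above uses a fixed one-second window, which can admit up to twice the limit when a burst straddles a window boundary. As an alternative, the following sketch uses a token bucket, a different technique that refills continuously. It is standard-library only and not tied to the SDK; tokenBucket, allow, and simulate are hypothetical names, and the clock is injected so the behavior can be checked deterministically. In a real sampler, allow() would replace the counter check inside ShouldSample.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket admits up to `burst` events at once and refills at `rate` per second.
type tokenBucket struct {
	mu     sync.Mutex
	rate   float64 // tokens added per second
	burst  float64 // maximum number of stored tokens
	tokens float64
	last   time.Time
	now    func() time.Time // injected clock, so tests can control time
}

func newTokenBucket(rate, burst float64, now func() time.Time) *tokenBucket {
	return &tokenBucket{rate: rate, burst: burst, tokens: burst, last: now(), now: now}
}

// allow refills the bucket for the elapsed time and spends one token if available.
func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	t := b.now()
	b.tokens += t.Sub(b.last).Seconds() * b.rate
	b.last = t
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// simulate issues `calls` requests spaced stepMillis apart on a fake clock
// and returns how many were allowed.
func simulate(rate, burst float64, calls, stepMillis int) int {
	clock := time.Unix(0, 0)
	b := newTokenBucket(rate, burst, func() time.Time { return clock })
	allowed := 0
	for i := 0; i < calls; i++ {
		if b.allow() {
			allowed++
		}
		clock = clock.Add(time.Duration(stepMillis) * time.Millisecond)
	}
	return allowed
}

func main() {
	fmt.Println("10 calls at once, burst 2:", simulate(2, 2, 10, 0))
	fmt.Println("10 calls, 500ms apart, 2/s:", simulate(2, 2, 10, 500))
}
```

Note that any rate-limiting sampler decides per span rather than per trace, so it is usually best applied to root spans only, for example as the root sampler of a parent-based sampler.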
Production deployment examples
HTTP server implementation
package main
import (
"context"
"fmt"
"log"
"net/http"
"os"
"time"
"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
"go.opentelemetry.io/contrib/samplers/probability/consistent"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
"go.opentelemetry.io/otel/sdk/resource"
"go.opentelemetry.io/otel/sdk/trace"
semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
)
func main() {
ctx := context.Background()
// Setup tracing
shutdown := setupTracing(ctx)
defer shutdown()
// Create HTTP handler with OpenTelemetry instrumentation
handler := http.HandlerFunc(handleRequest)
wrappedHandler := otelhttp.NewHandler(handler, "my-server")
// Start server
log.Println("Server starting on :8080")
if err := http.ListenAndServe(":8080", wrappedHandler); err != nil {
log.Fatal(err)
}
}
func handleRequest(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
tracer := otel.Tracer("my-service")
// Create custom span
ctx, span := tracer.Start(ctx, "handle-request")
defer span.End()
// Check if span is recording to avoid expensive operations
if span.IsRecording() {
span.SetAttributes(
attribute.String("http.method", r.Method),
attribute.String("http.url", r.URL.String()),
attribute.String("user.agent", r.UserAgent()),
)
}
// Simulate some work
time.Sleep(50 * time.Millisecond)
// Add more attributes only if recording
if span.IsRecording() {
span.SetAttributes(
attribute.Int("http.status_code", 200),
attribute.String("response.type", "success"),
)
}
w.WriteHeader(http.StatusOK)
fmt.Fprintf(w, "Hello, World!")
}
func setupTracing(ctx context.Context) func() {
exporter, err := otlptracegrpc.New(ctx,
otlptracegrpc.WithEndpoint("api.uptrace.dev:4317"), // host:port only, without a URL scheme
otlptracegrpc.WithHeaders(map[string]string{
"uptrace-dsn": os.Getenv("UPTRACE_DSN"),
}),
)
if err != nil {
log.Fatal(err)
}
res, err := resource.Merge(
resource.Default(),
resource.NewWithAttributes(
semconv.SchemaURL,
semconv.ServiceName("my-http-server"),
semconv.ServiceVersion("1.0.0"),
semconv.DeploymentEnvironment(os.Getenv("ENVIRONMENT")),
),
)
if err != nil {
log.Fatal(err)
}
// Configure sampling based on environment
var sampler trace.Sampler
switch os.Getenv("ENVIRONMENT") {
case "development":
sampler = trace.AlwaysSample()
case "production":
// Recommended: Use consistent probability sampling for production
sampler = consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.1), // 10% sampling
)
default:
sampler = consistent.ParentProbabilityBased(
consistent.ProbabilityBased(0.25), // 25% sampling
)
}
tp := trace.NewTracerProvider(
trace.WithBatcher(exporter,
trace.WithBatchTimeout(time.Second),
trace.WithMaxExportBatchSize(100),
),
trace.WithResource(res),
trace.WithSampler(sampler),
)
otel.SetTracerProvider(tp)
return func() {
if err := tp.Shutdown(ctx); err != nil {
log.Printf("Error shutting down tracer provider: %v", err)
}
}
}
Database operations with sampling
package main
import (
"context"
"database/sql"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/attribute"
"go.opentelemetry.io/otel/codes"
"go.opentelemetry.io/otel/trace"
)
type UserService struct {
db *sql.DB
tracer trace.Tracer
}
func NewUserService(db *sql.DB) *UserService {
return &UserService{
db: db,
tracer: otel.Tracer("user-service"),
}
}
func (s *UserService) CreateUser(ctx context.Context, user *User) error {
ctx, span := s.tracer.Start(ctx, "create-user")
defer span.End()
// Only add expensive attributes if recording
if span.IsRecording() {
span.SetAttributes(
attribute.String("user.email", user.Email),
attribute.String("user.role", user.Role),
attribute.Int("user.age", user.Age),
)
}
query := "INSERT INTO users (email, name, role, age) VALUES (?, ?, ?, ?)"
_, err := s.db.ExecContext(ctx, query, user.Email, user.Name, user.Role, user.Age)
if err != nil {
span.RecordError(err)
span.SetStatus(codes.Error, "Failed to create user")
return err
}
if span.IsRecording() {
span.SetAttributes(
attribute.String("operation.result", "success"),
)
span.AddEvent("user-created")
}
return nil
}
func (s *UserService) GetUser(ctx context.Context, userID int) (*User, error) {
ctx, span := s.tracer.Start(ctx, "get-user")
defer span.End()
if span.IsRecording() {
span.SetAttributes(
attribute.Int("user.id", userID),
)
}
var user User
query := "SELECT id, email, name, role, age FROM users WHERE id = ?"
err := s.db.QueryRowContext(ctx, query, userID).Scan(
&user.ID, &user.Email, &user.Name, &user.Role, &user.Age,
)
if err != nil {
if err == sql.ErrNoRows {
span.SetStatus(codes.Error, "User not found")
if span.IsRecording() {
span.SetAttributes(
attribute.String("error.type", "not_found"),
)
}
} else {
span.RecordError(err)
span.SetStatus(codes.Error, "Database query failed")
}
return nil, err
}
if span.IsRecording() {
span.SetAttributes(
attribute.String("user.email", user.Email),
attribute.String("operation.result", "success"),
)
}
return &user, nil
}
type User struct {
ID int `json:"id"`
Email string `json:"email"`
Name string `json:"name"`
Role string `json:"role"`
Age int `json:"age"`
}
Monitoring sampling performance
Statistics collector
package main
import (
"context"
"log"
"sync/atomic"
"time"
"go.opentelemetry.io/otel/sdk/trace"
)
type SamplingStatsCollector struct {
totalSpans int64
sampledSpans int64
ticker *time.Ticker
done chan bool
}
func NewSamplingStatsCollector() *SamplingStatsCollector {
return &SamplingStatsCollector{
ticker: time.NewTicker(30 * time.Second),
done: make(chan bool),
}
}
func (c *SamplingStatsCollector) Start() {
go func() {
for {
select {
case <-c.ticker.C:
c.logStats()
case <-c.done:
return
}
}
}()
}
func (c *SamplingStatsCollector) Stop() {
c.ticker.Stop()
c.done <- true
}
func (c *SamplingStatsCollector) RecordSpan(sampled bool) {
atomic.AddInt64(&c.totalSpans, 1)
if sampled {
atomic.AddInt64(&c.sampledSpans, 1)
}
}
func (c *SamplingStatsCollector) logStats() {
total := atomic.LoadInt64(&c.totalSpans)
sampled := atomic.LoadInt64(&c.sampledSpans)
if total > 0 {
rate := float64(sampled) / float64(total) * 100
log.Printf("Sampling statistics: %.2f%% (%d/%d spans sampled)",
rate, sampled, total)
// Alert if sampling rate is unexpected
if rate > 20.0 {
log.Printf("WARNING: High sampling rate detected: %.2f%%", rate)
} else if rate < 1.0 {
log.Printf("WARNING: Low sampling rate detected: %.2f%%", rate)
}
}
}
// Custom span processor to collect sampling stats
type SamplingStatsProcessor struct {
collector *SamplingStatsCollector
}
func NewSamplingStatsProcessor(collector *SamplingStatsCollector) trace.SpanProcessor {
return &SamplingStatsProcessor{
collector: collector,
}
}
func (p *SamplingStatsProcessor) OnStart(parent context.Context, s trace.ReadWriteSpan) {
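// Record that a span was created.
// Note: the SDK calls OnStart only for recording spans, so dropped spans never
// reach this processor and the reported rate covers recorded spans only.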
sampled := s.SpanContext().IsSampled()
p.collector.RecordSpan(sampled)
}
func (p *SamplingStatsProcessor) OnEnd(s trace.ReadOnlySpan) {
// Nothing to do on span end for stats collection
}
func (p *SamplingStatsProcessor) Shutdown(ctx context.Context) error {
return nil
}
func (p *SamplingStatsProcessor) ForceFlush(ctx context.Context) error {
return nil
}
// Usage example
func main() {
ctx := context.Background()
// Create stats collector
statsCollector := NewSamplingStatsCollector()
statsCollector.Start()
defer statsCollector.Stop()
// Setup tracer provider with stats processor
tp := trace.NewTracerProvider(
trace.WithSampler(trace.TraceIDRatioBased(0.1)),
trace.WithSpanProcessor(NewSamplingStatsProcessor(statsCollector)),
// other processors...
)
defer tp.Shutdown(ctx)
// Use the tracer...
}
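Since a span processor only sees spans that are being recorded, one way to measure the real sampling rate is to count decisions where they are made, by wrapping the sampler itself. The sketch below shows the idea with stand-in types so it runs without the SDK: decideFunc represents the wrapped sampler's decision and countingSampler is a hypothetical wrapper. With the real SDK, the wrapper would implement ShouldSample and Description and compare the result's Decision against trace.RecordAndSample.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// decideFunc is a stand-in for a sampler's decision: true means "sample".
type decideFunc func(traceID uint64) bool

// countingSampler wraps a decision function and counts every decision,
// including the ones that drop the span.
type countingSampler struct {
	inner   decideFunc
	total   atomic.Int64
	sampled atomic.Int64
}

// shouldSample delegates to the wrapped function and updates the counters.
func (c *countingSampler) shouldSample(traceID uint64) bool {
	ok := c.inner(traceID)
	c.total.Add(1)
	if ok {
		c.sampled.Add(1)
	}
	return ok
}

// rate returns the observed sampling rate as a percentage.
func (c *countingSampler) rate() float64 {
	total := c.total.Load()
	if total == 0 {
		return 0
	}
	return float64(c.sampled.Load()) / float64(total) * 100
}

// observe feeds trace IDs 0..n-1 through a sampler that keeps one in `every`
// and returns the measured rate.
func observe(n, every uint64) float64 {
	c := &countingSampler{inner: func(id uint64) bool { return id%every == 0 }}
	for id := uint64(0); id < n; id++ {
		c.shouldSample(id)
	}
	return c.rate()
}

func main() {
	fmt.Printf("observed rate: %.1f%%\n", observe(100, 10))
}
```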
Best practices
Performance optimization
When implementing sampling, consider these performance best practices:
- Check span recording status: Use span.IsRecording() before adding expensive attributes or operations
- Minimize allocations: Reuse attribute slices and avoid unnecessary string concatenations in hot paths
- Batch span processing: Use WithBatcher() instead of simple span processors for better performance
- Configure appropriate batch sizes: Balance memory usage and export frequency based on your traffic patterns
Production considerations
For production deployments:
- Use consistent probability sampling: Prefer the consistent probability sampler over built-in samplers for production workloads as it records sampling information in tracestate and enables accurate span-to-metrics pipelines
- Start conservative: Begin with lower sampling rates (1-5%) and increase gradually based on your observability needs
- Monitor sampling effectiveness: Track sampling rates and ensure you're capturing enough data for meaningful analysis
- Consider costs: Balance observability value with storage and processing costs
- Use environment-based configuration: Configure different sampling rates for development, staging, and production
- Implement fallback strategies: Ensure your application continues to function even if sampling configuration fails
- Verify tracestate consistency: When using consistent sampling, monitor traces to ensure they don't contain multiple distinct r-values, which would indicate inconsistent sampling
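As one possible way to implement the fallback strategy mentioned in the list above, the following sketch parses a sampling ratio from configuration and falls back to a safe default when the value is missing or invalid. The function name samplingRatio and the TRACE_SAMPLING_RATIO variable are assumptions made for this example; they are not defined by OpenTelemetry.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"strconv"
	"strings"
)

// samplingRatio parses raw as a fraction in [0, 1]. It returns fallback when
// the value is empty, malformed, NaN, or out of range, so a bad setting
// degrades to a known rate instead of breaking startup.
func samplingRatio(raw string, fallback float64) float64 {
	v, err := strconv.ParseFloat(strings.TrimSpace(raw), 64)
	if err != nil || math.IsNaN(v) || v < 0 || v > 1 {
		return fallback
	}
	return v
}

func main() {
	// TRACE_SAMPLING_RATIO is a made-up variable name for this sketch.
	ratio := samplingRatio(os.Getenv("TRACE_SAMPLING_RATIO"), 0.05)
	fmt.Printf("using sampling ratio %.2f\n", ratio)
}
```

The resulting ratio can then be passed to whichever probability sampler you configure, with the fallback chosen conservatively for production.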