OpenTelemetry Collector Configuration Tutorial
The OpenTelemetry Collector receives, processes, and exports telemetry data (traces, metrics, logs) from your applications. You configure receivers (OTLP, Prometheus), processors (batch, sampling), and exporters (Jaeger, cloud providers) in YAML. Default ports: 4317 (gRPC) and 4318 (HTTP). Validate a config with otelcol-contrib validate --config=config.yaml.
What You'll Learn
By the end of this tutorial, you'll know how to:
- Set up a basic Collector configuration
- Configure receivers, processors, and exporters
- Handle different telemetry data types
- Implement common use cases like data sampling and enrichment
- Troubleshoot configuration issues
Prerequisites
- Basic understanding of YAML
- Familiarity with observability concepts (traces, metrics, logs)
- A running application that generates telemetry data (we'll provide examples if you don't have one)
Collector Architecture
The Collector architecture consists of:
[Your App] → [Receivers] → [Processors] → [Exporters] → [Backend]
- Receivers: How data gets into the Collector (OTLP, Jaeger, Prometheus, etc.)
- Processors: What happens to data in transit (sampling, filtering, enriching)
- Exporters: Where data goes next (Jaeger, Prometheus, cloud providers)
- Pipelines: Connect receivers → processors → exporters for each data type
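In YAML, a pipeline mirrors this flow directly. For example, a traces pipeline (the component names are placeholders for ones configured later in this tutorial):
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [jaeger]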
Quick Reference
Essential Components to Remember
- otlp receiver for modern apps
- batch processor for performance
- memory_limiter processor for stability
- debug exporter for testing
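A minimal config that wires these four essentials together might look like this (a sketch; the memory limits are illustrative):
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  batch: {}
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch] # memory_limiter goes first
      exporters: [debug]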
Default Ports
- 4317: OTLP gRPC
- 4318: OTLP HTTP
- 8888: Collector metrics
- 8889: Prometheus exporter (if configured)
Useful Commands
# Validate config
otelcol-contrib validate --config=config.yaml
# Run with debug logging (set service::telemetry::logs::level: debug in the config)
otelcol-contrib --config=config.yaml
# Check collector health
curl http://localhost:8888/metrics
Basic Configuration Structure
Every Collector config follows this YAML structure:
# collector-config.yaml
receivers:
# How to receive data
processors:
# How to process data (optional)
exporters:
# Where to send data
service:
pipelines:
# Connect everything together
First Configuration
Let's start with a minimal setup that receives OTLP data and exports it to the console:
# basic-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
exporters:
debug:
verbosity: detailed
service:
pipelines:
traces:
receivers: [otlp]
exporters: [debug]
metrics:
receivers: [otlp]
exporters: [debug]
logs:
receivers: [otlp]
exporters: [debug]
Run it:
otelcol --config=basic-config.yaml
This config:
- Accepts OTLP data on standard ports (4317 for gRPC, 4318 for HTTP)
- Prints all received data to the console
- Handles traces, metrics, and logs separately
Configuration Example
Here's a more practical setup that you might use in production:
# production-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
# Scrape Prometheus metrics
prometheus:
config:
scrape_configs:
- job_name: 'my-service'
static_configs:
- targets: ['localhost:8080']
processors:
# Sample traces to reduce volume
probabilistic_sampler:
sampling_percentage: 10.0
# Add resource attributes
resource:
attributes:
- key: environment
value: production
action: upsert
- key: service.version
from_attribute: app.version
action: insert
# Batch data for efficiency
batch:
timeout: 1s
send_batch_size: 1024
exporters:
  # Export to Jaeger (newer Collector releases removed this dedicated exporter;
  # with those, point an otlp exporter at Jaeger's OTLP port instead)
jaeger:
endpoint: jaeger-collector:14250
tls:
insecure: true
# Export to Prometheus
prometheus:
endpoint: "0.0.0.0:8889"
# Export to cloud provider
otlp/uptrace:
endpoint: https://api.uptrace.dev:4317
headers:
"uptrace-dsn": "${UPTRACE_DSN}"
service:
pipelines:
traces:
receivers: [otlp]
processors: [probabilistic_sampler, resource, batch]
exporters: [jaeger, otlp/uptrace]
metrics:
receivers: [otlp, prometheus]
processors: [resource, batch]
exporters: [prometheus, otlp/uptrace]
logs:
receivers: [otlp]
processors: [resource, batch]
exporters: [otlp/uptrace]
Configuration Patterns
Pattern 1: Multi-Environment Setup
Use different configs for dev/staging/prod:
# Use environment variables for flexibility
exporters:
jaeger:
endpoint: ${JAEGER_ENDPOINT}
processors:
probabilistic_sampler:
    sampling_percentage: ${env:SAMPLING_RATE:-100.0} # Defaults to 100% if SAMPLING_RATE is unset
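To switch environments you only change the environment variables, not the config file. For example (the endpoint values here are placeholders):
# Development: keep every trace, send to a local Jaeger
export JAEGER_ENDPOINT=localhost:14250
export SAMPLING_RATE=100.0
otelcol-contrib --config=collector-config.yaml

# Production: sample 10% of traces, send to the prod Jaeger collector
export JAEGER_ENDPOINT=jaeger-prod:14250
export SAMPLING_RATE=10.0
otelcol-contrib --config=collector-config.yaml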
Pattern 2: Data Enrichment
Add context to your telemetry:
processors:
resource:
attributes:
- key: k8s.cluster.name
value: ${K8S_CLUSTER_NAME}
action: upsert
- key: deployment.environment
value: ${ENVIRONMENT}
action: upsert
transform:
trace_statements:
- context: span
statements:
- set(attributes["custom.field"], "processed-by-collector")
Pattern 3: Data Filtering
Remove unwanted data:
processors:
filter:
traces:
span:
- 'attributes["http.route"] == "/health"'
- 'name == "GET /metrics"'
metrics:
metric:
- 'name == "unwanted_metric"'
Receivers
OTLP Receiver (Most Common)
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
# Optional: TLS configuration
tls:
cert_file: /path/to/cert.pem
key_file: /path/to/key.pem
http:
endpoint: 0.0.0.0:4318
cors:
allowed_origins: ["*"] # Be more restrictive in production
Prometheus Receiver
receivers:
prometheus:
config:
scrape_configs:
- job_name: 'my-app'
scrape_interval: 30s
static_configs:
- targets: ['app:8080']
metrics_path: /metrics
Filelog Receiver (for logs)
receivers:
filelog:
include: [/var/log/myapp/*.log]
operators:
- type: json_parser
timestamp:
parse_from: attributes.timestamp
layout: '%Y-%m-%d %H:%M:%S'
Processors
Batch Processor (Essential for Performance)
processors:
batch:
timeout: 1s # How long to wait before sending
send_batch_size: 512 # Send when this many items collected
send_batch_max_size: 1024 # Never exceed this size
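Memory Limiter Processor (Essential for Stability)
The Quick Reference lists memory_limiter as essential; here is a minimal sketch (the limits are illustrative, tune them to your deployment):
processors:
  memory_limiter:
    check_interval: 1s    # required: how often memory usage is checked
    limit_mib: 512        # hard limit on the Collector's memory
    spike_limit_mib: 128  # headroom: data is refused once usage passes limit_mib - spike_limit_mib
Put memory_limiter first in each pipeline's processors list so it can push back before other processors buffer data.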
Sampling Processors
processors:
# Sample 10% of traces
probabilistic_sampler:
sampling_percentage: 10.0
# More sophisticated sampling
tail_sampling:
decision_wait: 10s
policies:
- name: error-traces
type: status_code
status_code: {status_codes: [ERROR]}
- name: slow-traces
type: latency
latency: {threshold_ms: 1000}
- name: random-sample
type: probabilistic
probabilistic: {sampling_percentage: 1.0}
Exporters
Exporters send processed data to one or more backends. For complete exporter configuration including all available backends, authentication, and production patterns, see OpenTelemetry Collector Exporters.
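Most backends accept OTLP directly. As a sketch (the endpoint is a placeholder, and the retry/queue values are examples of the available knobs rather than recommendations):
exporters:
  otlp/backend:
    endpoint: backend.example.com:4317
    tls:
      insecure: false
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_elapsed_time: 300s
    sending_queue:
      enabled: true
      queue_size: 1000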
Cloud Provider Exporters
exporters:
# Google Cloud
googlecloud:
project: my-gcp-project
# AWS X-Ray
awsxray:
region: us-west-2
# Azure Monitor
azuremonitor:
connection_string: ${APPLICATIONINSIGHTS_CONNECTION_STRING}
File Exporter (for debugging)
exporters:
file:
path: /tmp/otel-data.json
rotation:
max_megabytes: 100
max_days: 7
max_backups: 3
Environment Variables and Secrets
Keep sensitive data out of your config files:
# In your config
exporters:
otlp/backend:
endpoint: ${BACKEND_ENDPOINT}
headers:
authorization: "Bearer ${API_TOKEN}"
# In your environment
export BACKEND_ENDPOINT="https://api.example.com"
export API_TOKEN="your-secret-token"
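When the Collector runs in a container, pass the same variables with -e; this sketch mounts the config at the contrib image's default path (see the Docker section below):
docker run --rm \
  -e BACKEND_ENDPOINT="https://api.example.com" \
  -e API_TOKEN="your-secret-token" \
  -p 4317:4317 -p 4318:4318 \
  -v "$(pwd)/collector-config.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:latest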
Docker Deployment
Here's a complete Docker setup:
# Dockerfile
FROM otel/opentelemetry-collector-contrib:latest
COPY collector-config.yaml /etc/otelcol-contrib/config.yaml
EXPOSE 4317 4318 8889
# docker-compose.yml
version: '3.8'
services:
otel-collector:
build: .
ports:
- "4317:4317" # OTLP gRPC
- "4318:4318" # OTLP HTTP
- "8889:8889" # Prometheus metrics
environment:
      - JAEGER_ENDPOINT=jaeger:14250
volumes:
- ./logs:/var/log
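Bring it up and confirm the Collector started cleanly:
docker compose up -d
docker compose logs -f otel-collector           # watch the startup logs for errors
curl -s http://localhost:8888/metrics | head    # Collector self-metrics (port 8888 must be published)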
Testing Your Configuration
1. Validate Syntax
otelcol-contrib validate --config=your-config.yaml
2. Check What's Running
# The collector exposes metrics about itself
curl http://localhost:8888/metrics
3. Send Test Data
# Send a test trace using curl
curl -X POST http://localhost:4318/v1/traces \
-H "Content-Type: application/json" \
-d '{
"resourceSpans": [{
"resource": {"attributes": [{"key": "service.name", "value": {"stringValue": "test-service"}}]},
"scopeSpans": [{
"spans": [{
"traceId": "5b8aa5a2d2c872e8321cf37308d69df2",
"spanId": "051581bf3cb55c13",
"name": "test-span",
"kind": "SPAN_KIND_CLIENT",
"startTimeUnixNano": "1640995200000000000",
"endTimeUnixNano": "1640995200100000000"
}]
}]
}]
}'
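Alternatively, the telemetrygen utility from the opentelemetry-collector-contrib repository can generate test traffic for you (this assumes you have Go installed to fetch it):
# Install once
go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest
# Send a few test traces to the local OTLP gRPC endpoint (localhost:4317)
telemetrygen traces --otlp-insecure --traces 5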
Common Troubleshooting
Configuration Not Loading
- Check YAML syntax (indentation matters!)
- Verify file permissions
- Look for typos in component names
Data Not Flowing
- Check that receivers are listening on the right ports
- Verify pipeline connections (receivers → processors → exporters)
- Look at the Collector's own logs (enable debug logging via service::telemetry::logs::level: debug in the config):
otelcol-contrib --config=config.yaml
Performance Issues
- Add batch processor if missing
- Reduce sampling rates
- Check memory_limiter processor configuration
Memory Usage Growing
processors:
  memory_limiter:
    check_interval: 1s   # required; how often memory usage is checked
    limit_mib: 512
    spike_limit_mib: 128
Advanced Use Cases
Multi-Pipeline Setup
service:
pipelines:
# High-priority traces (errors) with no sampling
traces/errors:
receivers: [otlp]
processors: [filter/errors, batch]
exporters: [jaeger]
# Normal traces with sampling
traces/sampled:
receivers: [otlp]
processors: [filter/normal, probabilistic_sampler, batch]
exporters: [jaeger]
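The filter/errors and filter/normal processors referenced above aren't defined in this snippet; one possible definition uses the filter processor's OTTL conditions (a span matching a condition is dropped):
processors:
  # Keep only error spans by dropping everything that is not an error
  filter/errors:
    error_mode: ignore
    traces:
      span:
        - 'status.code != STATUS_CODE_ERROR'
  # Keep only non-error spans (errors are handled by the traces/errors pipeline)
  filter/normal:
    error_mode: ignore
    traces:
      span:
        - 'status.code == STATUS_CODE_ERROR'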
Data Routing by Attributes
processors:
routing:
from_attribute: service.name
table:
- value: frontend
exporters: [jaeger, prometheus/frontend]
- value: backend
exporters: [jaeger, prometheus/backend]
Got questions? The OpenTelemetry community is incredibly helpful—check out the CNCF Slack #opentelemetry channel or the GitHub discussions.