OpenTelemetry Operator for Kubernetes
⚡ If you have already followed our guide on Monitoring Kubernetes with OpenTelemetry Collector, this article will show you how to automate the deployment and instrumentation of your applications using the OpenTelemetry Operator.
Learn how to easily manage Collectors, enable auto-instrumentation, and scale observability in Kubernetes without manually editing YAML files.
What is the OpenTelemetry Operator?
The OpenTelemetry Operator simplifies deployment and management of OpenTelemetry components in Kubernetes. It automates the creation of Collectors, enables auto-instrumentation for workloads, and manages configurations using Kubernetes-native CRDs.
It leverages the Kubernetes Operator pattern to watch custom resources and maintain the desired state of your observability stack.
How the Operator Works
The Operator observes resources and automatically manages telemetry collection:
- OpenTelemetryCollector: Configures collectors for metrics, logs, and traces.
- Instrumentation: Injects language-specific agents (Java, Python, Node.js, .NET, Go) into applications.
- Annotations: Applied to pods to trigger auto-instrumentation.
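For example, a sidecar-mode Collector is defined once and injected into any pod that asks for it. A minimal sketch (the resource name and the `debug` exporter are illustrative placeholders for your real pipeline):
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: sidecar-collector
spec:
  mode: sidecar
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    exporters:
      debug:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]
Pods opt in by adding the `sidecar.opentelemetry.io/inject: "true"` annotation to their template metadata.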
Note: Starting with Kubernetes 1.29+, the Operator automatically uses native sidecar containers for better pod lifecycle management. This ensures sidecars start before the main container and shut down after it. Native sidecars can be disabled with `--feature-gates=-sidecarcontainers.native` if needed.
When to Use the OpenTelemetry Operator?
| Feature | Operator | Manual Deployment |
|---|---|---|
| Collector deployment | Automated via CRDs | Manual YAML files |
| Auto-instrumentation | Yes (Java, Python, Node.js, .NET, Go) | No - requires code changes |
| Configuration updates | Automatic reconciliation | Manual kubectl/helm commands |
| Multi-collector management | Simplified with CRDs | Complex YAML management |
| Best for | Dynamic environments, auto-instrumentation needs | Static setups, full control required |
The Operator is recommended when:
- You need auto-instrumentation without modifying application code
- You manage multiple collectors across different namespaces
- You want GitOps-friendly declarative configuration
Manual deployment is a better fit when:
- You run a simple single-collector setup
- You need maximum control over every configuration detail
- You use custom deployment patterns the Operator doesn't support
Prerequisites
- Kubernetes 1.27+
- kubectl access with cluster admin permissions
- Helm 3.16+ (if using Helm)
- cert-manager installed for admission webhooks:
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.yaml
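Before installing the Operator, confirm that the cert-manager pods are up and Running:
kubectl get pods -n cert-manager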
Installing the Operator
You can install the OpenTelemetry Operator in your Kubernetes cluster with kubectl or Helm:
Using kubectl
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
Using Helm
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
--namespace opentelemetry-operator-system \
--create-namespace
Verify installation:
kubectl get pods -n opentelemetry-operator-system
Helm Chart Configuration
The Operator Helm chart provides various configuration options to customize your deployment.
💡 Complete Observability Stack: For production deployments, explore our
Uptrace Helm Charts which provide a
complete solution including Uptrace APM, OpenTelemetry Operator & Collector,
ClickHouse storage, PostgreSQL, and Redis — everything needed for production
Kubernetes deployments. See the deployment guide
for step-by-step instructions.
Custom Values
Create a values.yaml file to customize the installation:
# values.yaml
manager:
  resources:
    limits:
      cpu: 200m
      memory: 256Mi
    requests:
      cpu: 100m
      memory: 128Mi
  # Restrict operator to specific namespaces
  env:
    WATCH_NAMESPACE: "production,staging"
admissionWebhooks:
  create: true
  certManager:
    enabled: true
kubernetesClusterDomain: cluster.local
Install with custom values:
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
--namespace opentelemetry-operator-system \
--create-namespace \
--values values.yaml
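If you change values.yaml later, roll the update out with a standard Helm upgrade:
helm upgrade opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator-system \
  --values values.yaml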
Common Configuration Options
| Parameter | Description | Default |
|---|---|---|
| `manager.replicas` | Number of operator replicas | `1` |
| `manager.resources` | Resource limits and requests | See values.yaml |
| `manager.env.WATCH_NAMESPACE` | Limit the operator to specific namespaces | `""` (all) |
| `admissionWebhooks.certManager.enabled` | Use cert-manager for webhook certificates | `true` |
| `admissionWebhooks.autoGenerateCert.enabled` | Auto-generate self-signed certificates | `false` |
| `kubernetesClusterDomain` | Cluster domain name | `cluster.local` |
Note: When using `manager.env.WATCH_NAMESPACE`, you can specify multiple namespaces separated by commas (e.g., `"production,staging"`). Leave it empty to watch all namespaces.
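If you prefer not to run cert-manager, the chart can generate a self-signed certificate for the admission webhooks instead. A minimal values.yaml sketch based on the options in the table above:
admissionWebhooks:
  certManager:
    enabled: false
  autoGenerateCert:
    enabled: true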
Deploying a Collector
💡 Backend Configuration: This guide uses Uptrace as an example backend.
OpenTelemetry is vendor-neutral - you can use Jaeger, Grafana Cloud, Datadog,
Prometheus, or any OTLP-compatible platform.
See other backend examples below.
Define a Collector using CRDs:
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  mode: deployment
  replicas: 2
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
        timeout: 10s
        send_batch_size: 512
      resourcedetection:
        detectors: [env, system, k8snode]
    exporters:
      otlp/uptrace:
        endpoint: api.uptrace.dev:4317
        headers:
          uptrace-dsn: '<FIXME>'
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/uptrace]
        metrics:
          receivers: [otlp]
          processors: [resourcedetection, batch]
          exporters: [otlp/uptrace]
Apply:
kubectl apply -f collector.yaml
kubectl get otelcol
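The Operator renders the custom resource into a regular Deployment and Service, which by convention carry a -collector suffix. Assuming the otel-collector example above, you can inspect and tail them with:
kubectl get deployment,service -l app.kubernetes.io/managed-by=opentelemetry-operator
kubectl logs deployment/otel-collector-collector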
Note: Adjust `batch.timeout` and `send_batch_size` in the processors section for high-throughput production workloads.
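For example, a higher-throughput pipeline might batch more aggressively; the numbers below are illustrative starting points rather than recommendations:
processors:
  batch:
    timeout: 5s
    send_batch_size: 2048
    send_batch_max_size: 4096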
For file-based logs, see our guide on the Filelog Receiver.
Auto-Instrumentation
One of the powerful features of OpenTelemetry Operator is its ability to automatically instrument applications in your Kubernetes cluster. Enable by adding annotations to deployments:
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"
Supported languages: Java, Python, Node.js, .NET, Go.
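The annotation must end up on the pod template (the Operator's webhook mutates Pods, not Deployments), so place it under spec.template.metadata. A minimal sketch with an illustrative app name and image:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-java-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-java-app
  template:
    metadata:
      labels:
        app: my-java-app
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: app
          image: my-registry/my-java-app:1.0.0 # illustrative image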
Important:
- Create the Instrumentation CRD before adding annotations to pods (see the Custom Resource Definitions section)
- The OpenTelemetry Collector must be deployed and running before instrumentation works
- Restart pods after adding annotations for auto-instrumentation to take effect: `kubectl rollout restart deployment/<your-app>`
- Verify instrumentation by checking pod events: `kubectl describe pod <pod-name>`
Custom Resource Definitions
The Operator introduces several CRDs for managing OpenTelemetry components.
OpenTelemetryCollector CRD
Complete example with all common fields:
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: example-collector
spec:
  mode: deployment # deployment, daemonset, statefulset, or sidecar
  replicas: 2
  image: otel/opentelemetry-collector-contrib:0.136.0
  # Environment variables
  env:
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
  # Resource configuration
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 128Mi
  # Volume mounts
  volumeMounts:
    - name: config
      mountPath: /conf
  volumes:
    - name: config
      configMap:
        name: extra-config
  # Collector configuration
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      memory_limiter:
        check_interval: 1s
        limit_mib: 512
    exporters:
      otlp/uptrace:
        endpoint: api.uptrace.dev:4317
        headers:
          uptrace-dsn: ''
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp/uptrace]
Instrumentation CRD
Configure auto-instrumentation for multiple languages:
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "1"
  # Language-specific configurations
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
    env:
      - name: OTEL_JAVAAGENT_DEBUG
        value: "false"
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest
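When several Instrumentation resources exist, the annotation value can name the one to use instead of "true" (or `<namespace>/<name>` for a resource in another namespace):
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-python: "my-instrumentation"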
List Available CRDs
kubectl get crd | grep opentelemetry.io
Output:
instrumentations.opentelemetry.io
opentelemetrycollectors.opentelemetry.io
opampbridges.opentelemetry.io
Check resource status:
kubectl get otelcol example-collector -o yaml
kubectl describe instrumentation my-instrumentation
Kubernetes Cluster and Application Monitoring
You can combine the Operator with the OpenTelemetry Kubernetes receivers (`k8s_cluster`, `kubeletstats`) for complete cluster metrics (see the sketch after this list):
- Node metrics: CPU, memory, disk, network
- Pod metrics: Resource usage, restarts, phase
- Cluster events: Deployments, scaling, health
- Application metrics: Latency, throughput, errors
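A hedged sketch of those receivers inside a Collector config: `k8s_cluster` typically runs in a single deployment-mode Collector, `kubeletstats` fits a daemonset-mode Collector, and both require extra RBAC plus a K8S_NODE_NAME environment variable (set via the downward API) that are not shown here:
receivers:
  k8s_cluster:
    collection_interval: 30s
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    endpoint: ${env:K8S_NODE_NAME}:10250
    insecure_skip_verify: true
service:
  pipelines:
    metrics:
      receivers: [k8s_cluster, kubeletstats]
      processors: [batch]
      exporters: [otlp/uptrace]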
Deployment Patterns
A DaemonSet collects node-level metrics and pod traces:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector-daemonset
A Deployment collects cluster-wide metrics:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector-deployment
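With the Operator you normally don't author these workload objects by hand: set spec.mode on the custom resource and the Operator creates the DaemonSet or Deployment for you. A minimal sketch (the name and the `debug` exporter are illustrative):
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-node-agent
spec:
  mode: daemonset # one Collector pod per node
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
    exporters:
      debug:
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]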
Multi-Cluster Setup
Deploy the Operator in each cluster using the main Collector configuration as a base.
Cluster-Specific Parameters
Adjust these parameters for each cluster:
| Parameter | Production | Staging | Development |
|---|---|---|---|
| `replicas` | 3 | 2 | 1 |
| `namespace` | observability | observability | dev-observability |
| `cluster.name` | production | staging | development |
| `uptrace-dsn` | `<PROD_DSN>` | `<STAGING_DSN>` | `<DEV_DSN>` |
Add Cluster Identifier
Add this processor to your collector config:
processors:
  batch:
  attributes: # ← Add this processor
    actions:
      - key: cluster.name
        value: production # Change per cluster
        action: insert
Update the pipeline to include the processor:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, batch] # ← Add attributes here
      exporters: [otlp/uptrace]
Best Practices:
- Use unique DSNs per cluster for filtering in Uptrace
- Add cluster identifiers using the `attributes` processor
- Deploy identical Operator versions across clusters
- Use GitOps (ArgoCD, Flux) to manage configurations
Troubleshooting
- Check the Operator logs:
kubectl logs -n opentelemetry-operator-system -l control-plane=controller-manager
- Verify collector status:
kubectl get otelcol
kubectl describe otelcol <name>
- Ensure RBAC and cert-manager are correctly configured.
- Confirm network connectivity to Kubernetes API and telemetry backends.
Common mistakes:
- Forgetting to replace `<FIXME>` in the DSN
- Missing RBAC permissions
- Pods not restarting after adding annotations
- cert-manager webhook not ready
- Collector CrashLoopBackOff due to an incorrect batch or receiver configuration
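If annotations seem to be ignored, the admission webhook is a common culprit. Two quick checks (exact resource names may differ depending on how the Operator was installed):
kubectl get pods -n cert-manager
kubectl get mutatingwebhookconfigurations | grep -i opentelemetry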
Backend Examples
This guide uses Uptrace in examples, but OpenTelemetry works with any OTLP-compatible backend. Here are quick configuration examples for other platforms:
Grafana Cloud:
exporters:
  otlp:
    endpoint: otlp-gateway.grafana.net:443
    headers:
      authorization: "Bearer YOUR_TOKEN"
Jaeger:
exporters:
  otlp:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true
Datadog:
exporters:
  otlp:
    endpoint: trace.agent.datadoghq.com:4317
    headers:
      dd-api-key: "YOUR_API_KEY"
Prometheus (metrics):
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"