OpenTelemetry Operator for Kubernetes

⚡ If you have already followed our guide on Monitoring Kubernetes with OpenTelemetry Collector, this article will show you how to automate the deployment and instrumentation of your applications using the OpenTelemetry Operator.
Learn how to easily manage Collectors, enable auto-instrumentation, and scale observability in Kubernetes without manually editing YAML files.

What is the OpenTelemetry Operator?

The OpenTelemetry Operator simplifies deployment and management of OpenTelemetry components in Kubernetes. It automates the creation of Collectors, enables auto-instrumentation for workloads, and manages configurations using Kubernetes-native CRDs.

It leverages the Kubernetes Operator pattern to watch custom resources and maintain the desired state of your observability stack.

How the Operator Works

The Operator observes resources and automatically manages telemetry collection:

  • OpenTelemetryCollector: Configures collectors for metrics, logs, and traces.
  • Instrumentation: Injects language-specific agents (Java, Python, Node.js, .NET, Go) into applications.
  • Annotations: Applied to pods to trigger auto-instrumentation.

mermaid
flowchart TD
  subgraph K8s
    api[Kubernetes API Server]
    operator[OpenTelemetry Operator]
    crd[Custom Resource Definitions]
    collector[OpenTelemetry Collector]
    apps[Applications]
  end
  subgraph External
    backend[(Observability Backend: Uptrace, Jaeger)]
  end
  operator -->|Watches| crd
  operator -->|Manages| collector
  operator -->|Auto-instruments| apps
  api -->|Extends| crd
  collector -->|Exports data| backend
  apps -->|Sends telemetry| collector

Note: Starting with Kubernetes 1.29+, the Operator automatically uses native sidecar containers for better pod lifecycle management. This feature ensures sidecars start before the main container and shut down after it. The usage of native sidecars can be disabled with --feature-gates=-sidecarcontainers.native if needed.
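
If you need to disable it when installing via Helm, the flag can be passed through to the operator manager. A minimal values.yaml sketch, assuming the chart's manager.extraArgs field:

yaml
# values.yaml sketch; manager.extraArgs is assumed to be available in the chart
manager:
  extraArgs:
    - --feature-gates=-sidecarcontainers.native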

When to use the OpenTelemetry Operator?

Feature                    | Operator                                          | Manual Deployment
Collector deployment       | Automated via CRDs                                | Manual YAML files
Auto-instrumentation       | Yes (Java, Python, Node.js, .NET, Go)             | No - requires code changes
Configuration updates      | Automatic reconciliation                          | Manual kubectl/helm commands
Multi-collector management | Simplified with CRDs                              | Complex YAML management
Best for                   | Dynamic environments, auto-instrumentation needs  | Static setups, full control required

The Operator is recommended when:

  • You need auto-instrumentation without modifying application code
  • You manage multiple collectors across different namespaces
  • You want GitOps-friendly declarative configuration

Manual deployment is better when:

  • You run a simple single-collector setup
  • You need maximum control over every configuration detail
  • You use custom deployment patterns not supported by the Operator

Prerequisites

  • Kubernetes 1.27+
  • kubectl access with cluster admin permissions
  • Helm 3.16+ (if using Helm)
  • cert-manager installed for admission webhooks:
    bash
    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.18.2/cert-manager.yaml
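    After applying, confirm the cert-manager pods are running before installing the Operator:
    bash
    kubectl get pods -n cert-manager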
    

Installing the Operator

You can install the OpenTelemetry Operator in your Kubernetes cluster with kubectl or Helm.

Using kubectl

bash
kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml

Using Helm

bash
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator-system \
  --create-namespace

Verify installation:

bash
kubectl get pods -n opentelemetry-operator-system

Helm Chart Configuration

The Operator Helm chart provides various configuration options to customize your deployment.

💡 Complete Observability Stack: For production deployments, explore our
Uptrace Helm Charts which provide a
complete solution including Uptrace APM, OpenTelemetry Operator & Collector,
ClickHouse storage, PostgreSQL, and Redis — everything needed for production
Kubernetes deployments. See the deployment guide
for step-by-step instructions.

Custom Values

Create a values.yaml file to customize the installation:

yaml
# values.yaml
manager:
  resources:
    limits:
      cpu: 200m
      memory: 256Mi
    requests:
      cpu: 100m
      memory: 128Mi

  # Restrict operator to specific namespaces
  env:
    WATCH_NAMESPACE: "production,staging"

admissionWebhooks:
  create: true
  certManager:
    enabled: true

kubernetesClusterDomain: cluster.local

Install with custom values:

bash
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator-system \
  --create-namespace \
  --values values.yaml
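
To apply configuration changes later, upgrade the release with the same values file:

bash
helm upgrade opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator-system \
  --values values.yaml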

Common Configuration Options

Parameter                                  | Description                                | Default
manager.replicas                           | Number of operator replicas                | 1
manager.resources                          | Resource limits and requests               | See values.yaml
manager.env.WATCH_NAMESPACE                | Limit operator to specific namespaces      | "" (all)
admissionWebhooks.certManager.enabled      | Use cert-manager for webhook certificates  | true
admissionWebhooks.autoGenerateCert.enabled | Auto-generate self-signed certificates     | false
kubernetesClusterDomain                    | Cluster domain name                        | cluster.local

Note: When using manager.env.WATCH_NAMESPACE, you can specify multiple namespaces separated by commas (e.g., "production,staging"). Leave empty to watch all namespaces.

Deploying a Collector

💡 Backend Configuration: This guide uses Uptrace as an example backend.
OpenTelemetry is vendor-neutral - you can use Jaeger, Grafana Cloud, Datadog,
Prometheus, or any OTLP-compatible platform.
See other backend examples below.

Define a Collector using CRDs:

yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  mode: deployment
  replicas: 2
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
        timeout: 10s
        send_batch_size: 512
      resourcedetection:
        detectors: [env, system, k8snode]
    exporters:
      otlp/uptrace:
        endpoint: api.uptrace.dev:4317
        headers:
          uptrace-dsn: '<FIXME>'
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp/uptrace]
        metrics:
          receivers: [otlp]
          processors: [resourcedetection, batch]
          exporters: [otlp/uptrace]

Apply:

bash
kubectl apply -f collector.yaml
kubectl get otelcol

Note: Adjust batch.timeout and send_batch_size in processors for high-throughput production workloads.
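
For example, a higher-throughput configuration might look like the following; the numbers are illustrative starting points, not recommendations:

yaml
processors:
  batch:
    timeout: 5s                 # flush after at most 5 seconds regardless of size
    send_batch_size: 2048       # target number of items per batch
    send_batch_max_size: 4096   # hard upper bound on batch size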

For file-based logs, see our guide on Filelog Receiver

Auto-Instrumentation

One of the most powerful features of the OpenTelemetry Operator is its ability to automatically instrument applications in your Kubernetes cluster without code changes. Enable it by adding an annotation to your workload's pod template:

yaml Java
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"
yaml Python
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-python: "true"
yaml NodeJS
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-nodejs: "true"
yaml .NET
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-dotnet: "true"
yaml GO
metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-go: "true"
    # Go auto-instrumentation additionally needs the path to the target binary inside the container
    instrumentation.opentelemetry.io/otel-go-auto-target-exe: '/path/to/your/binary'

Supported languages: Java, Python, Node.js, .NET, Go.
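
Note that the annotation goes on the pod template, not on the workload's top-level metadata. A minimal Deployment sketch (the app name and image are placeholders):

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        # Triggers injection of the Java auto-instrumentation agent
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: my-app
          image: my-registry/my-app:1.0.0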

Important:

  • Create the Instrumentation CRD before adding annotations to pods (see Custom Resource Definitions section)
  • The OpenTelemetry Collector must be deployed and running before instrumentation works
  • Restart pods after adding annotations for auto-instrumentation to take effect: kubectl rollout restart deployment/<your-app>
  • Verify instrumentation by checking pod events: kubectl describe pod <pod-name>
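
For the SDK-based languages (Java, Python, Node.js, .NET), the agent is typically injected through an init container that copies it into the application pod, so you can also confirm injection directly (pod name is a placeholder):

bash
kubectl get pod <pod-name> -o jsonpath='{.spec.initContainers[*].name}'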

Custom Resource Definitions

The Operator introduces several CRDs for managing OpenTelemetry components.

OpenTelemetryCollector CRD

Complete example with all common fields:

yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: example-collector
spec:
  mode: deployment  # deployment, daemonset, statefulset, or sidecar
  replicas: 2
  image: otel/opentelemetry-collector-contrib:0.136.0

  # Environment variables
  env:
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP

  # Resource configuration
  resources:
    limits:
      cpu: 500m
      memory: 512Mi
    requests:
      cpu: 100m
      memory: 128Mi

  # Volume mounts
  volumeMounts:
    - name: config
      mountPath: /conf

  volumes:
    - name: config
      configMap:
        name: extra-config

  # Collector configuration
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:

    processors:
      batch:
      memory_limiter:
        check_interval: 1s
        limit_mib: 512

    exporters:
      otlp/uptrace:
        endpoint: api.uptrace.dev:4317
        headers:
          uptrace-dsn: '<FIXME>'

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [otlp/uptrace]

Instrumentation CRD

Configure auto-instrumentation for multiple languages:

yaml
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation
spec:
  exporter:
    # The Operator exposes the collector through a Service named <collector-name>-collector
    endpoint: http://otel-collector-collector:4317

  propagators:
    - tracecontext
    - baggage

  sampler:
    type: parentbased_traceidratio
    argument: "1"

  # Language-specific configurations
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
    env:
      - name: OTEL_JAVAAGENT_DEBUG
        value: "false"

  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest

  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest

  dotnet:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-dotnet:latest
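
The inject-* annotation value is not limited to "true": it can also name a specific Instrumentation resource, which is useful when several exist (the resource names below are examples):

yaml
metadata:
  annotations:
    # Use a specific Instrumentation resource in the same namespace
    instrumentation.opentelemetry.io/inject-java: 'my-instrumentation'
    # Or reference one in another namespace as "<namespace>/<name>"
    # instrumentation.opentelemetry.io/inject-java: 'observability/my-instrumentation'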

List Available CRDs

bash
kubectl get crd | grep opentelemetry.io

Output:

text
instrumentations.opentelemetry.io
opentelemetrycollectors.opentelemetry.io
opampbridges.opentelemetry.io

Check resource status:

bash
kubectl get otelcol example-collector -o yaml
kubectl describe instrumentation my-instrumentation

Kubernetes Cluster Monitoring

You can combine the Operator with the OpenTelemetry Kubernetes receivers (k8s_cluster, kubeletstats) for complete metrics; a configuration sketch follows the list below:

  • Node metrics: CPU, memory, disk, network
  • Pod metrics: Resource usage, restarts, phase
  • Cluster events: Deployments, scaling, health
  • Application metrics: Latency, throughput, errors
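
A minimal receiver sketch, assuming the contrib Collector image and a service account with the cluster-read and node-stats RBAC these receivers require; kubeletstats is usually run in daemonset mode so each node scrapes its own kubelet:

yaml
receivers:
  k8s_cluster:
    collection_interval: 30s
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    # K8S_NODE_NAME is assumed to be injected via the Downward API (spec.nodeName)
    endpoint: https://${env:K8S_NODE_NAME}:10250
    insecure_skip_verify: true

service:
  pipelines:
    metrics:
      receivers: [k8s_cluster, kubeletstats]
      processors: [batch]
      exporters: [otlp/uptrace]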

Deployment Patterns

DaemonSet collects node-level metrics and pod traces:

yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector-daemonset

Deployment collects cluster-wide metrics:

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector-deployment
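
With the Operator you normally don't create these workloads by hand; the equivalent pattern is selected through the mode field of the OpenTelemetryCollector resource (config body elided, reuse the configuration from the earlier examples):

yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector-agent
spec:
  mode: daemonset # one collector pod per node; alternatives: deployment, statefulset, sidecar
  config: |
    # same receivers/processors/exporters/service sections as in the examples above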

Multi-Cluster Setup

Deploy the Operator in each cluster using the main Collector configuration as a base.

Cluster-Specific Parameters

Adjust these parameters for each cluster:

Parameter    | Production    | Staging       | Development
replicas     | 3             | 2             | 1
namespace    | observability | observability | dev-observability
cluster.name | production    | staging       | development
uptrace-dsn  | <PROD_DSN>    | <STAGING_DSN> | <DEV_DSN>

Add Cluster Identifier

Add this processor to your collector config:

yaml
processors:
  batch:
  attributes:  # ← Add this processor
    actions:
      - key: cluster.name
        value: production  # Change per cluster
        action: insert

Update the pipeline to include the processor:

yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [attributes, batch]  # ← Add attributes here
      exporters: [otlp/uptrace]

Best Practices:

  • Use unique DSNs per cluster for filtering in Uptrace
  • Add cluster identifiers using the attributes processor
  • Deploy identical Operator versions across clusters
  • Use GitOps (ArgoCD, Flux) to manage configurations

Troubleshooting

  • Check Operator pods:
    bash
    kubectl logs -n opentelemetry-operator-system -l control-plane=controller-manager
  • Verify collector status:
    bash
    kubectl get otelcol
    kubectl describe otelcol <name>
  • Ensure RBAC and cert-manager are correctly configured.
  • Confirm network connectivity to the Kubernetes API and telemetry backends.

Common mistakes:

  • Forgetting to replace <FIXME> in DSN
  • Missing RBAC permissions
  • Pods not restarting after adding annotations
  • cert-manager webhook not ready
  • Collector CrashLoopBackOff due to incorrect batch or receiver configuration
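
When the collector is crash-looping, its own logs usually point at the offending config section. The Operator names the generated workload <name>-collector, so for the collector defined earlier:

bash
kubectl logs deployment/otel-collector-collector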

Backend Examples

This guide uses Uptrace in examples, but OpenTelemetry works with any OTLP-compatible backend. Here are quick configuration examples for other platforms:

Grafana Cloud:

yaml
exporters:
  otlp:
    endpoint: otlp-gateway.grafana.net:443
    headers:
      authorization: "Bearer YOUR_TOKEN"

Jaeger:

yaml
exporters:
  otlp:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true

Datadog:

yaml
exporters:
  otlp:
    endpoint: trace.agent.datadoghq.com:4317
    headers:
      dd-api-key: "YOUR_API_KEY"

Prometheus (metrics):

yaml
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

Next Steps