OpenTelemetry Kubernetes Monitoring

Vladimir Mihailenco
August 30, 2024

Kubernetes has become the de facto standard for container orchestration and is widely adopted by organizations of all sizes.

By using OpenTelemetry in conjunction with Kubernetes, you can gain deep insights into the performance and behavior of your applications, monitor their health and resource usage, and troubleshoot issues more effectively.

What is OpenTelemetry Collector?

OpenTelemetry Collector is an agent that receives or pulls telemetry data from the systems you want to monitor and exports the collected data to an OpenTelemetry backend.

Otel Collector provides powerful data processing capabilities, allowing you to perform aggregation, filtering, sampling, and enrichment of telemetry data. You can transform and reshape the data to fit your specific monitoring and analysis requirements before sending it to the backend systems.
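As a rough illustration of these processing capabilities, the sketch below configures three processors from the Collector contrib distribution: `resource` for enrichment, `filter` for filtering, and `probabilistic_sampler` for sampling. The attribute value, the dropped metric name, and the sampling percentage are placeholders chosen for the example, not recommendations.

```yaml
processors:
  # Enrichment: add or overwrite a resource attribute on all telemetry.
  resource:
    attributes:
      - key: deployment.environment
        value: production # placeholder value
        action: upsert

  # Filtering: drop metrics that match the condition.
  filter:
    error_mode: ignore
    metrics:
      metric:
        - 'name == "k8s.pod.phase"' # placeholder metric to drop

  # Sampling: keep roughly 10% of traces.
  probabilistic_sampler:
    sampling_percentage: 10
```

A processor only takes effect once it is also listed under `processors` in the relevant pipeline in the `service` section.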

Authentication

To monitor Kubernetes, Otel Collector uses the Kubernetes API to query and watch the state of various Kubernetes resources.

To use the Kubernetes API, you need to configure an authentication method, for example, using a service account. This means that the Otel Collector must be running on the same K8s cluster, and the service account must have sufficient permissions to use the API.
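A minimal sketch of such a service account is shown below. The `otel-collector` name and the `monitoring` namespace are assumptions made for the example, and the binding refers to a ClusterRole that must be defined separately with the permissions your receivers need.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector # hypothetical name
  namespace: monitoring # hypothetical namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector # hypothetical ClusterRole holding the required permissions
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: monitoring
```

The Collector pod then selects this account with `serviceAccountName: otel-collector` in its pod spec.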

Monitoring Kubernetes Cluster

The OpenTelemetry Kubernetes Cluster receiver allows you to collect observability data from your Kubernetes cluster. It captures telemetry data about the cluster's nodes, pods, containers, and other resources, providing insights into the cluster's health, performance, and resource utilization.

To start monitoring your Kubernetes cluster, you need to configure the receiver in /etc/otel-contrib-collector/config.yaml using your Uptrace DSN:

yaml
receivers:
  # OTLP receiver referenced by the pipelines below
  otlp:
    protocols:
      grpc:
      http:
  k8s_cluster:
    auth_type: serviceAccount

exporters:
  otlp:
    endpoint: api.uptrace.dev:4317
    headers: { 'uptrace-dsn': '<FIXME>' }

processors:
  resourcedetection:
    detectors: [env, system]
  cumulativetodelta:
  batch:
    timeout: 10s

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp, k8s_cluster]
      processors: [resourcedetection, cumulativetodelta, batch]
      exporters: [otlp]

Don't forget to create a service account with sufficient permissions.
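For the cluster receiver, the permissions are read-only access to the resources it watches. The ClusterRole below is a sketch along the lines of the receiver's documentation; the role name is hypothetical, and the exact resource list may differ between Collector versions, so check it against the version you run.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-cluster # hypothetical name
rules:
  # Core resources watched by the receiver
  - apiGroups: ['']
    resources:
      - events
      - namespaces
      - namespaces/status
      - nodes
      - nodes/spec
      - pods
      - pods/status
      - replicationcontrollers
      - replicationcontrollers/status
      - resourcequotas
      - services
    verbs: [get, list, watch]
  # Workload controllers
  - apiGroups: [apps]
    resources: [daemonsets, deployments, replicasets, statefulsets]
    verbs: [get, list, watch]
  - apiGroups: [batch]
    resources: [jobs, cronjobs]
    verbs: [get, list, watch]
  - apiGroups: [autoscaling]
    resources: [horizontalpodautoscalers]
    verbs: [get, list, watch]
```

The role must be bound to the Collector's service account with a ClusterRoleBinding. Because the receiver reports cluster-wide state, it is normally run as a single Collector instance per cluster to avoid duplicate metrics.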

See Helm example and Otelcol documentation for more details.

Monitoring Kubernetes Pods

OpenTelemetry Collector also provides the Kubelet Stats receiver for collecting metrics from the Kubelet, which is the primary node agent that runs on each node in a Kubernetes cluster.

Although it's possible to use Kubernetes' hostNetwork feature to talk to the Kubelet API from a pod, the preferred approach is to use the downward API and a service account.

Make sure the pod spec sets the node name as follows:

yaml
env:
  - name: K8S_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
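To show where this fragment fits, here is a sketch of a DaemonSet that runs one Collector pod per node, so each pod queries the Kubelet on its own node. The resource names, the namespace, and the image tag are assumptions for the example, and mounting the Collector config file is left out for brevity.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector # hypothetical name
  namespace: monitoring # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      # Service account with permissions to read Kubelet stats
      serviceAccountName: otel-collector
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:latest # pin a version in practice
          env:
            # Downward API: expose the node name to the Collector
            - name: K8S_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
```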

Then the otel config can reference the K8S_NODE_NAME environment variable:

yaml
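receivers:
  # OTLP receiver referenced by the pipelines below
  otlp:
    protocols:
      grpc:
      http:
  kubeletstats:
    auth_type: 'serviceAccount'
    endpoint: 'https://${env:K8S_NODE_NAME}:10250'
    # Skips TLS verification of the Kubelet certificate
    insecure_skip_verify: true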

exporters:
  otlp:
    endpoint: api.uptrace.dev:4317
    headers: { 'uptrace-dsn': '<FIXME>' }

processors:
  resourcedetection:
    detectors: [env, system]
  cumulativetodelta:
  batch:
    timeout: 10s

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [otlp, kubeletstats]
      processors: [resourcedetection, cumulativetodelta, batch]
      exporters: [otlp]

Don't forget to create a service account with sufficient permissions.
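For the Kubelet Stats receiver, the service account mainly needs read access to node stats. A minimal sketch, assuming a hypothetical role name:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-kubeletstats # hypothetical name
rules:
  # Allows reading the Kubelet stats summary for each node
  - apiGroups: ['']
    resources: [nodes/stats]
    verbs: [get]
```

The role must be bound to the Collector's service account with a ClusterRoleBinding. Some optional receiver features may need additional permissions, such as `nodes/proxy`, depending on the Collector version.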

See Helm example and Otelcol documentation for more details.

OpenTelemetry Backend

Once the metrics are collected and exported, you can visualize them using a compatible backend system. For example, you can use Uptrace to create dashboards that display metrics from the OpenTelemetry Collector.

Uptrace is a Datadog competitor that supports distributed tracing, metrics, and logs. You can use it to monitor applications and troubleshoot issues.

Uptrace comes with an intuitive query builder, rich dashboards, alerting rules with notifications, and integrations for most languages and frameworks.

Uptrace can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.

In just a few minutes, you can try Uptrace by visiting the cloud demo (no login required) or running it locally with Docker. The source code is available on GitHub.

What's next?

Next, you can learn more about configuring OpenTelemetry Collector to export data to a backend.