OpenTelemetry Architecture

OpenTelemetry standardizes the collection and exchange of telemetry data across different programming languages and frameworks. It enables developers to instrument their code to generate telemetry data including traces, metrics, and logs.

What is OpenTelemetry?

OpenTelemetry is an open-source observability framework hosted by the Cloud Native Computing Foundation. It resulted from merging the OpenCensus and OpenTracing projects. See OpenTelemetry vs OpenTracing to learn how they compare.

OpenTelemetry streamlines application instrumentation for observability and provides a standardized approach to telemetry data collection. By promoting consistency across programming languages, frameworks, and environments, it enables developers to easily collect and analyze telemetry data for monitoring, debugging, and optimizing distributed systems.

Architecture Overview

OpenTelemetry's architecture provides a standardized approach to collecting, transmitting, and processing telemetry data from applications and services. It consists of several key components that work together to enable observability in distributed systems.

Observability Signals

OpenTelemetry captures four primary observability signals that work together to provide comprehensive system visibility:

Metrics indicate when there is a problem
Traces show where the problem is located
Logs help identify the root cause
Baggage carries contextual information across service boundaries

Additionally, two emerging signals are under active development:

Events - a specific type of log for recording discrete events
Profiles - performance profiling data (currently in development by the Profiling Working Group)

Metrics

OpenTelemetry Metrics are quantitative measurements that help quantify system behavior. They provide information about the current state or rate of various aspects of an application, such as CPU usage, memory consumption, or request latency. OpenTelemetry enables you to define and record custom metrics to monitor your application's performance and health.

Traces

OpenTelemetry Traces provide detailed records of requests' execution paths as they flow through distributed systems. They capture timing information for individual operations and their relationships, allowing you to understand request flows, identify bottlenecks, and troubleshoot performance issues. With OpenTelemetry, you can instrument your code to generate distributed traces and correlate them across services.

Logs

OpenTelemetry Logs are textual records of events or messages that occur during application execution. They help you understand application behavior, diagnose problems, and audit activities. OpenTelemetry provides mechanisms to capture structured logs from your application and enrich them with contextual information.

Baggage

Baggage is contextual information that is passed between signals and services in a distributed system. Unlike trace context (which carries trace and span IDs), baggage allows you to propagate arbitrary key-value pairs across service boundaries.

Baggage is useful for:

Cross-cutting concerns: Propagating user IDs, tenant IDs, or feature flags across services
Correlation: Enriching telemetry with business context without changing service interfaces
Dynamic behavior: Making downstream services aware of upstream decisions

Important: Baggage is transmitted in-band with requests (typically in HTTP headers), so keep baggage small to avoid performance overhead. Baggage data is not automatically included in telemetry—you must explicitly read baggage values and add them as attributes to your spans, metrics, or logs.

Core Components

Instrumentation Libraries

OpenTelemetry provides libraries for various programming languages that developers use to instrument their applications and collect telemetry data.

OpenTelemetry instrumentations are plugins for popular frameworks and libraries that use the OpenTelemetry API to record important operations such as HTTP requests, database queries, logs, and errors.

OpenTelemetry SDK

The OpenTelemetry SDK (Software Development Kit) is a collection of libraries and tools that enable developers to instrument applications and collect telemetry data for monitoring purposes.

The SDK provides a standardized and extensible framework for integrating OpenTelemetry into various programming languages and environments.

Exporters

OpenTelemetry exporters send telemetry data to external systems or backends for storage, analysis, and visualization.

OpenTelemetry offers various exporters that support different protocols and formats for exporting telemetry data. These exporters enable seamless integration with your preferred monitoring and analysis tools.

OpenTelemetry Protocol (OTLP)

OpenTelemetry Protocol (OTLP) is an open-source, vendor-neutral protocol for collecting, transmitting, and exporting telemetry data from software systems and applications.

OTLP defines the wire format and structure of data exchanged between instrumented applications and backend systems. The current specification version is v1.9.0, with stable support for traces, metrics, and logs.

Protocol Specification

OTLP specifies:

Encoding format: Protocol Buffers (binary) or JSON
Data schema: Standardized structure for metrics, traces, logs, and profiles
Transport rules: How data moves across networks
Error handling: Retryable vs non-retryable errors, partial success responses
Compression: Support for gzip compression

Transport Mechanisms

OTLP supports two primary transport options:

gRPC Transport:

Default port: 4317
Uses unary request/response pattern
Provides high throughput with concurrent requests
Preferred for performance-critical deployments

HTTP Transport:

Default port: 4318
Supports HTTP/1.1 and HTTP/2
Allows both binary Protobuf and JSON encoding
Implements retry with exponential backoff
Easier to use through firewalls and proxies

The OTLP exporter transmits collected telemetry data to backends for processing and analysis. Most modern observability backends support OTLP natively, making it the recommended export format.

Context Propagation

Context propagation ensures that relevant contextual data—such as trace IDs, span IDs, and other metadata—is consistently propagated across different services and application components.

By propagating context, OpenTelemetry ensures that telemetry collected from different services and components remains correlated, even in distributed and microservices architectures. This enables end-to-end tracing, making it easier to understand request flows, identify performance bottlenecks, and analyze system dependencies.

Resources

Resource attributes are key-value pairs that provide metadata about monitored entities such as services, processes, or containers. They identify resources and provide additional information for filtering and grouping telemetry data.

By including resource information in telemetry data, OpenTelemetry enables better analysis, visualization, and understanding of system behavior. Resources help correlate and contextualize telemetry data from different sources, providing a comprehensive view of observed applications or services.

OpenTelemetry Collector

The OpenTelemetry Collector plays a critical role in the OpenTelemetry ecosystem by providing a flexible and scalable solution for collecting and processing telemetry data.

The Collector acts as a centralized intermediary, simplifying data collection complexity and enabling flexible integration with different backends and systems.

OpenTelemetry in Kubernetes

OpenTelemetry provides powerful observability capabilities that seamlessly integrate into Kubernetes environments, offering comprehensive monitoring and tracing for containerized applications and microservices.

OpenTelemetry in Kubernetes enables you to:

Automatically inject OpenTelemetry instrumentation into your pods
Collect metrics, traces, and logs from both applications and Kubernetes infrastructure
Use the OpenTelemetry Collector to process and export data to your preferred backend (such as Uptrace)
Implement distributed tracing across microservices in your cluster
Monitor the health and performance of Kubernetes nodes, pods, and services

OpenTelemetry Operator

The OpenTelemetry Operator is a Kubernetes-native solution for managing and configuring OpenTelemetry instrumentation in your cluster. It simplifies the deployment and management of OpenTelemetry components by:

Automating the injection of OpenTelemetry auto-instrumentation into applications
Managing the lifecycle of OpenTelemetry Collector instances
Providing custom resource definitions (CRDs) for easy configuration of OpenTelemetry components
Enabling dynamic updates to your OpenTelemetry setup without redeploying applications
Facilitating the collection of Kubernetes-specific metadata to enrich telemetry data

The OpenTelemetry Operator streamlines the implementation of observability in Kubernetes, making it easier to gain insights into containerized applications and infrastructure.

Backend Systems

OpenTelemetry does not include a built-in backend or storage system for processing and analyzing data. Instead, it provides flexibility to select and integrate with various backend systems based on your specific needs and preferences.

Backend systems receive exported telemetry data from instrumented applications or the OpenTelemetry Collector. These systems store, analyze, and visualize the telemetry data.

See OpenTelemetry backends for details.

Conclusion

OpenTelemetry's architecture promotes flexibility, interoperability, and extensibility, enabling developers and operators to adopt observability practices that suit their specific requirements and environments.

ExportersComplete guide to OpenTelemetry Collector exporters. Configure OTLP, Uptrace, Jaeger, Prometheus, Zipkin, AWS X-Ray exporters for traces, metrics, and logs with production examples, retry configuration, and troubleshooting.

OperatorStep-by-step guide to deploying OpenTelemetry Operator in Kubernetes, setting up collectors, and enabling auto-instrumentation for full observability.