Getting Started with the OpenTelemetry Collector

OpenTelemetry Collector is a high-performance, scalable, and reliable data collection pipeline for observability data. It receives telemetry data from various sources, performs processing and translation to a common format, and then exports the data to various backends for storage and analysis.

Otel Collector supports multiple data formats, protocols, and platforms, making it a flexible and scalable solution for observability needs.

How does OpenTelemetry Collector work?

OpenTelemetry Collector serves as a vendor-agnostic proxy between your applications and distributed tracing tools such as Uptrace or Jaeger.

The main stages of OpenTelemetry Collector operation are:

  1. Receiving data from various sources
  2. Processing and normalizing the data
  3. Exporting to different backends for storage and analysis

The diagram below shows how telemetry flows through the Collector's receivers, processors, and exporters:

flowchart TD
  subgraph Receivers
    otlp_receiver(OTLP)
    prometheus_receiver(Prometheus)
    hostmetrics
    redis
  end
  subgraph Processors
    extensions((Extensions health, pprof, zpages))
    batch1(Batch) ---> dots1(...)
    dots1 --> attrs(Attributes)
    batch2(Batch) ---> dots2(...)
    dots2 --> filters(Filters)
  end
  subgraph Exporters
    otlp_exporter(OTLP)
    prometheus_exporter(Prometheus)
    backend[(Tracing tools Uptrace, Jaeger)]
  end
  Receivers --> Processors
  Processors --> Exporters

Otel Collector provides powerful data processing capabilities, including aggregation, filtering, sampling, and enrichment of telemetry data. You can transform and reshape the data to fit your specific monitoring and analysis requirements before sending it to the backend systems.
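
For example, here is a minimal sketch of two contrib processors that reshape trace data: a filter processor that drops health-check spans and an attributes processor that adds a label. The processor names and the http.target value are illustrative, not part of any standard configuration:

yaml
processors:
  # drop spans produced by health-check requests
  filter/health:
    error_mode: ignore
    traces:
      span:
        - 'attributes["http.target"] == "/health"'
  # add a deployment.environment attribute to the telemetry it processes
  attributes/env:
    actions:
      - key: deployment.environment
        value: production
        action: upsert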

Otel Collector is written in Go and licensed under Apache 2.0, which allows you to modify the source code and install custom extensions. However, this comes with the responsibility of maintaining your own OpenTelemetry Collector instances.

When to use OpenTelemetry Collector?

While sending telemetry data directly to a backend is often sufficient, deploying OpenTelemetry Collector alongside your services offers several advantages:

  • Your applications can offload telemetry quickly, leaving retries, batching, and compression to the Collector.
  • The Collector can gather additional data on its own, for example, host metrics or Prometheus scrape targets.
  • You can switch backends or adjust processing in one place, without changing application code.

otelcol vs otelcol-contrib

OpenTelemetry Collector has 2 repositories on GitHub:

  • opentelemetry-collector is the core that contains only the most crucial components. It is distributed as the otelcol binary.
  • opentelemetry-collector-contrib contains the core and all additional available components, for example, Redis and PostgreSQL receivers. It is distributed as the otelcol-contrib binary.

You should always install and use otelcol-contrib, because it is just as stable as the core distribution and supports more components.
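
Once installed, you can check which receivers, processors, exporters, and extensions your binary ships with; recent releases provide a components subcommand for that:

shell
otelcol-contrib components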

Installation

OpenTelemetry Collector provides pre-compiled binaries for Linux, MacOS, and Windows. Here's how to install it on Linux:

Linux

To install the otelcol-contrib binary with the associated systemd service, run the following commands, replacing 0.118.0 with the desired version and amd64 with the desired architecture:

shell Debian
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_linux_amd64.deb
sudo dpkg -i otelcol-contrib_0.118.0_linux_amd64.deb
shell RPM
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.118.0/otelcol-contrib_0.118.0_linux_amd64.rpm
sudo rpm -ivh otelcol-contrib_0.118.0_linux_amd64.rpm

You can check the status of the installed service with:

shell
sudo systemctl status otelcol-contrib

And check the logs with:

shell
sudo journalctl -u otelcol-contrib -f

You can edit the config at /etc/otelcol-contrib/config.yaml and restart OpenTelemetry Collector:

shell
sudo systemctl restart otelcol-contrib

Compiling from sources

You can also compile OpenTelemetry Collector locally:

shell
git clone https://github.com/open-telemetry/opentelemetry-collector-contrib.git
cd opentelemetry-collector-contrib
make install-tools
make otelcontribcol
./bin/otelcontribcol_linux_amd64 --config ./examples/local/otel-config.yaml

Configuration

OpenTelemetry Collector is highly configurable, allowing you to customize its behavior and integrate it into your observability stack. It provides configuration options for specifying receivers, processors, and exporters, enabling you to tailor the agent to your specific needs.

By default, you can find the config file at /etc/otelcol-contrib/config.yaml, for example:

yaml
# receivers configure how data gets into the Collector.
receivers:
  otlp:
    protocols:
      grpc:
      http:

# processors specify what happens with the received data.
processors:
  resourcedetection:
    detectors: [env, system]
  cumulativetodelta:
  batch:
    send_batch_size: 10000
    timeout: 10s

# exporters configure how to send processed data to one or more backends.
exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317
    headers:
      uptrace-dsn: '<FIXME>'

# service.pipelines pull the configured receivers, processors, and exporters together into
# pipelines that process data.
#
# receivers, processors, and exporters that are not used in pipelines are silently ignored.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/uptrace]
    metrics:
      receivers: [otlp]
      processors: [cumulativetodelta, batch, resourcedetection]
      exporters: [otlp/uptrace]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/uptrace]
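
Before restarting the service, you can ask the binary to validate the edited config. A quick check, assuming the default config path:

shell
otelcol-contrib validate --config /etc/otelcol-contrib/config.yaml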

You can always learn more about Otel Collector using the official documentation.

Troubleshooting

If otelcol is not working as expected, you can check the log output for potential issues. The logging verbosity level defaults to INFO, but you can change it using the configuration file:

yaml
service:
  telemetry:
    logs:
      level: 'debug'

To view the logs for potential issues:

shell
sudo journalctl -u otelcol-contrib -f

You can also enable metrics to monitor OpenTelemetry Collector:

yaml
receivers:
  prometheus/otelcol:
    config:
      scrape_configs:
        - job_name: 'otelcol'
          scrape_interval: 10s
          static_configs:
            - targets: ['0.0.0.0:8888']

service:
  telemetry:
    metrics:
      address: ':8888'
  pipelines:
    metrics/hostmetrics:
      receivers: [prometheus/otelcol]
      processors: [cumulativetodelta, batch, resourcedetection]
      exporters: [otlp/uptrace]
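
Once the telemetry address is set, you can check that the Collector serves its own metrics on port 8888, assuming it runs locally:

shell
curl http://localhost:8888/metrics | grep otelcol_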

Extensions

Extensions provide additional capabilities for OpenTelemetry Collector and do not require direct access to telemetry data. For example, the Health Check extension responds to health check requests. Extensions only take effect when they are listed under service.extensions:

yaml
extensions:
  # Health Check extension responds to health check requests
  health_check:
  # PProf extension allows fetching Collector's performance profile
  pprof:
  # zPages extension enables in-process diagnostics
  zpages:
  # Memory Ballast extension configures memory ballast for the process
  memory_ballast:
    size_mib: 512

service:
  extensions: [health_check, pprof, zpages, memory_ballast]
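
With the Health Check extension enabled, you can verify that the Collector is running. A minimal check, assuming the extension's default endpoint 0.0.0.0:13133:

shell
curl http://localhost:13133/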

Prometheus integration

For Prometheus integration, see OpenTelemetry Collector Prometheus.

Host Metrics

For information on host metrics, see OpenTelemetry host metrics.

Exporting data to Uptrace

For instructions on sending data from Otel Collector to Uptrace, see Sending data from Otel Collector to Uptrace.

Resource Detection

To detect resource information from the host, Otel Collector comes with the resourcedetection processor.

Resource Detection Processor automatically detects and labels metadata about the environment in which the data was generated. Such metadata, known as "resources", provides context to the telemetry data and can include information such as the host, service, container, and cloud provider.

For example, to detect host.name and os.type attributes, you can use the system detector:

yaml
processors:
  resourcedetection:
    detectors: [env, system]

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics]
      processors: [batch, resourcedetection]
      exporters: [otlp/uptrace]

To add custom attributes such as an IP address, you can use environment variables with the env detector:

shell
export OTEL_RESOURCE_ATTRIBUTES="instance=127.0.0.1"

To detect more information, you can use specialized detectors. For example, if you are using Amazon EC2, the ec2 detector also discovers cloud.region and cloud.availability_zone attributes:

yaml
processors:
  resourcedetection/ec2:
    detectors: [env, ec2]

If you are using Google Cloud:

yaml
processors:
  resourcedetection/gcp:
    detectors: [env, gcp]

If you are using Docker:

yaml
processors:
  resourcedetection/docker:
    detectors: [env, docker]

You can check the official documentation to learn about available detectors for Heroku, Azure, Consul, and many others.

Memory Limiter

The memory_limiter processor limits the amount of memory consumed by the OpenTelemetry Collector when processing telemetry data. It prevents the Collector from using too much memory, which can lead to performance issues or even crashes.

The Memory Limiter processor periodically checks the amount of memory consumed by the Collector and compares it to the configured limits. When usage crosses the soft limit (limit_mib minus spike_limit_mib), the processor starts refusing incoming data and returns errors to the preceding components; when usage crosses the hard limit (limit_mib), it also forces garbage collection. Normal operation resumes once memory usage drops back below the limits.

To enable the memory limiter, add it as the first processor in a pipeline so it can protect the components that come after it:

yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000
    spike_limit_mib: 800

service:
  pipelines:
    metrics:
      processors: [memory_limiter]

Uptrace

Uptrace is an open source APM with an intuitive query builder, rich dashboards, alerting rules, and integrations for most languages and frameworks. It can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.

Uptrace uses the ClickHouse database to store traces, metrics, and logs. You can use it to monitor applications and set up automatic alerts to receive notifications via email, Slack, Telegram, and more.

You can get started with Uptrace by downloading a DEB/RPM package or a pre-compiled Go binary.