OpenTelemetry Logs [complete guide]

OpenTelemetry Logs provide a standardized way to collect, correlate, and export log data alongside metrics and traces, creating a unified observability solution.

Quick Reference

| Log Record Field | Type | Description | Required |
| --- | --- | --- | --- |
| Timestamp | uint64 | Time when log occurred (Unix nanoseconds) | Yes |
| Severity | enum | Log level (TRACE, DEBUG, INFO, WARN, ERROR) | No |
| Body | any | Log message or structured data | No |
| Attributes | map | Key-value pairs (user.id, http.method, etc.) | No |
| TraceId | bytes | 16-byte trace identifier for correlation | No |
| SpanId | bytes | 8-byte span identifier for correlation | No |
| TraceFlags | byte | W3C trace flags (01 = sampled) | No |

Common Severity Levels:

| Severity | Number Range | Use Case |
| --- | --- | --- |
| TRACE | 1-4 | Very detailed debugging information |
| DEBUG | 5-8 | Debugging information |
| INFO | 9-12 | Informational messages |
| WARN | 13-16 | Warning messages |
| ERROR | 17-20 | Error events |
| FATAL | 21-24 | Critical errors causing shutdown |

Why OpenTelemetry Logs?

Logs are essential for understanding application behavior, diagnosing issues, and monitoring system health. While distributed tracing and metrics provide valuable insights into system performance, logs offer detailed context and information about specific events, errors, and application behavior.

The three pillars of observability:

  • Metrics: Tell you that there's a problem (e.g., error rate increased)
  • Traces: Show you where the problem is (e.g., which service is slow)
  • Logs: Explain why it happened (e.g., "database connection timeout after 30s")

Benefits of OpenTelemetry Logs:

  1. Automatic correlation: Logs are automatically linked to traces and spans using trace_id and span_id
  2. Structured data: Consistent key-value format across all services
  3. Unified pipeline: Same infrastructure (OpenTelemetry Collector) for logs, metrics, and traces
  4. Vendor-neutral: Switch backends without changing instrumentation
  5. Rich context: Automatically includes resource attributes (service name, version, host, etc.)

Overview

OpenTelemetry is an open source observability framework that provides standardized APIs, libraries, and instrumentation for collecting telemetry from applications and systems.

The OpenTelemetry Logs specification defines how to collect, process, and export log data. While OpenTelemetry is best known for metrics and traces, it also supports log collection through its logging API. Rather than replacing existing logging solutions, OpenTelemetry embraces them and is designed to work with the logging libraries and log collection tools you already use.

OpenTelemetry provides a logging API that allows you to instrument your applications and generate structured logs. The OpenTelemetry Logging API is designed to work with other telemetry data, such as metrics and traces, to provide a unified observability solution.

Log Record Structure:

According to the data model specification, each log record contains:

  • Timestamp: When the log occurred (nanoseconds since Unix epoch)
  • ObservedTimestamp: When the log was observed by the collection system
  • TraceId and SpanId: For automatic correlation with distributed traces
  • SeverityNumber: Numeric representation of log level (1-24)
  • SeverityText: Human-readable severity (TRACE, DEBUG, INFO, WARN, ERROR, FATAL)
  • Body: The log message or structured data
  • Resource: Information about the source (service name, host, version)
  • Attributes: Key-value pairs for additional context

OpenTelemetry emphasizes structured logging, letting you attach additional contextual information to log entries through attributes or metadata: timestamps, request IDs, user IDs, correlation IDs, and other custom context that aids log analysis and troubleshooting.
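
To make the data model concrete, here is a minimal Go sketch using the go.opentelemetry.io/otel/log API that fills the fields listed above on a single record. The emitExample function and logger name are illustrative only, and a logger provider is assumed to have been configured elsewhere:

go
import (
    "context"
    "time"

    "go.opentelemetry.io/otel/log"
    "go.opentelemetry.io/otel/log/global"
)

// emitExample is a hypothetical helper; it assumes a LoggerProvider has
// already been registered with the global package.
func emitExample(ctx context.Context) {
    logger := global.GetLoggerProvider().Logger("example")

    var record log.Record
    record.SetTimestamp(time.Now())                      // Timestamp
    record.SetObservedTimestamp(time.Now())              // ObservedTimestamp
    record.SetSeverity(log.SeverityInfo)                 // SeverityNumber (9)
    record.SetSeverityText("INFO")                       // SeverityText
    record.SetBody(log.StringValue("user logged in"))    // Body
    record.AddAttributes(log.String("user.id", "12345")) // Attributes

    // TraceId, SpanId, and TraceFlags are taken from the span in ctx;
    // Resource comes from the LoggerProvider configuration.
    logger.Emit(ctx, record)
}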

mermaid
flowchart TD
  app(Application)
  instrumentations(Instrumentation Libraries)
  bridge(OpenTelemetry Logs Bridge API)
  backend[(Backends Uptrace, Jaeger)]

  subgraph libraries [Logging libraries]
    log4j
    Zap
  end

  subgraph api [Logs API]
    java(Java)
    dotnet(.NET)
    go(Go)
    python(Python)
    ruby(Ruby)
  end

  subgraph sdk [Logs SDK]
    Processing
    entrichment("Enrichment (trace_id, span_id)")
    Batching
  end

  subgraph exporters [Exporters]
    otlp(OTLP)
    jaeger(Jaeger)
  end

  instrumentations --> libraries
  app --> libraries
  libraries --> bridge
  bridge --> api
  api --> sdk
  sdk --> exporters
  exporters --> backend

Different types of logs

OpenTelemetry supports capturing logs from various sources within an application or system. Depending on how logs are generated and collected, they fall into three categories.

System and infrastructure logs

System logs provide valuable information about system operation, performance, and security. System logs are typically generated by various components within the system, including the operating system, applications, network devices, and servers.

System logs are written at the host level and have a predefined format and content that can't be easily changed. System logs don't include information about the trace context.

Legacy first-party logs

First-party logs are generated by in-house applications and record specific application events, errors, and user activities. These logs are useful for application debugging and troubleshooting.

Typically, developers can modify these applications to change how logs are written and what information is included. For example, to correlate logs with traces, developers can manually add the trace context to each log statement or do it automatically using a plugin for their logging library.

To propagate context and associate a log record with a span, you can include the following attributes in the log message:

  • trace_id for TraceId, hex-encoded.
  • span_id for SpanId, hex-encoded.
  • trace_flags for trace flags, formatted according to W3C traceflags format.

For example:

text
request failed trace_id=958180131ddde684c1dbda1aeacf51d3 span_id=0cf859e4f7510204

New first-party logs

When starting a new project, you can follow OpenTelemetry's recommendations and best practices for emitting logs: use auto-instrumentation or configure your logging library to use an OpenTelemetry log appender.

OpenTelemetry's logging API allows developers to instrument their applications to produce structured logs that can be collected and processed by logging backends or log management systems. The logging API provides a way to attach additional contextual information to log entries, such as tags, attributes, or metadata.

Use the Logger API to log events or messages at different severity levels, such as debug, info, warn, error, and so on. You can also attach additional attributes or context to the log entries to provide more information.

OpenTelemetry also provides a standardized approach to propagating context within logs across distributed systems. This ensures that the relevant execution context is consistently captured and preserved, even when logs are generated by different components of the system.

The OpenTelemetry Logs data model also allows TraceId and SpanId to be included directly in log records.

Structured Logging Best Practices

Structured logging uses a consistent, parseable format (typically JSON) with key-value pairs instead of unstructured text messages. This makes logs easier to search, filter, and analyze.

Structured vs Unstructured Logs

Unstructured logging (harder to query):

text
2024-01-15 10:30:15 ERROR User john.doe@example.com failed login from IP 192.168.1.100 after 3 attempts
2024-01-15 10:30:16 INFO Request to /api/users completed in 245ms

Structured logging (easy to query and analyze):

json
{
  "timestamp": "2024-01-15T10:30:15Z",
  "level": "ERROR",
  "message": "User login failed",
  "user.email": "john.doe@example.com",
  "client.ip": "192.168.1.100",
  "auth.attempts": 3,
  "trace_id": "5b8efff798038103d269b633813fc60c",
  "span_id": "eee19b7ec3c1b174"
}

Key Fields to Include

Always include:

  • timestamp: ISO 8601 format with timezone
  • level: DEBUG, INFO, WARN, ERROR, FATAL
  • message: Human-readable description
  • trace_id and span_id: For correlation with traces
  • service.name: Which service emitted the log

Commonly useful:

  • user.id: For user-specific issues
  • request.id: For request tracking
  • error.type and error.message: For errors
  • duration: For performance-related logs
  • resource attributes: host.name, service.version, deployment.environment (see the sketch after this list)
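
As a rough sketch of how those resource attributes get attached, the Go SDK lets you configure them once on the logger provider so every exported record carries them. The exporter variable and attribute values below are placeholders:

go
import (
    "go.opentelemetry.io/otel/attribute"
    sdklog "go.opentelemetry.io/otel/sdk/log"
    "go.opentelemetry.io/otel/sdk/resource"
)

// exporter is assumed to be an already-constructed log exporter (e.g. OTLP).
res, err := resource.Merge(
    resource.Default(), // SDK defaults plus OTEL_RESOURCE_ATTRIBUTES
    resource.NewSchemaless(
        attribute.String("service.name", "my-service"),
        attribute.String("service.version", "1.2.3"),
        attribute.String("deployment.environment", "production"),
    ),
)
if err != nil {
    panic(err)
}

// Every log record emitted through this provider carries the resource attributes.
provider := sdklog.NewLoggerProvider(
    sdklog.WithResource(res),
    sdklog.WithProcessor(sdklog.NewBatchProcessor(exporter)),
)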

Implementation Examples

go Go (Zap)
import (
    "context"
    "time"

    "go.uber.org/zap"
    "go.opentelemetry.io/otel/trace"
)

logger, _ := zap.NewProduction()

// Structured logging with trace context
func handleRequest(ctx context.Context) {
    span := trace.SpanFromContext(ctx)
    spanCtx := span.SpanContext()

    logger.Info("processing request",
        zap.String("trace_id", spanCtx.TraceID().String()),
        zap.String("span_id", spanCtx.SpanID().String()),
        zap.String("user.id", "12345"),
        zap.String("http.method", "GET"),
        zap.String("http.route", "/api/users/:id"),
        zap.Int("http.status_code", 200),
        zap.Duration("duration", 245*time.Millisecond),
    )
}
python Python (structlog)
import structlog
from opentelemetry import trace

logger = structlog.get_logger()

def handle_request():
    span = trace.get_current_span()
    span_ctx = span.get_span_context()

    logger.info(
        "processing request",
        trace_id=format(span_ctx.trace_id, '032x'),
        span_id=format(span_ctx.span_id, '016x'),
        user_id="12345",
        http_method="GET",
        http_route="/api/users/:id",
        http_status_code=200,
        duration_ms=245
    )
js Node.js (winston)
const winston = require('winston');
const { trace } = require('@opentelemetry/api');

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [new winston.transports.Console()]
});

function handleRequest() {
  const span = trace.getActiveSpan();
  const spanCtx = span.spanContext();

  logger.info('processing request', {
    trace_id: spanCtx.traceId,
    span_id: spanCtx.spanId,
    user_id: '12345',
    http_method: 'GET',
    http_route: '/api/users/:id',
    http_status_code: 200,
    duration_ms: 245
  });
}
java Java (Logback)
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanContext;

Logger logger = LoggerFactory.getLogger(MyClass.class);

public void handleRequest() {
    Span span = Span.current();
    SpanContext spanCtx = span.getSpanContext();

    // Add trace context to MDC
    MDC.put("trace_id", spanCtx.getTraceId());
    MDC.put("span_id", spanCtx.getSpanId());

    logger.info("Processing request user.id={} http.method={} duration={}ms",
        "12345", "GET", 245);

    MDC.clear();
}

Structured Logging Guidelines

DO:

  • Use consistent key names across services (follow semantic conventions)
  • Use appropriate data types (numbers as numbers, not strings)
  • Include trace context in every log
  • Use hierarchical key names with dots: http.method, db.statement
  • Keep log messages concise and actionable

DON'T:

  • Don't log sensitive data (passwords, tokens, PII) without redaction
  • Don't use variable-length data as keys (use values instead; see the sketch after this list)
  • Don't duplicate information in both message and fields
  • Don't log at inappropriate levels (DEBUG in production can be expensive)
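
For instance, here is a small Go sketch (using zap, to match the earlier examples) of the variable-length-keys rule; the logger and field values are placeholders:

go
// ❌ Variable data baked into the key: every user creates a new field name,
// which makes the logs hard to index and impossible to query consistently.
logger.Info("cart updated", zap.Int("user_12345_items", 7))

// ✅ Fixed, hierarchical key names; the variable data goes into the values.
logger.Info("cart updated",
    zap.String("user.id", "12345"),
    zap.Int("cart.items", 7),
)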

Log-Trace Correlation

One of OpenTelemetry's most powerful features is automatic correlation between logs and traces. This allows you to jump from a trace span to all logs emitted during that span, or vice versa.

How Correlation Works

When you emit a log within an active trace span, OpenTelemetry automatically includes:

  • trace_id: Links log to the entire distributed trace
  • span_id: Links log to the specific operation
  • trace_flags: Indicates if the trace is sampled

This creates a bidirectional link:

  • From trace to logs: View all logs that occurred during a span
  • From log to trace: Jump to the full trace context for a log entry

Automatic Correlation with Log Bridges

go Go (otelzap)
import (
    "context"

    "github.com/uptrace/opentelemetry-go-extra/otelzap"
    "go.uber.org/zap"
)

// Wrap a zap logger with the otelzap bridge so records carry the trace context
zapLogger, _ := zap.NewProduction()
logger := otelzap.New(zapLogger)

// Logs automatically include trace context
func handleRequest(ctx context.Context) {
    ctx, span := tracer.Start(ctx, "handle_request")
    defer span.End()

    // This log is automatically correlated with the span
    logger.Ctx(ctx).Info("processing started",
        zap.String("user.id", "12345"),
    )

    // Do work...

    logger.Ctx(ctx).Info("processing completed")
}
python Python
from opentelemetry import trace
from opentelemetry._logs import set_logger_provider
from opentelemetry.sdk._logs import LoggerProvider, LoggingHandler
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor
import logging

# Set up OpenTelemetry logging
logger_provider = LoggerProvider()
set_logger_provider(logger_provider)
logger_provider.add_log_record_processor(
    BatchLogRecordProcessor(exporter)
)

# Attach to Python logging
handler = LoggingHandler(logger_provider=logger_provider)
logging.getLogger().addHandler(handler)

# Logs automatically correlated
tracer = trace.get_tracer(__name__)

def handle_request():
    with tracer.start_as_current_span("handle_request"):
        # Automatically includes trace context
        logging.info("Processing started", extra={"user.id": "12345"})
        # Do work...
        logging.info("Processing completed")
js Node.js (winston)
const winston = require('winston');
const { trace, context } = require('@opentelemetry/api');

const logger = winston.createLogger({
  format: winston.format.combine(
    winston.format((info) => {
      const span = trace.getActiveSpan();
      if (span) {
        const spanContext = span.spanContext();
        info.trace_id = spanContext.traceId;
        info.span_id = spanContext.spanId;
        info.trace_flags = spanContext.traceFlags;
      }
      return info;
    })(),
    winston.format.json()
  ),
  transports: [new winston.transports.Console()]
});

function handleRequest() {
  const span = tracer.startSpan('handle_request');
  context.with(trace.setSpan(context.active(), span), () => {
    // Logs automatically include trace context
    logger.info('Processing started', { user_id: '12345' });
    // Do work...
    logger.info('Processing completed');
    span.end();
  });
}
java Java (Logback)
// logback.xml configuration:
// <pattern>%d{HH:mm:ss.SSS} trace_id=%X{trace_id} span_id=%X{span_id} - %msg%n</pattern>

import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.SpanContext;
import org.slf4j.MDC;

// Use OpenTelemetry Java agent for automatic MDC injection
// Or manually inject:
Span span = Span.current();
SpanContext spanContext = span.getSpanContext();

MDC.put("trace_id", spanContext.getTraceId());
MDC.put("span_id", spanContext.getSpanId());

logger.info("Processing request");

MDC.clear();

Manual Correlation (Legacy Logs)

If you can't use OpenTelemetry log bridges, manually inject trace context:

go Go
func logWithContext(ctx context.Context, msg string) {
    span := trace.SpanFromContext(ctx)
    if span.SpanContext().IsValid() {
        spanCtx := span.SpanContext()
        log.Printf("%s trace_id=%s span_id=%s",
            msg,
            spanCtx.TraceID().String(),
            spanCtx.SpanID().String(),
        )
    } else {
        log.Printf("%s", msg)
    }
}
python Python
from opentelemetry import trace
import logging

def log_with_context(msg):
    span = trace.get_current_span()
    span_ctx = span.get_span_context()
    if span_ctx.is_valid:
        logging.info(
            f"{msg} trace_id={format(span_ctx.trace_id, '032x')} "
            f"span_id={format(span_ctx.span_id, '016x')}"
        )
    else:
        logging.info(msg)
js Node.js
const { trace } = require('@opentelemetry/api');

function logWithContext(msg) {
  const span = trace.getActiveSpan();
  if (span) {
    const spanCtx = span.spanContext();
    console.log(
      `${msg} trace_id=${spanCtx.traceId} span_id=${spanCtx.spanId}`
    );
  } else {
    console.log(msg);
  }
}

Querying Correlated Data

Once logs are correlated with traces, you can:

Find all logs for a trace:

sql
SELECT * FROM logs
WHERE trace_id = '5b8efff798038103d269b633813fc60c'
ORDER BY timestamp

Find traces related to error logs:

sql
SELECT DISTINCT trace_id FROM logs
WHERE level = 'ERROR'
  AND timestamp > now() - interval '1 hour'

View logs within a specific span:

sql
SELECT * FROM logs
WHERE span_id = 'eee19b7ec3c1b174'
  AND trace_id = '5b8efff798038103d269b633813fc60c'

Using the OpenTelemetry Logs API

While log bridges (integrating existing logging libraries) are the most common approach, you can also use the OpenTelemetry Logs API directly.

When to Use the Logs API Directly

  • Building a new application without legacy logging dependencies
  • Need maximum performance (fewer abstraction layers)
  • Want direct control over log record structure
  • Implementing custom logging behavior

Logs API Examples

go Go
import (
    "context"
    "time"

    "go.opentelemetry.io/otel/log"
    "go.opentelemetry.io/otel/log/global"
)

// Get logger from the globally registered provider
logger := global.GetLoggerProvider().Logger("my-service")

func handleRequest(ctx context.Context) {
    // Build and emit a log record
    var record log.Record
    record.SetTimestamp(time.Now())
    record.SetSeverity(log.SeverityInfo)
    record.SetBody(log.StringValue("Request processed successfully"))
    record.AddAttributes(
        log.String("user.id", "12345"),
        log.String("http.method", "GET"),
        log.Int("http.status_code", 200),
        log.Int64("duration_ms", 245),
    )
    logger.Emit(ctx, record)
}

// Error logging
func handleError(ctx context.Context, err error) {
    var record log.Record
    record.SetTimestamp(time.Now())
    record.SetSeverity(log.SeverityError)
    record.SetBody(log.StringValue("Request failed"))
    record.AddAttributes(
        log.String("error.type", "DatabaseError"),
        log.String("error.message", err.Error()),
    )
    logger.Emit(ctx, record)
}
python Python
from opentelemetry._logs import SeverityNumber, set_logger_provider
from opentelemetry.sdk._logs import LoggerProvider, LogRecord
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor

# Set up logger provider
logger_provider = LoggerProvider()
set_logger_provider(logger_provider)
logger_provider.add_log_record_processor(
    BatchLogRecordProcessor(exporter)
)

# Get logger
logger = logger_provider.get_logger("my-service")

# Emit log
logger.emit(
    LogRecord(
        body="Request processed successfully",
        severity_number=SeverityNumber.INFO,  # numeric value 9
        attributes={
            "user.id": "12345",
            "http.method": "GET",
            "http.status_code": 200,
            "duration_ms": 245,
        },
    )
)
js JavaScript
const { logs } = require('@opentelemetry/api-logs');

const logger = logs.getLogger('my-service');

function handleRequest() {
  logger.emit({
    severityNumber: 9, // INFO
    body: 'Request processed successfully',
    attributes: {
      'user.id': '12345',
      'http.method': 'GET',
      'http.status_code': 200,
      'duration_ms': 245
    }
  });
}

Log Severity Levels

OpenTelemetry defines standard severity levels:

| Severity | Number Range | Use Case |
| --- | --- | --- |
| TRACE | 1-4 | Very detailed debugging information |
| DEBUG | 5-8 | Debugging information |
| INFO | 9-12 | Informational messages |
| WARN | 13-16 | Warning messages |
| ERROR | 17-20 | Error events |
| FATAL | 21-24 | Critical errors causing shutdown |
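
If you are mapping an existing logger's levels onto these ranges, the Go log API exposes matching constants. The toOTelSeverity helper below is a hypothetical sketch, not part of the API:

go
import "go.opentelemetry.io/otel/log"

// toOTelSeverity maps common textual levels to OpenTelemetry severity numbers.
func toOTelSeverity(level string) log.Severity {
    switch level {
    case "trace":
        return log.SeverityTrace // 1
    case "debug":
        return log.SeverityDebug // 5
    case "info":
        return log.SeverityInfo // 9
    case "warn":
        return log.SeverityWarn // 13
    case "error":
        return log.SeverityError // 17
    case "fatal":
        return log.SeverityFatal // 21
    default:
        return log.SeverityUndefined // 0
    }
}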

Best Practices

What to Log

DO log:

  • Application errors: Exceptions, validation failures, timeouts
  • Business events: User registration, order placement, payment completion
  • Performance markers: Slow operations, cache misses, retry attempts
  • Security events: Authentication failures, authorization denials
  • State changes: Configuration updates, feature flag changes
  • External service calls: API requests, database queries (without sensitive data)

DON'T log:

  • Passwords or credentials: Never log authentication secrets
  • Personal Identifiable Information (PII): Email, phone numbers, addresses (unless required and compliant)
  • Payment card data: Credit card numbers, CVV codes
  • Session tokens or API keys: Authentication tokens should never be logged
  • Excessive debug information in production: Can impact performance and storage

Security Considerations

Redact sensitive data:

go Go
// ❌ Bad: Logging sensitive data
logger.Info("User login",
    zap.String("email", email),
    zap.String("password", password),  // Never log passwords!
)

// ✅ Good: Redact sensitive fields
logger.Info("User login",
    zap.String("email", maskEmail(email)),  // user@example.com → u***@example.com
    zap.String("user.id", userID),
)

func maskEmail(email string) string {
    parts := strings.Split(email, "@")
    if len(parts) != 2 {
        return "***"
    }
    return string(parts[0][0]) + "***@" + parts[1]
}
python Python
# ❌ Bad: Logging sensitive data
logger.info("User login", extra={
    "email": email,
    "password": password  # Never log passwords!
})

# ✅ Good: Redact sensitive fields
logger.info("User login", extra={
    "email": mask_email(email),  # user@example.com → u***@example.com
    "user_id": user_id
})

def mask_email(email):
    parts = email.split("@")
    if len(parts) != 2:
        return "***"
    return parts[0][0] + "***@" + parts[1]
js Node.js
// ❌ Bad: Logging sensitive data
logger.info('User login', {
  email: email,
  password: password  // Never log passwords!
});

// ✅ Good: Redact sensitive fields
logger.info('User login', {
  email: maskEmail(email),  // user@example.com → u***@example.com
  user_id: userId
});

function maskEmail(email) {
  const parts = email.split('@');
  if (parts.length !== 2) {
    return '***';
  }
  return parts[0][0] + '***@' + parts[1];
}

Use appropriate log levels (see the sketch after this list):

  • Production: INFO and above
  • Staging: DEBUG and above
  • Development: TRACE/DEBUG and above
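
A minimal Go sketch of applying these per-environment levels with zap; the APP_ENV variable name is an assumed convention, and zap has no TRACE level, so DEBUG covers development:

go
import (
    "os"

    "go.uber.org/zap"
    "go.uber.org/zap/zapcore"
)

func newLogger() (*zap.Logger, error) {
    cfg := zap.NewProductionConfig()

    // Default to DEBUG outside production; raise the floor to INFO in production.
    switch os.Getenv("APP_ENV") {
    case "production":
        cfg.Level = zap.NewAtomicLevelAt(zapcore.InfoLevel)
    default:
        cfg.Level = zap.NewAtomicLevelAt(zapcore.DebugLevel)
    }
    return cfg.Build()
}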

Performance Considerations

1. Avoid expensive operations in log statements:

go Go
// ❌ Bad: Expensive operation always runs
logger.Debug("User data: " + expensiveJSONMarshal(user))

// ✅ Good: Check level first
if logger.Level() <= zap.DebugLevel {
    logger.Debug("User data", zap.String("user_json", expensiveJSONMarshal(user)))
}
python Python
# ❌ Bad: Expensive operation always runs
logger.debug(f"User data: {expensive_json_marshal(user)}")

# ✅ Good: Check level first
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("User data", extra={"user_json": expensive_json_marshal(user)})
js Node.js
// ❌ Bad: Expensive operation always runs
logger.debug(`User data: ${expensiveJSONMarshal(user)}`);

// ✅ Good: Check level first
if (logger.isDebugEnabled()) {
  logger.debug('User data', { user_json: expensiveJSONMarshal(user) });
}

2. Use sampling for high-frequency logs:

go Go
var logCounter int64

func handleRequest() {
    count := atomic.AddInt64(&logCounter, 1)

    // Only log every 100th request
    if count % 100 == 0 {
        logger.Info("Sampled request log", zap.Int64("request_count", count))
    }
}
python Python
import threading

log_counter = 0
counter_lock = threading.Lock()

def handle_request():
    global log_counter
    with counter_lock:
        log_counter += 1
        count = log_counter

    # Only log every 100th request
    if count % 100 == 0:
        logger.info("Sampled request log", extra={"request_count": count})
js Node.js
let logCounter = 0;

function handleRequest() {
  logCounter++;

  // Only log every 100th request
  if (logCounter % 100 === 0) {
    logger.info('Sampled request log', { request_count: logCounter });
  }
}

3. Batch log exports:

go Go
// Configure batch processor
processor := log.NewBatchProcessor(
    exporter,
    log.WithMaxQueueSize(2048),
    log.WithExportTimeout(30*time.Second),
    log.WithExportMaxBatchSize(512),
)
python Python
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor

# Configure batch processor
processor = BatchLogRecordProcessor(
    exporter,
    max_queue_size=2048,
    export_timeout_millis=30000,
    max_export_batch_size=512
)
js Node.js
const { BatchLogRecordProcessor } = require('@opentelemetry/sdk-logs');

// Configure batch processor
const processor = new BatchLogRecordProcessor(exporter, {
  maxQueueSize: 2048,
  exportTimeoutMillis: 30000,
  maxExportBatchSize: 512
});

Log Retention and Storage

Set appropriate retention policies:

  • Hot storage (quick access): 7-30 days
  • Warm storage (slower access): 30-90 days
  • Cold storage (archive): 90+ days or compliance requirements

Control log volume:

go
// Drop verbose debug logs before they are exported
if environment == "production" {
    // Configure the SDK or Collector to only export INFO and above
}

Troubleshooting

Logs Not Appearing in Backend

Problem: Logs are generated but don't reach the observability backend.

Diagnosis:

bash
# 1. Check if logs are being generated locally
tail -f /var/log/app.log

# 2. Verify OpenTelemetry Collector is running
docker ps | grep otel-collector

# 3. Check collector logs for errors
docker logs otel-collector

# 4. Test connectivity to backend
curl -v https://api.uptrace.dev/v1/traces

Common causes:

  • Exporter not configured correctly
  • Network connectivity issues
  • Backend endpoint incorrect
  • Authentication credentials invalid
  • Firewall blocking OTLP port (typically 4317 or 4318)

Solutions:

yaml
# Verify collector configuration
exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317  # Correct endpoint
    headers:
      uptrace-dsn: "https://TOKEN@api.uptrace.dev/PROJECT_ID"  # Valid DSN

# Enable debug logging in collector
service:
  telemetry:
    logs:
      level: debug

# Test with console exporter first
exporters:
  logging:
    loglevel: debug

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [logging]  # See logs in collector output

Missing Trace Correlation

Problem: Logs appear but don't link to traces.

Common causes:

  • Trace context not propagated to logging framework
  • Log bridge not configured
  • Sampling causing traces to be dropped while logs are retained

Solutions:

go Go
// Ensure you're passing context to logger
logger.Ctx(ctx).Info("message")  // ✅ Correct
logger.Info("message")             // ❌ No context
python Python
# Ensure logging handler is configured
from opentelemetry.sdk._logs import LoggingHandler
logging.getLogger().addHandler(LoggingHandler())

Verify trace context in logs:

bash
# Check if trace_id is present in log output
grep "trace_id" /var/log/app.log

Performance Overhead

Problem: Logging causes application slowdown.

Diagnosis:

go Go
// Measure logging overhead
start := time.Now()
logger.Info("test message", zap.String("key", "value"))
fmt.Printf("Log operation took: %v\n", time.Since(start))
python Python
import time

# Measure logging overhead
start = time.time()
logger.info("test message", extra={"key": "value"})
print(f"Log operation took: {time.time() - start:.6f}s")
js Node.js
// Measure logging overhead
const start = Date.now();
logger.info('test message', { key: 'value' });
console.log(`Log operation took: ${Date.now() - start}ms`);

Solutions:

  1. Use asynchronous exporters:
go Go
processor := log.NewBatchProcessor(exporter)  // Async batching
// vs
processor := log.NewSimpleProcessor(exporter)  // Synchronous
python Python
from opentelemetry.sdk._logs.export import BatchLogRecordProcessor, SimpleLogRecordProcessor

processor = BatchLogRecordProcessor(exporter)  # Async batching
# vs
processor = SimpleLogRecordProcessor(exporter)  # Synchronous
js Node.js
const { BatchLogRecordProcessor, SimpleLogRecordProcessor } = require('@opentelemetry/sdk-logs');

const processor = new BatchLogRecordProcessor(exporter);  // Async batching
// vs
const processor = new SimpleLogRecordProcessor(exporter);  // Synchronous
  2. Reduce log level in production:
go Go
// Only ERROR and above in production
if env == "production" {
    logger = logger.WithOptions(zap.IncreaseLevel(zap.ErrorLevel))
}
python Python
import logging

# Only ERROR and above in production
if env == "production":
    logger.setLevel(logging.ERROR)
js Node.js
// Only ERROR and above in production
if (env === 'production') {
  logger.level = 'error';
}
  3. Sample high-frequency logs (shown in Best Practices section)
  4. Optimize attribute count:
go Go
// ❌ Too many attributes
logger.Info("msg", zap.String("a", "1"), zap.String("b", "2"), ..., zap.String("z", "26"))

// ✅ Only essential attributes
logger.Info("msg", zap.String("error.type", "timeout"), zap.Int("duration_ms", 5000))
python Python
# ❌ Too many attributes
logger.info("msg", extra={"a": "1", "b": "2", ..., "z": "26"})

# ✅ Only essential attributes
logger.info("msg", extra={"error.type": "timeout", "duration_ms": 5000})
js Node.js
// ❌ Too many attributes
logger.info('msg', { a: '1', b: '2', ..., z: '26' });

// ✅ Only essential attributes
logger.info('msg', { 'error.type': 'timeout', duration_ms: 5000 });

Parsing Errors in Collector

Problem: Collector can't parse log format.

Common causes:

  • JSON parsing errors
  • Timestamp format not recognized
  • Incorrect receiver configuration

Solutions:

yaml
receivers:
  filelog:
    include: [/var/log/app/*.log]
    operators:
      # Parse JSON logs
      - type: json_parser
        parse_from: body

      # Parse timestamp
      - type: time_parser
        parse_from: attributes.timestamp
        layout: '%Y-%m-%d %H:%M:%S.%f'  # Match your format

      # Handle parsing errors
      - type: regex_parser
        regex: '^(?P<timestamp>.*?) (?P<level>.*?) (?P<message>.*)'
        on_error: drop  # or 'send' to forward unparsed logs

Test parsing:

bash
# Use collector in debug mode to see parsing issues
./otelcol --config=config.yaml --set=service.telemetry.logs.level=debug

Timestamp Issues

Problem: Logs appear with incorrect timestamps.

Solutions:

yaml
receivers:
  filelog:
    operators:
      # Specify correct timezone
      - type: time_parser
        parse_from: attributes.time
        layout: '%Y-%m-%d %H:%M:%S'
        location: America/New_York  # Set your timezone

# Or use UTC everywhere
receivers:
  syslog:
    protocol: rfc3164
    location: UTC  # Consistent timezone

High Storage Costs

Problem: Log storage costs are too high.

Solutions:

  1. Implement sampling:
go Go
// Sample verbose logs
if level == DEBUG && rand.Float64() > 0.01 {  // Keep 1%
    return
}
python Python
import random

# Sample verbose logs
if level == logging.DEBUG and random.random() > 0.01:  # Keep 1%
    return
js Node.js
// Sample verbose logs
if (level === 'debug' && Math.random() > 0.01) {  // Keep 1%
  return;
}
  2. Set shorter retention:
yaml
# In your backend
retention:
  logs: 7d  # 7 days for logs
  traces: 30d  # 30 days for traces
  3. Filter unnecessary logs in collector:
yaml
processors:
  filter:
    logs:
      exclude:
        match_type: regexp
        record_attributes:
          - key: message
            value: "^health check.*"  # Drop health check logs
  4. Aggregate similar logs:
yaml
processors:
  groupbyattrs:
    keys:
      - level
      - error.type

OpenTelemetry Collector

OpenTelemetry Collector is a flexible and scalable agent for collecting, processing, and exporting telemetry data. It simplifies the task of receiving and managing telemetry data from multiple sources and enables the export of data to multiple backends or observability systems.

OpenTelemetry Collector supports multiple log sources, including application logs, log files, logging libraries, and third-party logging systems. It provides integrations with popular logging frameworks and libraries, enabling seamless ingestion of log data.

The Collector can also transform and enrich log data: you can modify log attributes, add metadata, or attach additional contextual information to make logs more meaningful for analysis and troubleshooting.

Once collected and processed, OpenTelemetry Collector can export log data to various logging backends or systems. It supports exporting logs to popular logging platforms, storage systems, or log management tools for long-term storage, analysis, and visualization.

Examples

Tailing a simple JSON file

To collect and send JSON logs to Uptrace, add the following to your OpenTelemetry Collector configuration file:

yaml
receivers:
  filelog:
    include: [/var/log/myservice/*.json]
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'

processors:
  batch:

exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317
    headers:
      uptrace-dsn: '<FIXME>'

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp/uptrace]

See filelogreceiver for details.

Syslog

To collect and send syslog logs to Uptrace, add the following to your OpenTelemetry Collector configuration file:

yaml
receivers:
  syslog:
    tcp:
      listen_address: '0.0.0.0:54527'
    protocol: rfc3164
    location: UTC # specify server timezone here
    operators:
      - type: move
        from: attributes.message
        to: body

processors:
  batch:

exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317
    headers:
      uptrace-dsn: '<FIXME>'

service:
  pipelines:
    logs:
      receivers: [syslog]
      processors: [batch]
      exporters: [otlp/uptrace]

Then configure Rsyslog to forward logs to the OpenTelemetry Collector by adding the following at the end of the /etc/rsyslog.conf file:

text
*.* action(type="omfwd" target="0.0.0.0" port="54527" protocol="tcp"
           action.resumeRetryCount="10"
           queue.type="linkedList" queue.size="10000")

Lastly, restart the Rsyslog service:

shell
sudo systemctl restart rsyslog.service

See syslogreceiver for details.

Kubernetes Logs

To collect and send Kubernetes logs to Uptrace, add the following to your OpenTelemetry Collector configuration file:

yaml
receivers:
  filelog:
    include:
      - /var/log/pods/*/*/*.log
    include_file_name: false
    include_file_path: true
    start_at: beginning
    operators:
      # Determine which container runtime produced the log line
      - type: router
        id: get-format
        routes:
          - expr: body matches "^\\{"
            output: parser-docker
          - expr: body matches "^[^ Z]+ "
            output: parser-crio
          - expr: body matches "^[^ Z]+Z"
            output: parser-containerd
      - type: regex_parser
        id: parser-crio
        regex: ^(?P<time>[^ Z]+) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        output: extract_metadata_from_filepath
        timestamp:
          parse_from: attributes.time
          layout_type: gotime
          layout: 2006-01-02T15:04:05.999999999Z07:00
      - type: regex_parser
        id: parser-containerd
        regex: ^(?P<time>[^ ^Z]+Z) (?P<stream>stdout|stderr) (?P<logtag>[^ ]*) ?(?P<log>.*)$
        output: extract_metadata_from_filepath
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: json_parser
        id: parser-docker
        output: extract_metadata_from_filepath
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%dT%H:%M:%S.%LZ'
      - type: regex_parser
        id: extract_metadata_from_filepath
        regex: ^.*\/(?P<namespace>[^_]+)_(?P<pod_name>[^_]+)_(?P<uid>[a-f0-9\-]+)\/(?P<container_name>[^\._]+)\/(?P<restart_count>\d+)\.log$
        parse_from: attributes["log.file.path"]
      - from: attributes.stream
        to: attributes["log.iostream"]
        type: move
      - from: attributes.container_name
        to: resource["k8s.container.name"]
        type: move
      - from: attributes.namespace
        to: resource["k8s.namespace.name"]
        type: move
      - from: attributes.pod_name
        to: resource["k8s.pod.name"]
        type: move
      - from: attributes.restart_count
        to: resource["k8s.container.restart_count"]
        type: move
      - from: attributes.uid
        to: resource["k8s.pod.uid"]
        type: move
      - from: attributes.log
        to: body
        type: move

processors:
  batch:

exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317
    headers:
      uptrace-dsn: '<FIXME>'

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [otlp/uptrace]

See filelogreceiver for details.

Kubernetes Events

To collect and send Kubernetes events to Uptrace, add the following to your OpenTelemetry Collector configuration file:

yaml
receivers:
  k8s_events:
    auth_type: serviceAccount

processors:
  batch:

exporters:
  otlp/uptrace:
    endpoint: api.uptrace.dev:4317
    headers:
      uptrace-dsn: '<FIXME>'

service:
  pipelines:
    logs:
      receivers: [k8s_events]
      processors: [batch]
      exporters: [otlp/uptrace]

See k8seventsreceiver for details.

Golang Slog

OpenTelemetry Slog is a bridge between the OpenTelemetry observability framework and the popular slog logging library in Go. It allows you to integrate your existing slog logs with OpenTelemetry traces and metrics, providing a unified view of your application's behavior.

The simplest OpenTelemetry Slog configuration looks like this:

go
import (
    "context"
    "log/slog"

    "go.opentelemetry.io/contrib/bridges/otelslog"
    "go.opentelemetry.io/otel/exporters/stdout/stdoutlog"
    "go.opentelemetry.io/otel/log/global"
    "go.opentelemetry.io/otel/sdk/log"
)

ctx := context.Background()

// Export records to stdout for demonstration purposes
exp, err := stdoutlog.New()
if err != nil {
    panic(err)
}

processor := log.NewSimpleProcessor(exp)
provider := log.NewLoggerProvider(log.WithProcessor(processor))
defer provider.Shutdown(ctx)

global.SetLoggerProvider(provider)

logger := otelslog.NewLogger("app_or_package_name")
logger.ErrorContext(ctx, "hello world", slog.String("error", "error message"))

See GitHub example for details.

OpenTelemetry Backend

Once the log data is exported to your logging backend, you can process and analyze the logs using the platform's features. This can include filtering, searching, aggregating, and visualizing the logs to gain insight into your application's behavior and troubleshoot issues.

See OpenTelemetry backend for the list of compatible backends.

Conclusion

Logs are an essential part of observability, and OpenTelemetry can work alongside logging frameworks and libraries to provide a comprehensive observability solution.