OpenTelemetry Ruby Metrics API

This document shows you how to use the OpenTelemetry Ruby Metrics API to measure application performance. To learn how to install and configure the OpenTelemetry Ruby SDK, see Getting started with OpenTelemetry Ruby.

If you are not familiar with metrics terminology such as timeseries or additive/synchronous/asynchronous instruments, read the introduction to OpenTelemetry Metrics first.

Prerequisites

Before using the Metrics API, ensure you have the required packages installed:

bash
gem install opentelemetry-api opentelemetry-sdk
gem install opentelemetry-metrics-api opentelemetry-metrics-sdk

Or add to your Gemfile:

ruby
gem 'opentelemetry-api'
gem 'opentelemetry-sdk'
gem 'opentelemetry-metrics-api'
gem 'opentelemetry-metrics-sdk'

Getting Started

To get started with metrics, you need to create a meter:

ruby
require 'opentelemetry/metrics'

meter = OpenTelemetry.meter_provider.meter('my_app_or_gem', '1.0.0')

Using the meter, you can create instruments to measure performance. The simplest Counter instrument looks like this:

ruby
counter = meter.create_counter(
  'requests_total',
  description: 'Total number of requests processed',
  unit: '1'
)

1000.times do |i|
  counter.add(1, attributes: { 'status' => 'success', 'method' => 'GET' })

  if i % 10 == 0
    # Pause so a periodic metric reader has a chance to export
    sleep(0.1)
  end
end

Metric Instruments

OpenTelemetry provides several types of instruments to capture different kinds of measurements. Each instrument serves a specific purpose and has distinct characteristics.
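
For quick reference, here is a minimal sketch of the six instrument constructors used throughout this guide, all created from a single meter. The names and units are illustrative, and the empty-array lambdas are placeholders for the real callbacks shown in the sections below.

ruby
meter = OpenTelemetry.meter_provider.meter('instrument_overview', '1.0.0')

# Synchronous instruments: values are recorded inline with application code
requests  = meter.create_counter('requests_total', unit: '1')              # monotonic sum
in_flight = meter.create_up_down_counter('requests_in_flight', unit: '1')  # sum that can decrease
latency   = meter.create_histogram('request_duration_seconds', unit: 's')  # distribution of values

# Asynchronous (observable) instruments: values are reported by callbacks at collection time
cpu_load   = meter.create_observable_gauge('cpu_load', unit: '1', callback: -> { [] })
bytes_read = meter.create_observable_counter('bytes_read_total', unit: 'By', callback: -> { [] })
queue_len  = meter.create_observable_up_down_counter('queue_length', unit: '1', callback: -> { [] })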

Counter

Counter is a synchronous instrument that measures additive non-decreasing values, representing cumulative totals like the number of requests, errors, or completed tasks.

ruby
require 'opentelemetry/metrics'

meter = OpenTelemetry.meter_provider.meter('my_app_or_gem', '1.0.0')

# Constants are used here because top-level local variables
# are not visible inside methods defined with `def`.
HTTP_REQUESTS_COUNTER = meter.create_counter(
  'http_requests_total',
  description: 'Total number of HTTP requests',
  unit: '1'
)

ERROR_COUNTER = meter.create_counter(
  'http_errors_total',
  description: 'Total number of HTTP errors',
  unit: '1'
)

def handle_request(method, endpoint, status_code)
  # Record the request
  HTTP_REQUESTS_COUNTER.add(1, attributes: {
    'method' => method,
    'endpoint' => endpoint,
    'status_code' => status_code.to_s
  })

  # Record an error if applicable
  if status_code >= 400
    error_type = status_code < 500 ? 'client_error' : 'server_error'
    ERROR_COUNTER.add(1, attributes: {
      'method' => method,
      'endpoint' => endpoint,
      'error_type' => error_type
    })
  end
end

# Example usage
handle_request('GET', '/api/users', 200)
handle_request('POST', '/api/users', 201)
handle_request('GET', '/api/users/999', 404)

UpDownCounter

UpDownCounter is a synchronous instrument that measures additive values that can both increase and decrease, such as the number of active connections or items in a queue.

ruby
active_connections = meter.create_up_down_counter(
  'database_connections_active',
  description: 'Number of active database connections',
  unit: '1'
)

queue_size = meter.create_up_down_counter(
  'task_queue_size',
  description: 'Number of items in the task queue',
  unit: '1'
)

class ConnectionPool
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('connection_pool', '1.0.0')
    @active_connections = @meter.create_up_down_counter(
      'connections_active',
      description: 'Active database connections',
      unit: '1'
    )
  end

  def acquire_connection(database_name)
    # Connection established
    @active_connections.add(1, attributes: {
      'database' => database_name,
      'pool' => 'main'
    })

    begin
      yield
    ensure
      # Connection released
      @active_connections.add(-1, attributes: {
        'database' => database_name,
        'pool' => 'main'
      })
    end
  end
end

class TaskQueue
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('task_queue', '1.0.0')
    @queue_size = @meter.create_up_down_counter(
      'queue_size',
      description: 'Number of items in queue',
      unit: '1'
    )
  end

  def enqueue(task, priority: 'normal')
    # Add task to queue
    perform_enqueue(task)

    @queue_size.add(1, attributes: {
      'queue' => 'tasks',
      'priority' => priority
    })
  end

  def dequeue(priority: 'normal')
    task = perform_dequeue

    if task
      @queue_size.add(-1, attributes: {
        'queue' => 'tasks',
        'priority' => priority
      })
    end

    task
  end

  private

  def perform_enqueue(task)
    # Queue implementation
  end

  def perform_dequeue
    # Queue implementation
  end
end

# Usage
pool = ConnectionPool.new
pool.acquire_connection('users_db') do
  # Database operations
end

queue = TaskQueue.new
queue.enqueue({ job: 'send_welcome_email' }, priority: 'high')
task = queue.dequeue(priority: 'high')

Histogram

Histogram is a synchronous instrument that measures the statistical distribution of values, such as request latencies or response sizes, grouping them into buckets.

ruby
request_duration = meter.create_histogram(
  'http_request_duration_seconds',
  description: 'HTTP request duration in seconds',
  unit: 's'
)

response_size = meter.create_histogram(
  'http_response_size_bytes',
  description: 'HTTP response size in bytes',
  unit: 'By'
)

class HttpHandler
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('http_handler', '1.0.0')
    @request_duration = @meter.create_histogram(
      'request_duration_seconds',
      description: 'Request duration',
      unit: 's'
    )
    @response_size = @meter.create_histogram(
      'response_size_bytes',
      description: 'Response size',
      unit: 'By'
    )
  end

  def handle_request(method, endpoint)
    start_time = Time.now

    begin
      # Simulate request processing
      processing_time = rand(0.01..0.5)
      sleep(processing_time)

      # Simulate response
      response_data = 'x' * rand(100..5000)
      status_code = 200

      # Record metrics
      duration = Time.now - start_time
      @request_duration.record(duration, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => status_code.to_s
      })

      @response_size.record(response_data.length, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'content_type' => 'application/json'
      })

      { status: status_code, body: response_data }
    rescue StandardError => e
      duration = Time.now - start_time
      @request_duration.record(duration, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => '500'
      })
      raise
    end
  end
end

# Usage
handler = HttpHandler.new
response = handler.handle_request('GET', '/api/users')
response = handler.handle_request('POST', '/api/users')

Observable Gauge

Observable Gauge is an asynchronous instrument that measures non-additive values at a point in time, such as CPU usage, memory consumption, or temperature readings.

ruby
class SystemMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('system_metrics', '1.0.0')

    @system_gauge = @meter.create_observable_gauge(
      'system_resource_usage',
      description: 'System resource utilization',
      unit: '1',
      callback: method(:collect_system_metrics)
    )

    @app_gauge = @meter.create_observable_gauge(
      'application_metrics',
      description: 'Application-specific metrics',
      unit: '1',
      callback: method(:collect_app_metrics)
    )
  end

  private

  def collect_system_metrics
    measurements = []

    begin
      # 1-minute load average (requires the 'sys-cpu' gem or similar)
      if defined?(Sys::CPU)
        cpu_load = Sys::CPU.load_avg.first
        measurements << {
          value: cpu_load,
          attributes: { 'resource' => 'cpu', 'unit' => 'load_average' }
        }
      end

      # Memory usage (requires 'sys-proctable' gem or similar)
      if defined?(Sys::ProcTable)
        process = Sys::ProcTable.ps(pid: Process.pid).first
        if process
          memory_mb = process.rss / 1024 / 1024
          measurements << {
            value: memory_mb,
            attributes: { 'resource' => 'memory', 'unit' => 'megabytes' }
          }
        end
      end

      # Simple Ruby process info
      measurements << {
        value: Process.pid,
        attributes: { 'resource' => 'process', 'unit' => 'pid' }
      }

    rescue StandardError => e
      # Log error but don't fail metric collection
      puts "Error collecting system metrics: #{e.message}"
    end

    measurements
  end

  def collect_app_metrics
    measurements = []

    begin
      # Current timestamp
      measurements << {
        value: Time.now.to_f,
        attributes: { 'metric' => 'last_update', 'unit' => 'timestamp' }
      }

      # Active threads
      active_threads = Thread.list.count { |t| t.status == 'run' }
      measurements << {
        value: active_threads,
        attributes: { 'metric' => 'active_threads', 'unit' => 'count' }
      }

      # Object count (Ruby-specific)
      measurements << {
        value: ObjectSpace.count_objects[:TOTAL],
        attributes: { 'metric' => 'objects_total', 'unit' => 'count' }
      }

    rescue StandardError => e
      puts "Error collecting app metrics: #{e.message}"
    end

    measurements
  end
end

# Usage
system_metrics = SystemMetrics.new
# Metrics will be collected automatically by the SDK

Observable Counter

Observable Counter is an asynchronous instrument that measures monotonically increasing values, such as total bytes read or CPU time consumed.

ruby
class ProcessMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('process_metrics', '1.0.0')

    @process_counter = @meter.create_observable_counter(
      'process_resource_usage',
      description: 'Process resource usage counters',
      unit: '1',
      callback: method(:collect_process_metrics)
    )

    @io_counter = @meter.create_observable_counter(
      'process_io_usage',
      description: 'Process I/O usage counters',
      unit: '1',
      callback: method(:collect_io_metrics)
    )
  end

  private

  def collect_process_metrics
    measurements = []

    begin
      # Process times
      times = Process.times
      measurements << {
        value: times.utime,
        attributes: { 'cpu_type' => 'user', 'unit' => 'seconds' }
      }
      measurements << {
        value: times.stime,
        attributes: { 'cpu_type' => 'system', 'unit' => 'seconds' }
      }

      # Ruby VM stats
      if GC.respond_to?(:stat)
        gc_stats = GC.stat
        measurements << {
          value: gc_stats[:count] || 0,
          attributes: { 'resource' => 'gc_runs', 'unit' => 'count' }
        }
        measurements << {
          value: gc_stats[:total_allocated_objects] || 0,
          attributes: { 'resource' => 'allocated_objects', 'unit' => 'count' }
        }
      end

    rescue StandardError => e
      puts "Error collecting process metrics: #{e.message}"
    end

    measurements
  end

  def collect_io_metrics
    measurements = []

    begin
      # File descriptors (Unix-like systems only)
      if RUBY_PLATFORM !~ /mswin|mingw|cygwin/
        begin
          fd_count = Dir.glob("/proc/#{Process.pid}/fd/*").length
          measurements << {
            value: fd_count,
            attributes: { 'resource' => 'file_descriptors', 'unit' => 'count' }
          }
        rescue Errno::ENOENT, Errno::EACCES
          # /proc not available or accessible
        end
      end

      # Ruby-specific: loaded features count
      measurements << {
        value: $LOADED_FEATURES.length,
        attributes: { 'resource' => 'loaded_features', 'unit' => 'count' }
      }

    rescue StandardError => e
      puts "Error collecting I/O metrics: #{e.message}"
    end

    measurements
  end
end

# Usage
process_metrics = ProcessMetrics.new
# Metrics will be collected automatically

Observable UpDownCounter

Observable UpDownCounter is an asynchronous instrument that measures additive values that can increase or decrease, measured at observation time.

ruby
class QueueMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('queue_metrics', '1.0.0')

    # Simulated queue state
    @message_queues = {
      'email' => { size: 0, workers: 0 },
      'sms' => { size: 0, workers: 0 },
      'push' => { size: 0, workers: 0 }
    }

    @queue_gauge = @meter.create_observable_up_down_counter(
      'message_queue_status',
      description: 'Message queue status metrics',
      unit: '1',
      callback: method(:collect_queue_metrics)
    )

    @connection_gauge = @meter.create_observable_up_down_counter(
      'connection_pool_status',
      description: 'Connection pool status',
      unit: '1',
      callback: method(:collect_connection_metrics)
    )
  end

  def update_queue_size(queue_name, change)
    @message_queues[queue_name][:size] += change if @message_queues[queue_name]
  end

  def update_workers(queue_name, change)
    @message_queues[queue_name][:workers] += change if @message_queues[queue_name]
  end

  private

  def collect_queue_metrics
    measurements = []

    @message_queues.each do |queue_name, stats|
      # Queue size (can go up and down)
      measurements << {
        value: stats[:size],
        attributes: { 'queue' => queue_name, 'metric' => 'size' }
      }

      # Active workers (can go up and down)
      measurements << {
        value: stats[:workers],
        attributes: { 'queue' => queue_name, 'metric' => 'workers' }
      }
    end

    measurements
  end

  def collect_connection_metrics
    measurements = []

    # Simulate connection pool status
    pools = {
      'database' => { 'active' => 5, 'idle' => 3, 'max' => 10 },
      'redis' => { 'active' => 2, 'idle' => 8, 'max' => 10 },
      'elasticsearch' => { 'active' => 1, 'idle' => 4, 'max' => 5 }
    }

    pools.each do |pool_name, stats|
      stats.each do |state, count|
        measurements << {
          value: count,
          attributes: { 'pool' => pool_name, 'state' => state }
        }
      end
    end

    measurements
  end
end

# Usage
queue_metrics = QueueMetrics.new

# Simulate queue operations
queue_metrics.update_queue_size('email', 5)
queue_metrics.update_workers('email', 2)

# Metrics will be collected automatically by the SDK

Working with Attributes

Attributes provide contextual information that makes metrics more useful for analysis and filtering.

Adding Attributes to Measurements

ruby
# Create various counters and histograms
meter = OpenTelemetry.meter_provider.meter('api_service', '1.0.0')

api_requests = meter.create_counter('api_requests_total', description: 'Total API requests')
request_duration = meter.create_histogram('request_duration_seconds', description: 'Request duration')

class ApiHandler
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('api_handler', '1.0.0')
    @api_requests = @meter.create_counter('api_requests_total')
    @request_duration = @meter.create_histogram('request_duration_seconds')
  end

  def handle_request(method, endpoint, user_type, region)
    start_time = Time.now

    begin
      # Simulate request processing
      processing_time = rand(0.01..0.3)
      sleep(processing_time)

      # Record successful request with detailed attributes
      @api_requests.add(1, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => '200',
        'user_type' => user_type,
        'region' => region,
        'cache_hit' => 'false'
      })

      # Record duration
      duration = Time.now - start_time
      @request_duration.record(duration, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => '200'
      })

      { status: 200, message: 'Success' }
    rescue StandardError => e
      # Record error
      @api_requests.add(1, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => '500',
        'user_type' => user_type,
        'region' => region,
        'error_type' => e.class.name
      })

      duration = Time.now - start_time
      @request_duration.record(duration, attributes: {
        'method' => method,
        'endpoint' => endpoint,
        'status_code' => '500'
      })

      raise
    end
  end
end

# Example usage
handler = ApiHandler.new
handler.handle_request('GET', '/api/users', 'premium', 'us-east-1')
handler.handle_request('POST', '/api/orders', 'free', 'eu-west-1')

Attribute Best Practices

Use meaningful attributes that provide valuable differentiation without creating excessive cardinality:

ruby
# Good: Low cardinality attributes
# (a constant is used so the instrument is visible inside top-level methods)
HTTP_REQUESTS = meter.create_counter('http_requests_total')

# Record a request with low-cardinality attributes
def record_request(method, status_code, endpoint_category)
  HTTP_REQUESTS.add(1, attributes: {
    'method' => method,                              # Limited values: GET, POST, PUT, DELETE
    'status_class' => "#{status_code / 100}xx",      # Grouped: 2xx, 3xx, 4xx, 5xx
    'endpoint_category' => endpoint_category         # Grouped: api, static, health
  })
end

# Categorize endpoints to reduce cardinality
def categorize_endpoint(endpoint)
  case endpoint
  when %r{^/api/}
    'api'
  when %r{^/static/}
    'static'
  when '/health'
    'health'
  when %r{^/admin/}
    'admin'
  else
    'other'
  end
end

# Example usage
record_request('GET', 200, 'api')
record_request('POST', 201, 'api')

# Avoid: High cardinality attributes
# Example of what NOT to do - high-cardinality attributes
def bad_example(method, status_code, full_url, user_id, timestamp)
  # DON'T DO THIS - creates too many unique metric series
  HTTP_REQUESTS.add(1, attributes: {
    'method' => method,
    'status_code' => status_code.to_s,    # 50+ possible values
    'full_url' => full_url,               # Thousands of unique URLs
    'user_id' => user_id,                 # Thousands of users
    'timestamp' => timestamp.to_s         # Infinite unique values
  })
  # This could create millions of unique metric series!
end

Recording Measurements

Synchronous Measurements

Synchronous instruments are recorded inline with application logic:

ruby
class OrderProcessor
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('order_processor', '1.0.0')

    @operation_counter = @meter.create_counter('operations_total')
    @operation_duration = @meter.create_histogram('operation_duration_seconds')
    @error_counter = @meter.create_counter('operation_errors_total')
  end

  def process_order(order_type, user_id)
    start_time = Time.now

    begin
      # Increment operation counter
      @operation_counter.add(1, attributes: {
        'operation' => order_type,
        'status' => 'started'
      })

      # Simulate operation
      processing_time = rand(0.1..1.0)
      sleep(processing_time)

      # Simulate potential failure
      if rand < 0.1  # 10% failure rate
        raise StandardError, 'Order processing failed'
      end

      # Record successful completion
      @operation_counter.add(1, attributes: {
        'operation' => order_type,
        'status' => 'completed'
      })

      "Order #{order_type} processed successfully"
    rescue StandardError => e
      # Record error
      @error_counter.add(1, attributes: {
        'operation' => order_type,
        'error_type' => e.class.name,
        'error_message' => e.message[0..50]  # Truncate to avoid high cardinality
      })

      @operation_counter.add(1, attributes: {
        'operation' => order_type,
        'status' => 'failed'
      })

      raise
    ensure
      # Always record duration
      duration = Time.now - start_time
      @operation_duration.record(duration, attributes: {
        'operation' => order_type
      })
    end
  end
end

# Example usage
processor = OrderProcessor.new

begin
  result = processor.process_order('premium_order', 'user123')
  puts result
rescue StandardError => e
  puts "Operation failed: #{e.message}"
end

Asynchronous Measurements

Asynchronous instruments use callbacks that are invoked during metric collection:

ruby
class AsyncMetricsCollector
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('async_collector', '1.0.0')

    # Global state for demonstration
    @system_stats = {
      'cpu_usage' => 0.0,
      'memory_usage' => 0.0,
      'disk_usage' => 0.0,
      'network_connections' => 0
    }

    @queue_stats = {
      'email' => { 'size' => 0, 'processed' => 0 },
      'sms' => { 'size' => 0, 'processed' => 0 },
      'push' => { 'size' => 0, 'processed' => 0 }
    }

    # Create observable instruments
    @system_gauge = @meter.create_observable_gauge(
      'system_metrics',
      description: 'System performance metrics',
      unit: '1',
      callback: method(:collect_system_metrics)
    )

    @queue_counter = @meter.create_observable_up_down_counter(
      'queue_metrics',
      description: 'Queue status metrics',
      unit: '1',
      callback: method(:collect_queue_metrics)
    )
  end

  def update_system_stats
    # Simulate system stats updates (in real app, this would call actual system APIs)
    @system_stats['cpu_usage'] = rand(0.0..100.0)
    @system_stats['memory_usage'] = rand(20.0..80.0)
    @system_stats['disk_usage'] = rand(10.0..90.0)
    @system_stats['network_connections'] = rand(10..100)
  end

  def simulate_queue_activity
    @queue_stats.each do |queue_name, stats|
      # Add items to queue
      new_items = rand(0..5)
      stats['size'] += new_items

      # Process items from queue
      processed = [stats['size'], rand(0..3)].min
      stats['size'] -= processed
      stats['processed'] += processed
    end
  end

  private

  def collect_system_metrics
    measurements = []

    update_system_stats

    @system_stats.each do |metric_name, value|
      measurements << {
        value: value,
        attributes: { 'metric' => metric_name }
      }
    end

    measurements
  end

  def collect_queue_metrics
    measurements = []

    simulate_queue_activity

    @queue_stats.each do |queue_name, stats|
      # Queue size (up/down counter)
      measurements << {
        value: stats['size'],
        attributes: { 'queue' => queue_name, 'metric' => 'size' }
      }

      # Total processed (counter)
      measurements << {
        value: stats['processed'],
        attributes: { 'queue' => queue_name, 'metric' => 'processed' }
      }
    end

    measurements
  end
end

# Usage
collector = AsyncMetricsCollector.new

# In a real application, you might run this in a background thread
# Thread.new do
#   loop do
#     sleep(30)  # Metrics will be collected automatically by the SDK
#   end
# end

Practical Examples

HTTP Server Metrics

ruby
# Comprehensive HTTP server metrics collection
class HttpServerMetrics

  def initialize
    @meter = OpenTelemetry.meter_provider.meter('http_server', '1.0.0')

    @request_counter = @meter.create_counter(
      'http_requests_total',
      description: 'Total HTTP requests',
      unit: '1'
    )

    @request_duration = @meter.create_histogram(
      'http_request_duration_seconds',
      description: 'HTTP request duration',
      unit: 's'
    )

    @active_requests = @meter.create_up_down_counter(
      'http_requests_active',
      description: 'Active HTTP requests',
      unit: '1'
    )

    @response_size = @meter.create_histogram(
      'http_response_size_bytes',
      description: 'HTTP response size',
      unit: 'By'
    )

    @error_counter = @meter.create_counter(
      'http_errors_total',
      description: 'Total HTTP errors',
      unit: '1'
    )
  end

  # Record metrics for an HTTP request
  def record_request(method, route, status_code, duration, response_size)
    attributes = {
      'method' => method,
      'route' => route,
      'status_code' => status_code.to_s
    }

    # Record request count
    @request_counter.add(1, attributes: attributes)

    # Record duration
    @request_duration.record(duration, attributes: attributes)

    # Record response size
    @response_size.record(response_size, attributes: attributes)

    # Record errors
    if status_code >= 400
      error_attributes = attributes.merge({
        'error_type' => status_code < 500 ? 'client_error' : 'server_error'
      })
      @error_counter.add(1, attributes: error_attributes)
    end
  end

  # Track the start of an active request
  def start_request(method, route)
    @active_requests.add(1, attributes: {
      'method' => method,
      'route' => route
    })
  end

  # Track the end of an active request
  def end_request(method, route)
    @active_requests.add(-1, attributes: {
      'method' => method,
      'route' => route
    })
  end
end

# Rack middleware example
class MetricsMiddleware
  def initialize(app)
    @app = app
    @metrics = HttpServerMetrics.new
  end

  def call(env)
    request = Rack::Request.new(env)
    method = request.request_method
    route = extract_route(env)

    start_time = Time.now
    @metrics.start_request(method, route)

    begin
      status, headers, body = @app.call(env)

      # Approximate the response size (assumes an Array-style Rack body)
      response_size = 0
      if body.respond_to?(:each)
        body.each { |chunk| response_size += chunk.to_s.bytesize }
      else
        response_size = body.to_s.bytesize
      end

      # Record metrics
      duration = Time.now - start_time
      @metrics.record_request(method, route, status.to_i, duration, response_size)

      [status, headers, body]
    ensure
      @metrics.end_request(method, route)
    end
  end

  private

  def extract_route(env)
    # Extract route pattern, fallback to path
    env['HTTP_X_ROUTE_PATTERN'] || env['PATH_INFO'] || '/'
  end
end

# Usage with Sinatra
require 'sinatra'
require 'json'

use MetricsMiddleware

get '/users/:id' do
  # Your application logic
  { user_id: params[:id] }.to_json
end

Database Connection Pool Metrics

ruby
# Database connection pool metrics
class DatabasePoolMetrics

  def initialize
    @meter = OpenTelemetry.meter_provider.meter('database_pool', '1.0.0')

    @query_counter = @meter.create_counter(
      'db_queries_total',
      description: 'Total database queries',
      unit: '1'
    )

    @query_duration = @meter.create_histogram(
      'db_query_duration_seconds',
      description: 'Database query duration',
      unit: 's'
    )

    # Simulated connection pool state
    @pool_stats = {
      'active' => 0,
      'idle' => 10,
      'max' => 20,
      'total_created' => 10,
      'total_closed' => 0
    }

    @connection_pool_gauge = @meter.create_observable_gauge(
      'db_connection_pool_status',
      description: 'Database connection pool status',
      unit: '1',
      callback: method(:collect_pool_metrics)
    )

    @mutex = Mutex.new
  end

  # Execute a database query with metrics
  def execute_query(query_type, table)
    start_time = Time.now

    # Simulate getting a connection from the pool
    @mutex.synchronize do
      if @pool_stats['idle'] > 0
        @pool_stats['idle'] -= 1
      else
        # Would normally wait for a connection or create a new one
        sleep(0.01)  # Simulate wait
        @pool_stats['total_created'] += 1
      end
      @pool_stats['active'] += 1
    end

    begin
      # Simulate query execution
      execution_time = rand(0.001..0.1)
      sleep(execution_time)

      # Record metrics
      duration = Time.now - start_time
      attributes = {
        'query_type' => query_type,
        'table' => table
      }

      @query_counter.add(1, attributes: attributes)
      @query_duration.record(duration, attributes: attributes)

      "Query #{query_type} on #{table} completed"
    ensure
      # Return connection to pool
      @mutex.synchronize do
        @pool_stats['active'] -= 1
        @pool_stats['idle'] += 1
      end
    end
  end

  private

  def collect_pool_metrics
    measurements = []

    @mutex.synchronize do
      @pool_stats.each do |state, count|
        measurements << {
          value: count,
          attributes: { 'state' => state }
        }
      end
    end

    measurements
  end
end

# Usage example
db_metrics = DatabasePoolMetrics.new

# Simulate database operations
db_metrics.execute_query('SELECT', 'users')
db_metrics.execute_query('INSERT', 'orders')
db_metrics.execute_query('UPDATE', 'products')

Business Metrics

ruby
# Business-specific metrics collection
class BusinessMetrics

  def initialize
    @meter = OpenTelemetry.meter_provider.meter('business', '1.0.0')

    @user_registrations = @meter.create_counter(
      'user_registrations_total',
      description: 'Total user registrations',
      unit: '1'
    )

    @order_value = @meter.create_histogram(
      'order_value_usd',
      description: 'Order value in USD',
      unit: 'USD'
    )

    # Simulated business data
    @subscription_counts = {
      'basic' => 1250,
      'premium' => 340,
      'enterprise' => 45
    }

    @revenue_data = {
      'monthly_recurring' => 50000,
      'one_time' => 15000,
      'total' => 65000
    }

    @subscription_gauge = @meter.create_observable_up_down_counter(
      'subscriptions_active',
      description: 'Active subscriptions by plan',
      unit: '1',
      callback: method(:collect_subscription_metrics)
    )

    @revenue_gauge = @meter.create_observable_gauge(
      'revenue_metrics',
      description: 'Revenue metrics',
      unit: 'USD',
      callback: method(:collect_revenue_metrics)
    )

    @mutex = Mutex.new
  end

  # Record a new user registration
  def record_user_registration(source, plan)
    @user_registrations.add(1, attributes: {
      'source' => source,
      'plan' => plan,
      'hour' => Time.now.hour.to_s  # Hour bucket for analysis
    })

    # Update subscription count (default to 0 for plans not seen before)
    @mutex.synchronize do
      @subscription_counts[plan] = @subscription_counts.fetch(plan, 0) + 1
    end
  end

  # Record an order
  def record_order(value, currency, category)
    # Convert to USD for consistent reporting
    usd_value = convert_to_usd(value, currency)

    @order_value.record(usd_value, attributes: {
      'category' => category,
      'currency' => currency,
      'value_range' => get_value_range(usd_value)
    })

    # Update revenue
    @mutex.synchronize do
      @revenue_data['one_time'] += usd_value
      @revenue_data['total'] += usd_value
    end
  end

  private

  def collect_subscription_metrics
    measurements = []

    @mutex.synchronize do
      @subscription_counts.each do |plan, count|
        measurements << {
          value: count,
          attributes: { 'plan' => plan }
        }
      end
    end

    measurements
  end

  def collect_revenue_metrics
    measurements = []

    @mutex.synchronize do
      @revenue_data.each do |revenue_type, amount|
        measurements << {
          value: amount,
          attributes: { 'type' => revenue_type }
        }
      end
    end

    measurements
  end

  # Convert currency to USD (simplified)
  def convert_to_usd(value, currency)
    rates = { 'USD' => 1.0, 'EUR' => 1.1, 'GBP' => 1.3, 'JPY' => 0.007 }
    value * rates.fetch(currency, 1.0)
  end

  # Categorize order value
  def get_value_range(value)
    case value
    when 0...10
      'small'
    when 10...100
      'medium'
    when 100...1000
      'large'
    else
      'premium'
    end
  end
end

# Usage example
business_metrics = BusinessMetrics.new

# Record business events
business_metrics.record_user_registration('google_ads', 'premium')
business_metrics.record_order(99.99, 'USD', 'electronics')
business_metrics.record_order(49.99, 'EUR', 'books')
business_metrics.record_order(199.99, 'USD', 'clothing')

Configuration and Performance

Metric Reader Configuration

ruby
require 'opentelemetry/sdk'
require 'opentelemetry-metrics-sdk'
require 'opentelemetry-exporter-otlp-metrics'

# Configure metric reader with custom intervals
exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new(
  endpoint: 'https://api.uptrace.dev/v1/metrics',
  headers: { 'uptrace-dsn' => ENV['UPTRACE_DSN'] },
  compression: 'gzip',
  timeout: 30
)

# Create reader with custom export interval
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: exporter,
  export_interval_millis: 60000,  # Export every 60 seconds
  export_timeout_millis: 30000    # 30 second timeout
)

# Create meter provider with reader
provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
  metric_readers: [reader]
)
OpenTelemetry.meter_provider = provider

# Multiple readers for different destinations
console_reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: OpenTelemetry::SDK::Metrics::Export::ConsoleMetricExporter.new,
  export_interval_millis: 10000  # More frequent for debugging
)

provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
  metric_readers: [reader, console_reader]
)
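
If your process exits shortly after recording measurements, a periodic reader may not have exported them yet. Here is a minimal sketch of shutting the configured provider down at process exit; the same shutdown call appears in the test teardown later in this guide.

ruby
# Stop metric readers and export any remaining measurements on exit.
# Assumes the meter provider configured above has been set globally.
at_exit do
  OpenTelemetry.meter_provider.shutdown
end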

Memory Management

For long-running Ruby applications, consider memory usage:

ruby
class MemoryAwareMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('memory_aware', '1.0.0')
    @memory_gauge = @meter.create_observable_gauge(
      'ruby_memory_usage',
      description: 'Ruby memory usage metrics',
      unit: 'By',
      callback: method(:collect_memory_metrics)
    )

    # Periodic cleanup
    start_cleanup_thread
  end

  private

  def collect_memory_metrics
    measurements = []

    # Ruby VM memory stats
    if GC.respond_to?(:stat)
      stat = GC.stat
      measurements << {
        value: stat[:heap_allocated_pages] * 16384,  # Approximate heap size
        attributes: { 'type' => 'heap_allocated' }
      }
      measurements << {
        value: stat[:heap_free_pages] * 16384,
        attributes: { 'type' => 'heap_free' }
      }
    end

    # Process memory (if available)
    begin
      if File.exist?("/proc/#{Process.pid}/status")
        status = File.read("/proc/#{Process.pid}/status")
        if match = status.match(/VmRSS:\s+(\d+)\s+kB/)
          rss_kb = match[1].to_i
          measurements << {
            value: rss_kb * 1024,
            attributes: { 'type' => 'rss' }
          }
        end
      end
    rescue StandardError
      # Ignore if not available
    end

    measurements
  end

  def start_cleanup_thread
    Thread.new do
      loop do
        sleep(300)  # Every 5 minutes

        # Force garbage collection if memory usage is high
        stat = GC.respond_to?(:stat) ? GC.stat : {}
        heap_pages = stat[:heap_allocated_pages] || 0

        if heap_pages > 10000  # Threshold
          GC.start
          puts "Performed GC cleanup: heap pages = #{heap_pages}"
        end
      end
    end
  end
end

# Usage
memory_metrics = MemoryAwareMetrics.new

Attribute Optimization

Optimize attribute usage to prevent cardinality explosion:

ruby
class OptimizedMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('optimized', '1.0.0')
    @http_requests = @meter.create_counter('http_requests_total')
  end

  # Record an HTTP request with optimized attributes
  def record_request(method, status_code, endpoint)
    # Use status classes instead of exact codes
    status_class = "#{status_code / 100}xx"

    # Categorize endpoints to reduce cardinality
    endpoint_category = categorize_endpoint(endpoint)

    @http_requests.add(1, attributes: {
      'method' => method,                    # ~10 possible values
      'status_class' => status_class,        # 5 possible values (2xx, 3xx, 4xx, 5xx)
      'endpoint_category' => endpoint_category # ~5 categories
    })
    # Total cardinality: 10 × 5 × 5 = 250 series
  end

  private

  # Categorize endpoints to reduce cardinality
  def categorize_endpoint(endpoint)
    case endpoint
    when %r{^/api/}
      'api'
    when %r{^/static/}
      'static'
    when '/health'
      'health'
    when %r{^/admin/}
      'admin'
    else
      'other'
    end
  end
end

# Avoid: High cardinality example
class BadMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('bad_example', '1.0.0')
    @http_requests = @meter.create_counter('http_requests_total')
  end

  # Example of what NOT to do
  def bad_example(method, status_code, full_url, user_id, timestamp)
    # DON'T DO THIS - creates millions of metric series
    @http_requests.add(1, attributes: {
      'method' => method,
      'status_code' => status_code.to_s,    # 50+ possible values
      'full_url' => full_url,               # Thousands of unique URLs
      'user_id' => user_id,                 # Thousands of users
      'timestamp' => timestamp.to_s         # Infinite unique values
    })
    # This could create millions of unique metric series!
  end
end

Environment Variables

Configure metrics behavior using environment variables:

bash
# Metric export settings
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=https://api.uptrace.dev/v1/metrics
export OTEL_EXPORTER_OTLP_METRICS_HEADERS="uptrace-dsn=YOUR_DSN"

# Collection interval (milliseconds)
export OTEL_METRIC_EXPORT_INTERVAL=60000

# Resource attributes
export OTEL_RESOURCE_ATTRIBUTES="service.name=my-service,service.version=1.0.0"

Use environment variables in your Ruby application:

ruby
require 'opentelemetry/sdk'
require 'opentelemetry-metrics-sdk'
require 'opentelemetry-exporter-otlp-metrics'

# Set up metrics using environment variables
def setup_metrics
  # Get configuration from environment
  endpoint = ENV['OTEL_EXPORTER_OTLP_METRICS_ENDPOINT']
  headers_str = ENV['OTEL_EXPORTER_OTLP_METRICS_HEADERS'] || ''
  export_interval = ENV['OTEL_METRIC_EXPORT_INTERVAL']&.to_i || 60000

  # Parse headers
  headers = {}
  headers_str.split(',').each do |header|
    if header.include?('=')
      key, value = header.split('=', 2)
      headers[key.strip] = value.strip
    end
  end

  # Create exporter
  exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new(
    endpoint: endpoint,
    headers: headers
  )

  # Create reader
  reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
    exporter: exporter,
    export_interval_millis: export_interval
  )

  # Create and set meter provider
  provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
    metric_readers: [reader]
  )
  OpenTelemetry.meter_provider = provider

  provider
end

# Usage
if __FILE__ == $0
  setup_metrics
  meter = OpenTelemetry.meter_provider.meter('my_app', '1.0.0')

  # Your metrics code here
end

Best Practices

Instrument Naming

Follow OpenTelemetry naming conventions:

ruby
# Good: Descriptive, hierarchical names
meter.create_counter('http.requests.total')
meter.create_histogram('http.request.duration')
meter.create_observable_gauge('system.memory.usage')

# Avoid: Generic or unclear names
meter.create_counter('requests')
meter.create_histogram('time')
meter.create_observable_gauge('memory')

Unit Specification

Always specify appropriate units:

ruby
meter.create_histogram('request.duration', unit: 's')          # seconds
meter.create_observable_gauge('memory.usage', unit: 'By')      # bytes
meter.create_counter('requests.total', unit: '1')              # dimensionless
meter.create_histogram('file.size', unit: 'By')                # bytes
meter.create_observable_gauge('temperature', unit: 'Cel')      # Celsius

Error Handling

Handle metric recording errors gracefully:

ruby
require 'logger'

class SafeMetrics
  def initialize
    @logger = Logger.new(STDOUT)
    @meter = OpenTelemetry.meter_provider.meter('safe_metrics', '1.0.0')
    @counter = @meter.create_counter('safe_requests_total')
  end

  # Safely record a metric with error handling
  def safe_record_metric(value, attributes)
    begin
      @counter.add(value, attributes: attributes)
    rescue StandardError => e
      # Log the error but don't let metrics break your application
      @logger.error("Failed to record metric: #{e.message}")
    end
  end
end

# Usage
safe_metrics = SafeMetrics.new
safe_metrics.safe_record_metric(1, { 'status' => 'success' })

Testing Metrics

Create helper methods for testing:

ruby
require 'minitest/autorun'
require 'opentelemetry-metrics-sdk'

class MetricsTestCase < Minitest::Test
  # Set up the test environment
  def setup
    @reader = OpenTelemetry::SDK::Metrics::Export::InMemoryMetricReader.new
    @provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
      metric_readers: [@reader]
    )
    @meter = @provider.meter('test_meter', '1.0.0')
  end

  # Get the collected metrics
  def get_metrics
    @reader.pull
  end

  # Test counter functionality
  def test_counter
    counter = @meter.create_counter('test_counter')
    counter.add(1, attributes: { 'key' => 'value' })
    counter.add(2, attributes: { 'key' => 'value' })

    metrics = get_metrics
    # Assert metrics were recorded correctly
    refute_empty(metrics)
  end

  # Clean up after the test
  def teardown
    @provider.shutdown
  end
end

Performance Monitoring

Monitor the performance impact of metrics collection:

ruby
class PerformanceAwareMetrics
  def initialize
    @meter = OpenTelemetry.meter_provider.meter('performance_aware', '1.0.0')
    @business_counter = @meter.create_counter('business_operations_total')
    @metrics_duration = @meter.create_histogram('metrics_collection_duration_seconds')
  end

  # Record a business metric with performance monitoring
  def record_business_metric(operation_type)
    start_time = Time.now

    begin
      @business_counter.add(1, attributes: { 'type' => operation_type })
    ensure
      duration = Time.now - start_time

      # Only log if the metric operation takes too long
      if duration > 0.001  # 1ms threshold
        puts "Metric operation took #{(duration * 1000).round(2)}ms"
      end

      @metrics_duration.record(duration, attributes: { 'operation' => 'counter_add' })
    end
  end
end

# Usage
perf_metrics = PerformanceAwareMetrics.new
perf_metrics.record_business_metric('important_operation')

OpenTelemetry APM

Uptrace is a DataDog alternative that supports distributed tracing, metrics, and logs. You can use it to monitor applications and troubleshoot issues.

Uptrace comes with an intuitive query builder, rich dashboards, alerting rules with notifications, and integrations for most languages and frameworks.

Uptrace can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.

In just a few minutes, you can try Uptrace by visiting the cloud demo (no login required) or running it locally with Docker. The source code is available on GitHub.

What's Next?

Now that you understand the OpenTelemetry Ruby Metrics API, explore these related topics: