OpenTelemetry Ruby Metrics API
This document explains how to use the OpenTelemetry Ruby Metrics API to measure application performance. To learn how to install and configure the OpenTelemetry Ruby SDK, see Getting started with OpenTelemetry Ruby.
If you are not familiar with metrics terminology such as time series or additive, synchronous, and asynchronous instruments, read the introduction to OpenTelemetry Metrics first.
The OpenTelemetry Ruby Metrics API is currently experimental. The API may change in future versions, but the core concepts described here remain stable.
Prerequisites
Before using the Metrics API, make sure the required gems are installed:
gem install opentelemetry-api opentelemetry-sdk
gem install opentelemetry-metrics-api opentelemetry-metrics-sdk
Or add them to your Gemfile:
gem 'opentelemetry-api'
gem 'opentelemetry-sdk'
gem 'opentelemetry-metrics-api'
gem 'opentelemetry-metrics-sdk'
Getting Started
To get started with metrics, you need to create a meter:
require 'opentelemetry/metrics'
meter = OpenTelemetry.meter_provider.meter('my_app_or_gem', '1.0.0')
Using the meter, you can create instruments to measure performance. The simplest Counter instrument looks like this:
counter = meter.create_counter(
'requests_total',
description: 'Total number of requests processed',
unit: '1'
)
1000.times do |i|
counter.add(1, attributes: { 'status' => 'success', 'method' => 'GET' })
if i % 10 == 0
# Pause so the periodic exporter has a chance to run during the demo
sleep(0.1)
end
end
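The counter above only produces output once the SDK has a metric reader and exporter installed. Below is a minimal sketch that prints metrics to the console; it reuses the reader, exporter, and provider classes from the Configuration and Performance section later in this guide, so verify the class names against the SDK version you have installed:
require 'opentelemetry/sdk'
require 'opentelemetry-metrics-sdk'

# Sketch: send metrics to the console through a periodic reader
# (the reader, exporter, and provider classes are the ones used in the
# Configuration and Performance section later in this guide)
console_reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
  exporter: OpenTelemetry::SDK::Metrics::Export::ConsoleMetricExporter.new,
  export_interval_millis: 5_000 # export every 5 seconds while experimenting
)

OpenTelemetry.meter_provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
  metric_readers: [console_reader]
)

# Create the meter and counter shown above only after the provider is installed,
# otherwise measurements go to the no-op API implementation.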
Metric Instruments
OpenTelemetry provides several types of instruments to capture different kinds of measurements. Each instrument serves a specific purpose and has distinct characteristics.
Counter
Counter is a synchronous instrument that measures additive non-decreasing values, representing cumulative totals like the number of requests, errors, or completed tasks.
require 'opentelemetry/metrics'
meter = OpenTelemetry.meter_provider.meter('my_app_or_gem', '1.0.0')
# Use constants so the instruments are visible inside the method definitions below
HTTP_REQUESTS_COUNTER = meter.create_counter(
'http_requests_total',
description: 'Total number of HTTP requests',
unit: '1'
)
ERROR_COUNTER = meter.create_counter(
'http_errors_total',
description: 'Total number of HTTP errors',
unit: '1'
)
def handle_request(method, endpoint, status_code)
# Record successful request
HTTP_REQUESTS_COUNTER.add(1, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => status_code.to_s
})
# Record error if applicable
if status_code >= 400
error_type = status_code < 500 ? 'client_error' : 'server_error'
ERROR_COUNTER.add(1, attributes: {
'method' => method,
'endpoint' => endpoint,
'error_type' => error_type
})
end
end
# Example usage
handle_request('GET', '/api/users', 200)
handle_request('POST', '/api/users', 201)
handle_request('GET', '/api/users/999', 404)
UpDownCounter
UpDownCounter is a synchronous instrument that measures additive values that can both increase and decrease, such as the number of active connections or items in a queue.
active_connections = meter.create_up_down_counter(
'database_connections_active',
description: 'Number of active database connections',
unit: '1'
)
queue_size = meter.create_up_down_counter(
'task_queue_size',
description: 'Number of items in the task queue',
unit: '1'
)
class ConnectionPool
def initialize
@meter = OpenTelemetry.meter_provider.meter('connection_pool', '1.0.0')
@active_connections = @meter.create_up_down_counter(
'connections_active',
description: 'Active database connections',
unit: '1'
)
end
def acquire_connection(database_name)
# Connection established
@active_connections.add(1, attributes: {
'database' => database_name,
'pool' => 'main'
})
begin
yield
ensure
# Connection released
@active_connections.add(-1, attributes: {
'database' => database_name,
'pool' => 'main'
})
end
end
end
class TaskQueue
def initialize
@meter = OpenTelemetry.meter_provider.meter('task_queue', '1.0.0')
@queue_size = @meter.create_up_down_counter(
'queue_size',
description: 'Number of items in queue',
unit: '1'
)
end
def enqueue(task, priority: 'normal')
# Add task to queue
perform_enqueue(task)
@queue_size.add(1, attributes: {
'queue' => 'tasks',
'priority' => priority
})
end
def dequeue(priority: 'normal')
task = perform_dequeue
if task
@queue_size.add(-1, attributes: {
'queue' => 'tasks',
'priority' => priority
})
end
task
end
private
def perform_enqueue(task)
# Queue implementation
end
def perform_dequeue
# Queue implementation
end
end
# Usage
pool = ConnectionPool.new
pool.acquire_connection('users_db') do
# Database operations
end
queue = TaskQueue.new
queue.enqueue('send_welcome_email', priority: 'high')
task = queue.dequeue(priority: 'high')
Histogram
Histogram is a synchronous instrument that measures the statistical distribution of values, such as request latencies or response sizes, grouping them into buckets.
request_duration = meter.create_histogram(
'http_request_duration_seconds',
description: 'HTTP request duration in seconds',
unit: 's'
)
response_size = meter.create_histogram(
'http_response_size_bytes',
description: 'HTTP response size in bytes',
unit: 'By'
)
class HttpHandler
def initialize
@meter = OpenTelemetry.meter_provider.meter('http_handler', '1.0.0')
@request_duration = @meter.create_histogram(
'request_duration_seconds',
description: 'Request duration',
unit: 's'
)
@response_size = @meter.create_histogram(
'response_size_bytes',
description: 'Response size',
unit: 'By'
)
end
def handle_request(method, endpoint)
start_time = Time.now
begin
# Simulate request processing
processing_time = rand(0.01..0.5)
sleep(processing_time)
# Simulate response
response_data = 'x' * rand(100..5000)
status_code = 200
# Record metrics
duration = Time.now - start_time
@request_duration.record(duration, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => status_code.to_s
})
@response_size.record(response_data.length, attributes: {
'method' => method,
'endpoint' => endpoint,
'content_type' => 'application/json'
})
{ status: status_code, body: response_data }
rescue StandardError => e
duration = Time.now - start_time
@request_duration.record(duration, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => '500'
})
raise
end
end
end
# Usage
handler = HttpHandler.new
response = handler.handle_request('GET', '/api/users')
response = handler.handle_request('POST', '/api/users')
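The handler above times requests with Time.now, which can move backwards if the system clock is adjusted. A small sketch of the same recording pattern using Ruby's monotonic clock (a general Ruby technique, independent of the Metrics API), applied to the request_duration histogram created earlier:
# Use the monotonic clock for elapsed time so clock adjustments do not skew the histogram
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
# ... handle the request ...
duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

request_duration.record(duration, attributes: {
  'method' => 'GET',
  'endpoint' => '/api/users',
  'status_code' => '200'
})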
Observable Gauge
Observable Gauge is an asynchronous instrument that measures non-additive values at a point in time, such as CPU usage, memory consumption, or temperature readings.
class SystemMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('system_metrics', '1.0.0')
@system_gauge = @meter.create_observable_gauge(
'system_resource_usage',
description: 'System resource utilization',
unit: '1',
callback: method(:collect_system_metrics)
)
@app_gauge = @meter.create_observable_gauge(
'application_metrics',
description: 'Application-specific metrics',
unit: '1',
callback: method(:collect_app_metrics)
)
end
private
def collect_system_metrics
measurements = []
begin
# CPU load average (requires the 'sys-cpu' gem or similar)
if defined?(Sys::CPU)
cpu_usage = Sys::CPU.load_avg.first
measurements << {
value: cpu_usage,
attributes: { 'resource' => 'cpu', 'unit' => 'load_average' }
}
end
# Memory usage (requires 'sys-proctable' gem or similar)
if defined?(Sys::ProcTable)
process = Sys::ProcTable.ps(pid: Process.pid).first
if process
memory_mb = process.rss / 1024 / 1024
measurements << {
value: memory_mb,
attributes: { 'resource' => 'memory', 'unit' => 'megabytes' }
}
end
end
# Simple Ruby process info
measurements << {
value: Process.pid,
attributes: { 'resource' => 'process', 'unit' => 'pid' }
}
rescue StandardError => e
# Log error but don't fail metric collection
puts "Error collecting system metrics: #{e.message}"
end
measurements
end
def collect_app_metrics
measurements = []
begin
# Current timestamp
measurements << {
value: Time.now.to_f,
attributes: { 'metric' => 'last_update', 'unit' => 'timestamp' }
}
# Active threads
active_threads = Thread.list.count { |t| t.status == 'run' }
measurements << {
value: active_threads,
attributes: { 'metric' => 'active_threads', 'unit' => 'count' }
}
# Object count (Ruby-specific)
measurements << {
value: ObjectSpace.count_objects[:TOTAL],
attributes: { 'metric' => 'objects_total', 'unit' => 'count' }
}
rescue StandardError => e
puts "Error collecting app metrics: #{e.message}"
end
measurements
end
end
# Usage
system_metrics = SystemMetrics.new
# Metrics will be collected automatically by the SDK
Observable Counter
Observable Counter is an asynchronous instrument that measures monotonically increasing values, such as total bytes read or CPU time consumed.
class ProcessMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('process_metrics', '1.0.0')
@process_counter = @meter.create_observable_counter(
'process_resource_usage',
description: 'Process resource usage counters',
unit: '1',
callback: method(:collect_process_metrics)
)
@io_counter = @meter.create_observable_counter(
'process_io_usage',
description: 'Process I/O usage counters',
unit: '1',
callback: method(:collect_io_metrics)
)
end
private
def collect_process_metrics
measurements = []
begin
# Process times
times = Process.times
measurements << {
value: times.utime,
attributes: { 'cpu_type' => 'user', 'unit' => 'seconds' }
}
measurements << {
value: times.stime,
attributes: { 'cpu_type' => 'system', 'unit' => 'seconds' }
}
# Ruby VM stats
if GC.respond_to?(:stat)
gc_stats = GC.stat
measurements << {
value: gc_stats[:count] || 0,
attributes: { 'resource' => 'gc_runs', 'unit' => 'count' }
}
measurements << {
value: gc_stats[:total_allocated_objects] || 0,
attributes: { 'resource' => 'allocated_objects', 'unit' => 'count' }
}
end
rescue StandardError => e
puts "Error collecting process metrics: #{e.message}"
end
measurements
end
def collect_io_metrics
measurements = []
begin
# File descriptors (Unix-like systems only)
if RUBY_PLATFORM !~ /mswin|mingw|cygwin/
begin
fd_count = Dir.glob("/proc/#{Process.pid}/fd/*").length
measurements << {
value: fd_count,
attributes: { 'resource' => 'file_descriptors', 'unit' => 'count' }
}
rescue Errno::ENOENT, Errno::EACCES
# /proc not available or accessible
end
end
# Ruby-specific: loaded features count
measurements << {
value: $LOADED_FEATURES.length,
attributes: { 'resource' => 'loaded_features', 'unit' => 'count' }
}
rescue StandardError => e
puts "Error collecting I/O metrics: #{e.message}"
end
measurements
end
end
# Usage
process_metrics = ProcessMetrics.new
# Metrics will be collected automatically
Observable UpDownCounter
Observable UpDownCounter is an asynchronous instrument that reports additive values that can increase or decrease, observed at collection time.
class QueueMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('queue_metrics', '1.0.0')
# Simulated queue state
@message_queues = {
'email' => { size: 0, workers: 0 },
'sms' => { size: 0, workers: 0 },
'push' => { size: 0, workers: 0 }
}
@queue_gauge = @meter.create_observable_up_down_counter(
'message_queue_status',
description: 'Message queue status metrics',
unit: '1',
callback: method(:collect_queue_metrics)
)
@connection_gauge = @meter.create_observable_up_down_counter(
'connection_pool_status',
description: 'Connection pool status',
unit: '1',
callback: method(:collect_connection_metrics)
)
end
def update_queue_size(queue_name, change)
@message_queues[queue_name][:size] += change if @message_queues[queue_name]
end
def update_workers(queue_name, change)
@message_queues[queue_name][:workers] += change if @message_queues[queue_name]
end
private
def collect_queue_metrics
measurements = []
@message_queues.each do |queue_name, stats|
# Queue size (can go up and down)
measurements << {
value: stats[:size],
attributes: { 'queue' => queue_name, 'metric' => 'size' }
}
# Active workers (can go up and down)
measurements << {
value: stats[:workers],
attributes: { 'queue' => queue_name, 'metric' => 'workers' }
}
end
measurements
end
def collect_connection_metrics
measurements = []
# Simulate connection pool status
pools = {
'database' => { 'active' => 5, 'idle' => 3, 'max' => 10 },
'redis' => { 'active' => 2, 'idle' => 8, 'max' => 10 },
'elasticsearch' => { 'active' => 1, 'idle' => 4, 'max' => 5 }
}
pools.each do |pool_name, stats|
stats.each do |state, count|
measurements << {
value: count,
attributes: { 'pool' => pool_name, 'state' => state }
}
end
end
measurements
end
end
# Usage
queue_metrics = QueueMetrics.new
# Simulate queue operations
queue_metrics.update_queue_size('email', 5)
queue_metrics.update_workers('email', 2)
# Metrics will be collected automatically by the SDK
Working with Attributes
Attributes provide contextual information that makes metrics more useful for analysis and filtering.
Adding Attributes to Measurements
# Create various counters and histograms
meter = OpenTelemetry.meter_provider.meter('api_service', '1.0.0')
api_requests = meter.create_counter('api_requests_total', description: 'Total API requests')
request_duration = meter.create_histogram('request_duration_seconds', description: 'Request duration')
class ApiHandler
def initialize
@meter = OpenTelemetry.meter_provider.meter('api_handler', '1.0.0')
@api_requests = @meter.create_counter('api_requests_total')
@request_duration = @meter.create_histogram('request_duration_seconds')
end
def handle_request(method, endpoint, user_type, region)
start_time = Time.now
begin
# Simulate request processing
processing_time = rand(0.01..0.3)
sleep(processing_time)
# Record successful request with detailed attributes
@api_requests.add(1, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => '200',
'user_type' => user_type,
'region' => region,
'cache_hit' => 'false'
})
# Record duration
duration = Time.now - start_time
@request_duration.record(duration, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => '200'
})
{ status: 200, message: 'Success' }
rescue StandardError => e
# Record error
@api_requests.add(1, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => '500',
'user_type' => user_type,
'region' => region,
'error_type' => e.class.name
})
duration = Time.now - start_time
@request_duration.record(duration, attributes: {
'method' => method,
'endpoint' => endpoint,
'status_code' => '500'
})
raise
end
end
end
# Example usage
handler = ApiHandler.new
handler.handle_request('GET', '/api/users', 'premium', 'us-east-1')
handler.handle_request('POST', '/api/orders', 'free', 'eu-west-1')
Attribute Best Practices
Use meaningful attributes that provide valuable differentiation without creating excessive cardinality:
# Good: Low cardinality attributes
HTTP_REQUESTS = meter.create_counter('http_requests_total') # constant so it is visible inside the methods below
def record_request(method, status_code, endpoint_category)
# Record request with low-cardinality attributes
HTTP_REQUESTS.add(1, attributes: {
'method' => method, # Limited values: GET, POST, PUT, DELETE
'status_class' => "#{status_code / 100}xx", # Grouped: 2xx, 3xx, 4xx, 5xx
'endpoint_category' => endpoint_category # Grouped: api, static, health
})
end
def categorize_endpoint(endpoint)
"""Categorize endpoints to reduce cardinality"""
case endpoint
when %r{^/api/}
'api'
when %r{^/static/}
'static'
when '/health'
'health'
when %r{^/admin/}
'admin'
else
'other'
end
end
# Example usage
record_request('GET', 200, 'api')
record_request('POST', 201, 'api')
# Avoid: High cardinality attributes
def bad_example(method, status_code, full_url, user_id, timestamp)
"""Example of what NOT to do - high cardinality attributes"""
# DON'T DO THIS - creates too many unique metric series
http_requests.add(1, attributes: {
'method' => method,
'status_code' => status_code.to_s, # 50+ possible values
'full_url' => full_url, # Thousands of unique URLs
'user_id' => user_id, # Thousands of users
'timestamp' => timestamp.to_s # Infinite unique values
})
# This could create millions of unique metric series!
end
Recording Measurements
Synchronous Measurements
Synchronous instruments record measurements inline with your application logic:
class OrderProcessor
def initialize
@meter = OpenTelemetry.meter_provider.meter('order_processor', '1.0.0')
@operation_counter = @meter.create_counter('operations_total')
@operation_duration = @meter.create_histogram('operation_duration_seconds')
@error_counter = @meter.create_counter('operation_errors_total')
end
def process_order(order_type, user_id)
start_time = Time.now
begin
# Increment operation counter
@operation_counter.add(1, attributes: {
'operation' => order_type,
'status' => 'started'
})
# Simulate operation
processing_time = rand(0.1..1.0)
sleep(processing_time)
# Simulate potential failure
if rand < 0.1 # 10% failure rate
raise StandardError, 'Order processing failed'
end
# Record successful completion
@operation_counter.add(1, attributes: {
'operation' => order_type,
'status' => 'completed'
})
"Order #{order_type} processed successfully"
rescue StandardError => e
# Record error
@error_counter.add(1, attributes: {
'operation' => order_type,
'error_type' => e.class.name,
'error_message' => e.message[0..50] # Truncate to avoid high cardinality
})
@operation_counter.add(1, attributes: {
'operation' => order_type,
'status' => 'failed'
})
raise
ensure
# Always record duration
duration = Time.now - start_time
@operation_duration.record(duration, attributes: {
'operation' => order_type
})
end
end
end
# Example usage
processor = OrderProcessor.new
begin
result = processor.process_order('premium_order', 'user123')
puts result
rescue StandardError => e
puts "Operation failed: #{e.message}"
end
Asynchronous Measurements
Asynchronous instruments use callbacks that are invoked during metric collection:
class AsyncMetricsCollector
def initialize
@meter = OpenTelemetry.meter_provider.meter('async_collector', '1.0.0')
# Global state for demonstration
@system_stats = {
'cpu_usage' => 0.0,
'memory_usage' => 0.0,
'disk_usage' => 0.0,
'network_connections' => 0
}
@queue_stats = {
'email' => { 'size' => 0, 'processed' => 0 },
'sms' => { 'size' => 0, 'processed' => 0 },
'push' => { 'size' => 0, 'processed' => 0 }
}
# Create observable instruments
@system_gauge = @meter.create_observable_gauge(
'system_metrics',
description: 'System performance metrics',
unit: '1',
callback: method(:collect_system_metrics)
)
@queue_counter = @meter.create_observable_up_down_counter(
'queue_metrics',
description: 'Queue status metrics',
unit: '1',
callback: method(:collect_queue_metrics)
)
end
def update_system_stats
# Simulate system stats updates (in real app, this would call actual system APIs)
@system_stats['cpu_usage'] = rand(0.0..100.0)
@system_stats['memory_usage'] = rand(20.0..80.0)
@system_stats['disk_usage'] = rand(10.0..90.0)
@system_stats['network_connections'] = rand(10..100)
end
def simulate_queue_activity
@queue_stats.each do |queue_name, stats|
# Add items to queue
new_items = rand(0..5)
stats['size'] += new_items
# Process items from queue
processed = [stats['size'], rand(0..3)].min
stats['size'] -= processed
stats['processed'] += processed
end
end
private
def collect_system_metrics
measurements = []
update_system_stats
@system_stats.each do |metric_name, value|
measurements << {
value: value,
attributes: { 'metric' => metric_name }
}
end
measurements
end
def collect_queue_metrics
measurements = []
simulate_queue_activity
@queue_stats.each do |queue_name, stats|
# Queue size (up/down counter)
measurements << {
value: stats['size'],
attributes: { 'queue' => queue_name, 'metric' => 'size' }
}
# Total processed (counter)
measurements << {
value: stats['processed'],
attributes: { 'queue' => queue_name, 'metric' => 'processed' }
}
end
measurements
end
end
# Usage
collector = AsyncMetricsCollector.new
# In a real application, you might run this in a background thread
# Thread.new do
# loop do
# sleep(30) # Metrics will be collected automatically by the SDK
# end
# end
Practical Examples
HTTP Server Metrics
class HttpServerMetrics
"""Comprehensive HTTP server metrics collection"""
def initialize
@meter = OpenTelemetry.meter_provider.meter('http_server', '1.0.0')
@request_counter = @meter.create_counter(
'http_requests_total',
description: 'Total HTTP requests',
unit: '1'
)
@request_duration = @meter.create_histogram(
'http_request_duration_seconds',
description: 'HTTP request duration',
unit: 's'
)
@active_requests = @meter.create_up_down_counter(
'http_requests_active',
description: 'Active HTTP requests',
unit: '1'
)
@response_size = @meter.create_histogram(
'http_response_size_bytes',
description: 'HTTP response size',
unit: 'By'
)
@error_counter = @meter.create_counter(
'http_errors_total',
description: 'Total HTTP errors',
unit: '1'
)
end
def record_request(method, route, status_code, duration, response_size)
"""Record metrics for an HTTP request"""
attributes = {
'method' => method,
'route' => route,
'status_code' => status_code.to_s
}
# Record request count
@request_counter.add(1, attributes: attributes)
# Record duration
@request_duration.record(duration, attributes: attributes)
# Record response size
@response_size.record(response_size, attributes: attributes)
# Record errors
if status_code >= 400
error_attributes = attributes.merge({
'error_type' => status_code < 500 ? 'client_error' : 'server_error'
})
@error_counter.add(1, attributes: error_attributes)
end
end
def start_request(method, route)
"""Track active request start"""
@active_requests.add(1, attributes: {
'method' => method,
'route' => route
})
end
def end_request(method, route)
"""Track active request end"""
@active_requests.add(-1, attributes: {
'method' => method,
'route' => route
})
end
end
# Rack middleware example
class MetricsMiddleware
def initialize(app)
@app = app
@metrics = HttpServerMetrics.new
end
def call(env)
request = Rack::Request.new(env)
method = request.request_method
route = extract_route(env)
start_time = Time.now
@metrics.start_request(method, route)
begin
status, headers, body = @app.call(env)
# Calculate response size
response_size = if body.respond_to?(:each)
# Some Rack bodies only implement #each, so accumulate the size manually
size = 0
body.each { |chunk| size += chunk.to_s.bytesize }
size
else
body.to_s.bytesize
end
# Record metrics
duration = Time.now - start_time
@metrics.record_request(method, route, status.to_i, duration, response_size)
[status, headers, body]
ensure
@metrics.end_request(method, route)
end
end
private
def extract_route(env)
# Extract route pattern, fallback to path
env['HTTP_X_ROUTE_PATTERN'] || env['PATH_INFO'] || '/'
end
end
# Usage with Sinatra
require 'sinatra'
use MetricsMiddleware
get '/users/:id' do
# Your application logic
{ user_id: params[:id] }.to_json
end
Database Connection Pool Metrics
class DatabasePoolMetrics
"""Database connection pool metrics"""
def initialize
@meter = OpenTelemetry.meter_provider.meter('database_pool', '1.0.0')
@query_counter = @meter.create_counter(
'db_queries_total',
description: 'Total database queries',
unit: '1'
)
@query_duration = @meter.create_histogram(
'db_query_duration_seconds',
description: 'Database query duration',
unit: 's'
)
# Simulated connection pool state
@pool_stats = {
'active' => 0,
'idle' => 10,
'max' => 20,
'total_created' => 10,
'total_closed' => 0
}
@connection_pool_gauge = @meter.create_observable_gauge(
'db_connection_pool_status',
description: 'Database connection pool status',
unit: '1',
callback: method(:collect_pool_metrics)
)
@mutex = Mutex.new
end
def execute_query(query_type, table)
"""Execute a database query with metrics"""
start_time = Time.now
# Simulate getting connection from pool
@mutex.synchronize do
if @pool_stats['idle'] > 0
@pool_stats['idle'] -= 1
@pool_stats['active'] += 1
else
# Would normally wait for connection or create new one
sleep(0.01) # Simulate wait
end
end
begin
# Simulate query execution
execution_time = rand(0.001..0.1)
sleep(execution_time)
# Record metrics
duration = Time.now - start_time
attributes = {
'query_type' => query_type,
'table' => table
}
@query_counter.add(1, attributes: attributes)
@query_duration.record(duration, attributes: attributes)
"Query #{query_type} on #{table} completed"
ensure
# Return connection to pool
@mutex.synchronize do
@pool_stats['active'] -= 1
@pool_stats['idle'] += 1
end
end
end
private
def collect_pool_metrics
measurements = []
@mutex.synchronize do
@pool_stats.each do |state, count|
measurements << {
value: count,
attributes: { 'state' => state }
}
end
end
measurements
end
end
# Usage example
db_metrics = DatabasePoolMetrics.new
# Simulate database operations
db_metrics.execute_query('SELECT', 'users')
db_metrics.execute_query('INSERT', 'orders')
db_metrics.execute_query('UPDATE', 'products')
Business Metrics
class BusinessMetrics
"""Business-specific metrics collection"""
def initialize
@meter = OpenTelemetry.meter_provider.meter('business', '1.0.0')
@user_registrations = @meter.create_counter(
'user_registrations_total',
description: 'Total user registrations',
unit: '1'
)
@order_value = @meter.create_histogram(
'order_value_usd',
description: 'Order value in USD',
unit: 'USD'
)
# Simulated business data
@subscription_counts = {
'basic' => 1250,
'premium' => 340,
'enterprise' => 45
}
@revenue_data = {
'monthly_recurring' => 50000,
'one_time' => 15000,
'total' => 65000
}
@subscription_gauge = @meter.create_observable_up_down_counter(
'subscriptions_active',
description: 'Active subscriptions by plan',
unit: '1',
callback: method(:collect_subscription_metrics)
)
@revenue_gauge = @meter.create_observable_gauge(
'revenue_metrics',
description: 'Revenue metrics',
unit: 'USD',
callback: method(:collect_revenue_metrics)
)
@mutex = Mutex.new
end
def record_user_registration(source, plan)
"""Record a new user registration"""
@user_registrations.add(1, attributes: {
'source' => source,
'plan' => plan,
'hour' => Time.now.hour.to_s # Hour bucket for analysis
})
# Update subscription count
@mutex.synchronize do
@subscription_counts[plan] += 1
end
end
def record_order(value, currency, category)
"""Record an order"""
# Convert to USD for consistent reporting
usd_value = convert_to_usd(value, currency)
@order_value.record(usd_value, attributes: {
'category' => category,
'currency' => currency,
'value_range' => get_value_range(usd_value)
})
# Update revenue
@mutex.synchronize do
@revenue_data['one_time'] += usd_value
@revenue_data['total'] += usd_value
end
end
private
def collect_subscription_metrics
measurements = []
@mutex.synchronize do
@subscription_counts.each do |plan, count|
measurements << {
value: count,
attributes: { 'plan' => plan }
}
end
end
measurements
end
def collect_revenue_metrics
measurements = []
@mutex.synchronize do
@revenue_data.each do |revenue_type, amount|
measurements << {
value: amount,
attributes: { 'type' => revenue_type }
}
end
end
measurements
end
def convert_to_usd(value, currency)
"""Convert currency to USD (simplified)"""
rates = { 'USD' => 1.0, 'EUR' => 1.1, 'GBP' => 1.3, 'JPY' => 0.007 }
value * rates.fetch(currency, 1.0)
end
def get_value_range(value)
"""Categorize order value"""
case value
when 0...10
'small'
when 10...100
'medium'
when 100...1000
'large'
else
'premium'
end
end
end
# Usage example
business_metrics = BusinessMetrics.new
# Record business events
business_metrics.record_user_registration('google_ads', 'premium')
business_metrics.record_order(99.99, 'USD', 'electronics')
business_metrics.record_order(49.99, 'EUR', 'books')
business_metrics.record_order(199.99, 'USD', 'clothing')
Configuration and Performance
Metric Reader Configuration
require 'opentelemetry/sdk'
require 'opentelemetry-metrics-sdk'
require 'opentelemetry-exporter-otlp-metrics'
# Configure metric reader with custom intervals
exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new(
endpoint: 'https://api.uptrace.dev/v1/metrics',
headers: { 'uptrace-dsn' => ENV['UPTRACE_DSN'] },
compression: 'gzip',
timeout: 30
)
# Create reader with custom export interval
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
exporter: exporter,
export_interval_millis: 60000, # Export every 60 seconds
export_timeout_millis: 30000 # 30 second timeout
)
# Create meter provider with reader
provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
metric_readers: [reader]
)
OpenTelemetry.meter_provider = provider
# Multiple readers for different destinations
console_reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
exporter: OpenTelemetry::SDK::Metrics::Export::ConsoleMetricExporter.new,
export_interval_millis: 10000 # More frequent for debugging
)
provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
metric_readers: [reader, console_reader]
)
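For short-lived processes such as Rake tasks or cron jobs, measurements recorded shortly before exit may never be exported by the periodic reader. A minimal sketch of flushing on exit, using the provider shutdown call that also appears in the testing example later in this guide:
# Flush pending metrics and stop the readers when the process exits
at_exit do
  OpenTelemetry.meter_provider.shutdown
end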
Memory Management
For long-running Ruby applications, consider memory usage:
class MemoryAwareMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('memory_aware', '1.0.0')
@memory_gauge = @meter.create_observable_gauge(
'ruby_memory_usage',
description: 'Ruby memory usage metrics',
unit: 'By',
callback: method(:collect_memory_metrics)
)
# Periodic cleanup
start_cleanup_thread
end
private
def collect_memory_metrics
measurements = []
# Ruby VM memory stats
if GC.respond_to?(:stat)
stat = GC.stat
measurements << {
value: stat[:heap_allocated_pages] * 16384, # Approximate heap size
attributes: { 'type' => 'heap_allocated' }
}
measurements << {
value: stat[:heap_free_pages] * 16384,
attributes: { 'type' => 'heap_free' }
}
end
# Process memory (if available)
begin
if File.exist?("/proc/#{Process.pid}/status")
status = File.read("/proc/#{Process.pid}/status")
if match = status.match(/VmRSS:\s+(\d+)\s+kB/)
rss_kb = match[1].to_i
measurements << {
value: rss_kb * 1024,
attributes: { 'type' => 'rss' }
}
end
end
rescue StandardError
# Ignore if not available
end
measurements
end
def start_cleanup_thread
Thread.new do
loop do
sleep(300) # Every 5 minutes
# Force garbage collection if memory usage is high
stat = GC.respond_to?(:stat) ? GC.stat : {}
heap_pages = stat[:heap_allocated_pages] || 0
if heap_pages > 10000 # Threshold
GC.start
puts "Performed GC cleanup: heap pages = #{heap_pages}"
end
end
end
end
end
# Usage
memory_metrics = MemoryAwareMetrics.new
Attribute Optimization
Optimize attribute usage to prevent cardinality explosion:
class OptimizedMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('optimized', '1.0.0')
@http_requests = @meter.create_counter('http_requests_total')
end
def record_request(method, status_code, endpoint)
"""Record HTTP request with optimized attributes"""
# Use status classes instead of exact codes
status_class = "#{status_code / 100}xx"
# Categorize endpoints to reduce cardinality
endpoint_category = categorize_endpoint(endpoint)
@http_requests.add(1, attributes: {
'method' => method, # ~10 possible values
'status_class' => status_class, # 5 possible values (2xx, 3xx, 4xx, 5xx)
'endpoint_category' => endpoint_category # ~5 categories
})
# Total cardinality: 10 × 5 × 5 = 250 series
end
private
def categorize_endpoint(endpoint)
"""Categorize endpoints to reduce cardinality"""
case endpoint
when %r{^/api/}
'api'
when %r{^/static/}
'static'
when '/health'
'health'
when %r{^/admin/}
'admin'
else
'other'
end
end
end
# Avoid: High cardinality example
class BadMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('bad_example', '1.0.0')
@http_requests = @meter.create_counter('http_requests_total')
end
def bad_example(method, status_code, full_url, user_id, timestamp)
"""Example of what NOT to do"""
# DON'T DO THIS - creates millions of metric series
@http_requests.add(1, attributes: {
'method' => method,
'status_code' => status_code.to_s, # 50+ possible values
'full_url' => full_url, # Thousands of unique URLs
'user_id' => user_id, # Thousands of users
'timestamp' => timestamp.to_s # Infinite unique values
})
# This could create millions of unique metric series!
end
end
Environment Variables
Configure metrics behavior using environment variables:
# Metric export settings
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=https://api.uptrace.dev/v1/metrics
export OTEL_EXPORTER_OTLP_METRICS_HEADERS="uptrace-dsn=YOUR_DSN"
# Collection interval (milliseconds)
export OTEL_METRIC_EXPORT_INTERVAL=60000
# Resource attributes
export OTEL_RESOURCE_ATTRIBUTES="service.name=my-service,service.version=1.0.0"
Use environment variables in your Ruby application:
require 'opentelemetry/sdk'
require 'opentelemetry-metrics-sdk'
require 'opentelemetry-exporter-otlp-metrics'
def setup_metrics
"""Setup metrics using environment variables"""
# Get configuration from environment
endpoint = ENV['OTEL_EXPORTER_OTLP_METRICS_ENDPOINT']
headers_str = ENV['OTEL_EXPORTER_OTLP_METRICS_HEADERS'] || ''
export_interval = ENV['OTEL_METRIC_EXPORT_INTERVAL']&.to_i || 60000
# Parse headers
headers = {}
headers_str.split(',').each do |header|
if header.include?('=')
key, value = header.split('=', 2)
headers[key.strip] = value.strip
end
end
# Create exporter
exporter = OpenTelemetry::Exporter::OTLP::Metrics::MetricsExporter.new(
endpoint: endpoint,
headers: headers
)
# Create reader
reader = OpenTelemetry::SDK::Metrics::Export::PeriodicMetricReader.new(
exporter: exporter,
export_interval_millis: export_interval
)
# Create and set meter provider
provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
metric_readers: [reader]
)
OpenTelemetry.meter_provider = provider
provider
end
# Usage
if __FILE__ == $0
setup_metrics
meter = OpenTelemetry.meter_provider.meter('my_app', '1.0.0')
# Your metrics code here
end
Best Practices
Instrument Naming
Follow OpenTelemetry naming conventions:
# Good: Descriptive, hierarchical names
meter.create_counter('http.requests.total')
meter.create_histogram('http.request.duration')
meter.create_observable_gauge('system.memory.usage')
# Avoid: Generic or unclear names
meter.create_counter('requests')
meter.create_histogram('time')
meter.create_observable_gauge('memory')
Unit Specification
Always specify appropriate units:
meter.create_histogram('request.duration', unit: 's') # seconds
meter.create_observable_gauge('memory.usage', unit: 'By') # bytes
meter.create_counter('requests.total', unit: '1') # dimensionless
meter.create_histogram('file.size', unit: 'By') # bytes
meter.create_observable_gauge('temperature', unit: 'Cel') # Celsius
Error Handling
Handle metric recording errors gracefully:
require 'logger'
class SafeMetrics
def initialize
@logger = Logger.new(STDOUT)
@meter = OpenTelemetry.meter_provider.meter('safe_metrics', '1.0.0')
@counter = @meter.create_counter('safe_requests_total')
end
def safe_record_metric(value, attributes)
"""Safely record metric with error handling"""
begin
@counter.add(value, attributes: attributes)
rescue StandardError => e
# Log the error but don't let metrics break your application
@logger.error("Failed to record metric: #{e.message}")
end
end
end
# Usage
safe_metrics = SafeMetrics.new
safe_metrics.safe_record_metric(1, { 'status' => 'success' })
Testing Metrics
Create helper methods for testing:
require 'minitest/autorun'
require 'opentelemetry-metrics-sdk'
class MetricsTestCase < Minitest::Test
def setup
"""Set up test environment"""
@reader = OpenTelemetry::SDK::Metrics::Export::InMemoryMetricReader.new
@provider = OpenTelemetry::SDK::Metrics::MeterProvider.new(
metric_readers: [@reader]
)
@meter = @provider.meter('test_meter', '1.0.0')
end
def get_metrics
"""Get collected metrics"""
@reader.pull
end
def test_counter
"""Test counter functionality"""
counter = @meter.create_counter('test_counter')
counter.add(1, attributes: { 'key' => 'value' })
counter.add(2, attributes: { 'key' => 'value' })
metrics = get_metrics
# Assert metrics were recorded correctly
refute_empty(metrics)
end
def teardown
"""Clean up after test"""
@provider.shutdown
end
end
Performance Monitoring
Monitor the performance impact of metrics collection:
class PerformanceAwareMetrics
def initialize
@meter = OpenTelemetry.meter_provider.meter('performance_aware', '1.0.0')
@business_counter = @meter.create_counter('business_operations_total')
@metrics_duration = @meter.create_histogram('metrics_collection_duration_seconds')
end
def record_business_metric(operation_type)
"""Record business metric with performance monitoring"""
start_time = Time.now
begin
@business_counter.add(1, attributes: { 'type' => operation_type })
ensure
duration = Time.now - start_time
# Only log if metric operation takes too long
if duration > 0.001 # 1ms threshold
puts "Metric operation took #{duration * 1000:.2f}ms"
end
@metrics_duration.record(duration, attributes: { 'operation' => 'counter_add' })
end
end
end
# Usage
perf_metrics = PerformanceAwareMetrics.new
perf_metrics.record_business_metric('important_operation')
OpenTelemetry APM
Uptrace is a DataDog alternative that supports distributed tracing, metrics, and logs. You can use it to monitor applications and troubleshoot issues.
Uptrace comes with an intuitive query builder, rich dashboards, alerting rules with notifications, and integrations for most languages and frameworks.
Uptrace can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.
In just a few minutes, you can try Uptrace by visiting the cloud demo (no login required) or running it locally with Docker. The source code is available on GitHub.
What's Next?
Now that you understand the OpenTelemetry Ruby Metrics API, explore these related topics: