Uptrace: Querying Metrics

TIP

To learn about metrics, see the OpenTelemetry Metrics documentation.

Uptrace provides a powerful query language that supports joining, grouping, and aggregating multiple metrics in a single query.

Timeseries

A timeseries is a metric with a unique set of attributes. For example, each host has a separate timeseries for the same metric name:

# metric_name{ attr1, attr2... }
system.filesystem.usage{host.name='host1'} # timeseries 1
system.filesystem.usage{host.name='host2'} # timeseries 2

You can also use attributes to create more detailed timeseries. For example, you can use the state attribute to report the number of free and used bytes in a filesystem:

system.filesystem.usage{host.name='host1', state='free'} # timeseries 1
system.filesystem.usage{host.name='host1', state='used'} # timeseries 2

system.filesystem.usage{host.name='host2', state='free'} # timeseries 3
system.filesystem.usage{host.name='host2', state='used'} # timeseries 4
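The idea that a timeseries is identified by its metric name plus its attribute set can be sketched in a few lines of Python. This is only an illustration, not Uptrace's storage model: the series_key helper and the sample values are made up for the example.

```python
def series_key(metric, attrs):
    """Return a hashable identity: metric name plus sorted attribute pairs."""
    return (metric, tuple(sorted(attrs.items())))

# Made-up data points: (metric name, attributes, value in bytes)
points = [
    ("system.filesystem.usage", {"host.name": "host1", "state": "free"}, 40),
    ("system.filesystem.usage", {"host.name": "host1", "state": "used"}, 60),
    ("system.filesystem.usage", {"host.name": "host2", "state": "free"}, 10),
    ("system.filesystem.usage", {"host.name": "host2", "state": "used"}, 90),
    # Same name and attributes as the first point (order does not matter),
    # so this value joins the same timeseries instead of creating a new one.
    ("system.filesystem.usage", {"state": "free", "host.name": "host1"}, 35),
]

series = {}
for metric, attrs, value in points:
    series.setdefault(series_key(metric, attrs), []).append(value)

num_series = len(series)
```

Five points collapse into four timeseries here, because two of them share both the metric name and the full attribute set.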

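With just 2 attributes, you can write a number of useful queries. Here $fs_usage is an alias for system.filesystem.usage (aliases are explained in the next section):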

# the filesystem size (free+used bytes) on each host
query:
  - $fs_usage group by host.name

# the number of free bytes on each host
query:
  - $fs_usage{state='free'} as free group by host.name

# fs utilization on each host
query:
  - $fs_usage{state='used'} / $fs_usage as fs_util group by host.name

# the size of your dataset on all hosts
query:
  - $fs_usage{state='used'} as dataset_size
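To make the semantics of these queries concrete, here is a rough Python sketch of what filtering and grouping compute. It assumes values are summed within a group, which matches the "free+used bytes" comment above. The sum_by_host helper and the numbers are hypothetical and do not reflect how Uptrace evaluates queries internally.

```python
from collections import defaultdict

# Made-up sample: (host.name, state) -> bytes
fs_usage = {
    ("host1", "free"): 40,
    ("host1", "used"): 60,
    ("host2", "free"): 10,
    ("host2", "used"): 90,
}

def sum_by_host(samples, state=None):
    """Sum values per host, optionally keeping only one state."""
    totals = defaultdict(int)
    for (host, st), value in samples.items():
        if state is None or st == state:
            totals[host] += value
    return dict(totals)

# $fs_usage group by host.name
size = sum_by_host(fs_usage)
# $fs_usage{state='free'} as free group by host.name
free = sum_by_host(fs_usage, state="free")
# $fs_usage{state='used'} / $fs_usage as fs_util group by host.name
used = sum_by_host(fs_usage, state="used")
fs_util = {host: used[host] / size[host] for host in size}
# $fs_usage{state='used'} as dataset_size (no grouping: one value for all hosts)
dataset_size = sum(used.values())
```

With this sample, each host has a size of 100 bytes, host1 is 60% utilized, host2 is 90% utilized, and the dataset size across all hosts is 150 bytes.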

Writing queries

You start creating a query by selecting metric names and giving them a short alias, for example:

metrics:
  # metric aliases always start with a dollar sign
  - system.filesystem.usage as $fs_usage
  - system.network.packets as $packets

Because Uptrace supports multiple metrics in the same query, you must use the alias to reference the metric and metric attributes, for example:

query:
  # unlike metric aliases, column aliases don't start with a dollar
  - $fs_usage as disk_size
  - $fs_usage{state="used"} as used_space

  # disk size on the specified device
  - $fs_usage{host.name='host1', device='/dev/sdd1'} as host1_sdd1

  # number of packets per minute on each host
  - per_min($packets) as packets_per_min group by host.name

You can use multiple metrics to construct arithmetic expressions. For example, you can write a query that plots the number of hits and misses and calculates the hit rate:

metrics:
  - api.user_cache.hits as $hits
  - api.user_cache.misses as $misses
query:
  - per_min($hits) as hits
  - per_min($misses) as misses
  - hits / (hits + misses) as hit_rate
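The arithmetic behind hit_rate can be checked by hand with a small Python sketch. The per-minute values are made up, and returning None for a minute without traffic is an assumption made for this example; this page does not say how Uptrace handles division by zero.

```python
def hit_rate(hits, misses):
    """Compute hits / (hits + misses); return None when there was no traffic."""
    total = hits + misses
    if total == 0:
        return None  # assumption for this sketch, not documented Uptrace behavior
    return hits / total

# Made-up per-minute values for three consecutive minutes
hits_per_min = [90, 45, 0]
misses_per_min = [10, 5, 0]

rates = [hit_rate(h, m) for h, m in zip(hits_per_min, misses_per_min)]
```

The first two minutes both give a hit rate of 0.9 even though the traffic volume differs, which is why the rate is usually plotted next to the raw hits and misses.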

Instruments

OpenTelemetry provides different instruments, and each instrument supports a different set of aggregations (functions):

Otel Instrument Name                   Instrument   Aggregations
Counter, CounterObserver               counter      per_min, per_sec
UpDownCounter, UpDownCounterObserver   additive     sum of last values, min, max
GaugeObserver                          gauge        last value, min, max
Histogram                              histogram    percentiles, min, max, per_min, per_sec, count, avg
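If you generate queries programmatically, the table above can be encoded as a lookup to catch unsupported combinations early. The sketch below is a hypothetical helper, not part of Uptrace. The aggregation names are copied from the table as descriptive labels and are not necessarily the exact function names of the query language.

```python
# Instrument -> aggregations, transcribed from the table above
AGGREGATIONS = {
    "counter": {"per_min", "per_sec"},
    "additive": {"sum of last values", "min", "max"},
    "gauge": {"last value", "min", "max"},
    "histogram": {"percentiles", "min", "max", "per_min", "per_sec", "count", "avg"},
}

def supports(instrument, aggregation):
    """Return True if the table lists the aggregation for the instrument."""
    return aggregation in AGGREGATIONS.get(instrument, set())

counter_ok = supports("counter", "per_min")
gauge_ok = supports("gauge", "per_min")
```

Here counter_ok is True and gauge_ok is False, because the table lists per_min for counters and histograms only.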

Dashboards

Uptrace supports 2 types of dashboards:

  • A grid-based dashboard looks like a classical grid of charts.
  • A table-based dashboard is a table of items where each item leads to a separate grid-based dashboard for that item. For example, a table of hostnames with some metrics for each hostname.

In other words, table-based dashboards allow you to parameterize grid-based dashboards with attributes from the table. For example, Uptrace uses a table-based dashboard to monitor the number of sampled and dropped spans for each project:

metrics:
  - uptrace.projects.spans as $spans
query:
  - $spans{type='spans'} as sampled_spans
  - $spans{type='dropped'} as dropped_spans
  - group by project_id

project_id   sampled_spans   dropped_spans   Link to a grid-based dashboard
1            1000            0               Dash with where project_id = 1
2            1100            0               Dash with where project_id = 2
...          ...             ...             ...
999          900             0               Dash with where project_id = 999
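The way a table row parameterizes a grid-based dashboard can be pictured as deriving a where clause from the grouping attribute of that row. The grid_filter helper below is purely illustrative and not an Uptrace API. The rows mirror the sample table above.

```python
def grid_filter(attr, row):
    """Build the where clause for the grid-based dashboard behind a table row."""
    value = row[attr]
    if isinstance(value, str):
        value = f"'{value}'"  # quote string attributes such as host names
    return f"where {attr} = {value}"

# Sample rows of the table-based dashboard
rows = [
    {"project_id": 1, "sampled_spans": 1000, "dropped_spans": 0},
    {"project_id": 2, "sampled_spans": 1100, "dropped_spans": 0},
    {"project_id": 999, "sampled_spans": 900, "dropped_spans": 0},
]

filters = [grid_filter("project_id", row) for row in rows]
```

Each row produces its own filter, for example "where project_id = 1", so a single grid-based dashboard definition serves every project.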

You can also create a grid-based dashboard without using a table-based dashboard as an entry point. Such dashboards are useful when you have many different items or groups. For example, you can create a separate dashboard for each database cluster or availability zone.

Binary operator precedence

The following list shows the precedence of binary operators in Uptrace, from highest to lowest.

  • ^
  • *, /, %
  • +, -
  • ==, !=, <=, <, >=, >
  • and, unless
  • or

Operators on the same precedence level are left-associative. For example, 2 * 3 % 2 is equivalent to (2 * 3) % 2.
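To see how the precedence list and left-associativity decide the grouping of an expression, here is a small precedence-climbing sketch in Python that adds explicit parentheses. It is an illustration of the rules above, not Uptrace's actual parser. The tokenizer is simplified and does not handle unary minus or attribute filters.

```python
import re

# Precedence levels from the list above; a higher number binds tighter.
PRECEDENCE = {
    "^": 6,
    "*": 5, "/": 5, "%": 5,
    "+": 4, "-": 4,
    "==": 3, "!=": 3, "<=": 3, "<": 3, ">=": 3, ">": 3,
    "and": 2, "unless": 2,
    "or": 1,
}

TOKEN = re.compile(r"\s*(==|!=|<=|>=|[-+*/%^<>()]|\$?[A-Za-z_][\w.]*|\d+(?:\.\d+)?)")

def tokenize(expr):
    """Split an expression into operators, parentheses, names, and numbers."""
    expr = expr.rstrip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parenthesize(expr):
    """Return expr with explicit parentheses around every binary operation."""
    tokens = tokenize(expr)
    pos = 0

    def primary():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            inner = binary(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return inner
        if tok == ")" or tok in PRECEDENCE:
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    def binary(min_prec):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Left-associative: the right operand may only contain
            # operators that bind tighter than the current one.
            right = binary(PRECEDENCE[op] + 1)
            left = f"({left} {op} {right})"
        return left

    result = binary(1)
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result

grouped = parenthesize("2 * 3 % 2")
mixed = parenthesize("a + b * c == d or e")
```

For the example from the text, grouped is "((2 * 3) % 2)". The mixed expression becomes "(((a + (b * c)) == d) or e)", which shows arithmetic binding tighter than comparison and comparison binding tighter than or.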
