Uptrace: Querying Metrics
TIP
To learn about metrics, see the OpenTelemetry Metrics documentation.
Uptrace provides a powerful query language that supports joining, grouping, and aggregating multiple metrics in a single query.
Timeseries
A timeseries is a metric with a unique set of attributes. For example, each host has a separate timeseries for the same metric name:
# metric_name{ attr1, attr2... }
system.filesystem.usage{host.name='host1'} # timeseries 1
system.filesystem.usage{host.name='host2'} # timeseries 2
You can also use attributes to create more detailed timeseries. For example, you can use the state attribute to report the number of free and used bytes in a filesystem:
system.filesystem.usage{host.name='host1', state='free'} # timeseries 1
system.filesystem.usage{host.name='host1', state='used'} # timeseries 2
system.filesystem.usage{host.name='host2', state='free'} # timeseries 3
system.filesystem.usage{host.name='host2', state='used'} # timeseries 4
With just 2 attributes, you can write a number of useful queries:
# the filesystem size (free+used bytes) on each host
query:
- $fs_usage group by host.name
# the number of free bytes on each host
query:
- $fs_usage{state='free'} as free group by host.name
# fs utilization on each host
query:
- $fs_usage{state='used'} / $fs_usage as fs_util group by host.name
# the size of your dataset on all hosts
query:
- $fs_usage{state='used'} as dataset_size
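To make the grouping semantics concrete, here is a minimal Python sketch of the first and third queries above. It is not Uptrace code: the sample values and the `sum_by` helper are invented for illustration, and it assumes that values are summed across the timeseries in each group, as the "free+used bytes" comment suggests.

```python
from collections import defaultdict

# Hypothetical sample data: one value per timeseries, keyed by its attributes.
FS_USAGE = [
    ({"host.name": "host1", "state": "free"}, 60),
    ({"host.name": "host1", "state": "used"}, 40),
    ({"host.name": "host2", "state": "free"}, 10),
    ({"host.name": "host2", "state": "used"}, 90),
]


def sum_by(series, group_attr, **filters):
    """Sum the values of the timeseries matching filters, grouped by group_attr."""
    totals = defaultdict(int)
    for attrs, value in series:
        if all(attrs.get(key) == want for key, want in filters.items()):
            totals[attrs[group_attr]] += value
    return dict(totals)


# $fs_usage group by host.name
fs_size = sum_by(FS_USAGE, "host.name")
# $fs_usage{state='used'} / $fs_usage as fs_util group by host.name
used = sum_by(FS_USAGE, "host.name", state="used")
fs_util = {host: used[host] / fs_size[host] for host in fs_size}

print(fs_size)  # {'host1': 100, 'host2': 100}
print(fs_util)  # {'host1': 0.4, 'host2': 0.9}
```

Filtering on `state` before grouping is what lets a single metric answer both the "used" and the "total" question.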
Writing queries
Start a query by selecting metric names and giving each one a short alias, for example:
metrics:
# metric aliases always start with a dollar sign
- system.filesystem.usage as $fs_usage
- system.network.packets as $packets
Because Uptrace supports multiple metrics in the same query, you must use the alias to reference the metric and metric attributes, for example:
query:
# unlike metric aliases, column aliases don't start with a dollar
- $fs_usage as disk_size
- $fs_usage{state="used"} as used_space
# disk size on the specified device
- $fs_usage{host.name='host1', device='/dev/sdd1'} as host1_sdd1
# number of packets per minute on each host
- per_min($packets) as packets_per_min group by host.name
You can use multiple metrics to construct arithmetic expressions. For example, you can write a query that plots the number of hits and misses and calculates the hit rate:
metrics:
- api.user_cache.hits as $hits
- api.user_cache.misses as $misses
query:
- per_min($hits) as hits
- per_min($misses) as misses
- hits / (hits + misses) as hit_rate
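The arithmetic behind `hit_rate` can be sketched in Python as follows. The function name and the handling of an empty interval are assumptions made for this illustration; the source does not say how Uptrace treats a zero denominator.

```python
def hit_rate(hits_per_min, misses_per_min):
    """Return hits / (hits + misses), or None when there was no traffic."""
    total = hits_per_min + misses_per_min
    if total == 0:
        # Assumption for this sketch only: report "no data" instead of dividing by zero.
        return None
    return hits_per_min / total


print(hit_rate(90, 10))  # 0.9
print(hit_rate(0, 0))    # None
```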
Instruments
OpenTelemetry provides different instruments, and each instrument supports a different set of aggregations (functions):
OpenTelemetry instrument | Uptrace instrument | Aggregations |
---|---|---|
Counter, CounterObserver | counter | per_min, per_sec |
UpDownCounter, UpDownCounterObserver | additive | sum of last values, min, max |
GaugeObserver | gauge | last value, min, max |
Histogram | histogram | percentiles, min, max, per_min, per_sec, count, avg |
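As a rough illustration of the `per_min` and `per_sec` aggregations for counters, the sketch below normalizes a counter increase to a rate. It assumes the counter is available as a delta over a known interval; the helper signatures are hypothetical, and Uptrace's actual computation may differ.

```python
def per_sec(delta, interval_seconds):
    """Average increase per second over the interval."""
    return delta / interval_seconds


def per_min(delta, interval_seconds):
    """Average increase per minute over the interval."""
    return per_sec(delta, interval_seconds) * 60


# Hypothetical example: the counter grew by 300 over a 2-minute interval.
print(per_sec(300, 120))  # 2.5
print(per_min(300, 120))  # 150.0
```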
Dashboards
Uptrace supports 2 types of dashboards:
- A grid-based dashboard looks like a classical grid of charts.
- A table-based dashboard is a table of items where each item leads to a separate grid-based dashboard for the item, for example, a table of hostnames with some metrics for each hostname.
In other words, table-based dashboards allow you to parameterize grid-based dashboards with attributes from the table. For example, Uptrace uses a table-based dashboard to monitor the number of sampled and dropped spans for each project:
metrics:
- uptrace.projects.spans as $spans
query:
- $spans{type='spans'} as sampled_spans
- $spans{type='dropped'} as dropped_spans
- group by project_id
project_id | sampled_spans | dropped_spans | Link to a grid-based dashboard |
---|---|---|---|
1 | 100 | 0 | Dash with where project_id = 1 |
2 | 110 | 0 | Dash with where project_id = 2 |
... | ... | ... | ... |
999 | 90 | 0 | Dash with where project_id = 999 |
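The parameterization shown in the last column can be sketched in Python. The row values come from the table above; the `grid_filter` helper and the exact filter string are assumptions for illustration, not Uptrace's API.

```python
# Rows of the table-based dashboard, one per project_id group.
rows = [
    {"project_id": 1, "sampled_spans": 100, "dropped_spans": 0},
    {"project_id": 2, "sampled_spans": 110, "dropped_spans": 0},
    {"project_id": 999, "sampled_spans": 90, "dropped_spans": 0},
]


def grid_filter(row, group_attr="project_id"):
    """Build the filter that scopes a grid-based dashboard to one table row."""
    return f"where {group_attr} = {row[group_attr]}"


for row in rows:
    print(row["project_id"], "->", grid_filter(row))
```

Each row carries the grouping attribute, so the same grid-based dashboard can be reused for every item in the table.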
You can also create a grid-based dashboard without using a table-based dashboard as an entry point. Such dashboards are useful when you have many different items or groups. For example, you can create a separate dashboard for each database cluster or availability zone.
Binary operator precedence
The following list shows the precedence of binary operators in Uptrace, from highest to lowest.
- ^
- *, /, %
- +, -
- ==, !=, <=, <, >=, >
- and, unless
- or
Operators on the same precedence level are left-associative. For example, 2 * 3 % 2 is equivalent to (2 * 3) % 2.
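Python also places `*` and `%` on the same precedence level and groups them left to right, so the example can be checked directly. This only illustrates the arithmetic; it says nothing about how Uptrace parses queries.

```python
left_grouped = (2 * 3) % 2   # 6 % 2
right_grouped = 2 * (3 % 2)  # 2 * 1

# Without parentheses, evaluation follows the left-associative grouping.
assert 2 * 3 % 2 == left_grouped

print(left_grouped, right_grouped)  # 0 2
```

The two groupings give different results, which is why associativity matters when mixing operators of equal precedence.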