Dashboard YAML Templates
Uptrace dashboards can be defined as YAML templates. Templates are used for built-in dashboards that ship with Uptrace and for importing/exporting user-created dashboards.
When Uptrace receives new metrics, it checks available templates and automatically creates matching dashboards.
Template structure
Every template starts with a schema: v2 header followed by metadata, then optional table, grid_rows, and metric_monitors sections.
schema: v2
name: 'Hostmetrics: Overview'
description: Tracks host CPU, memory, and system load.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup
doc_link: https://example.com/docs
# Optional: minimum query interval
min_interval: 1m
# Optional: shift the query time window
time_offset: 5m
# Optional: only apply when these metrics exist
require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time
table_grid_items:
  # ... summary widgets above the table
table:
  # ... table dashboard definition
grid_rows:
  # ... grid dashboard rows
metric_monitors:
  # ... bundled alert monitors
Top-level fields
| Field | Type | Required | Description |
|---|---|---|---|
| schema | string | yes | Must be "v2" |
| name | string | yes | Human-readable dashboard title |
| description | string | no | Summary of the dashboard's purpose |
| tags | list | yes | Categorization labels (see Tags) |
| version | string | yes | Revision in vYY.MM.DD format (e.g., v25.04.20) |
| setup_link | string | yes | URL to instrumentation/setup instructions |
| doc_link | string | no | URL to documentation |
| min_interval | duration | no | Minimum query interval (e.g., 1m, 5m) |
| time_offset | duration | no | Shifts query time window (e.g., 5m) |
| require_metrics | list | no | Metrics that must exist before the template is applied |
| table_grid_items | list | no | Summary widgets shown above the table |
| table | object | no | Table dashboard definition |
| grid_query | string | no | MQL query for grid variables |
| grid_variables | list | no | Variable names extracted from grid_query |
| include_grid_rows | list | no | References to other templates (see Composition) |
| grid_rows | list | no | Grid row definitions |
| metric_monitors | list | no | Metric monitors bundled with the template |
Metrics and queries
Metrics and queries appear throughout the template (tables, grid items, monitors). They follow the same pattern everywhere.
Metrics
The metrics field declares metric aliases using MQL syntax:
metrics:
- system_cpu_utilization as $cpu_util
- system_memory_usage as $mem_usage
Each entry takes the form metric_name as $alias. The alias (prefixed with $) is used in queries.
Query
The query field is a list of MQL clauses. Splitting the query across lines improves readability; the clauses are joined into a single query during processing.
query:
- group by host_name
- avg($cpu_util) as cpu_avg
- sum($mem_usage{state="used"}) as mem_used
- where device !~ "loop"
Common clause patterns:
- Aggregations: avg($x); sum($x); max($x); min($x); p50($x); p75($x); p90($x); p99($x)
- Rate functions: perMin(sum($x)); perSec(sum($x))
- Counting: histogram_count($x); uniq($x, attr_name)
- Grouping: group by attr_name; group by attr1, attr2
- Filtering: where attr = "value"; where attr !~ "pattern"
- Aliasing: sum($x) as my_alias
- Inline grouping: avg($x) as y group by state
- Attribute filtering: $metric{state="used"}; $metric{direction=read}
- Arithmetic: sum($a) / sum($b) as ratio
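As a rough illustration, the sketch below combines several of these clause patterns in a single query. The metric name system_disk_io and its direction and device attributes are assumptions borrowed from OpenTelemetry host metrics, and the aliases are arbitrary; substitute the metrics you actually collect.

```yaml
# Sketch only: metric and attribute names are assumed, not taken from a shipped template.
metrics:
  - system_disk_io as $disk_io
query:
  - group by host_name                                           # grouping
  - where device !~ "loop"                                       # filtering
  - perMin(sum($disk_io{direction=read})) as read_rate           # rate, attribute filter, alias
  - perMin(sum($disk_io{direction=write})) as write_rate
  - sum($disk_io{direction=read}) / sum($disk_io) as read_ratio  # arithmetic
```

Each list entry is one clause, so individual expressions can be added or removed without touching the rest of the query.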
Table dashboards
The table section defines a table view where each row leads to the grid dashboard filtered by the row's group-by attributes.
table:
  metrics:
    - system_cpu_utilization as $cpu_util
    - system_memory_usage as $mem_usage
  query:
    - group by host_name
    - avg($cpu_util) as cpu_util
    - sum($mem_usage{state="used"}) as mem_used
  columns:
    cpu_util: { unit: utilization }
    mem_used: { unit: bytes }
  variables: [deployment_environment, service_name]
Table fields
| Field | Type | Required | Description |
|---|---|---|---|
| metrics | list | yes | MQL metric expressions |
| query | list | yes | MQL query clauses |
| columns | map | no | Per-column display settings (shorthand format) |
| overrides | list | no | Per-column visual property overrides (structured format) |
| variables | list | no | Query variable names for parameterized filtering |
Column settings (shorthand)
The columns map provides a compact syntax for column formatting:
columns:
  cpu_util: { unit: utilization }
  mem_used: { unit: bytes, color: red }
  availability: { unit: utilization, agg_func: avg }
  max_latency: { unit: milliseconds, agg_func: max, display: bar }
Available column properties:
| Property | Description |
|---|---|
| unit | Display unit (e.g., bytes, utilization, seconds, nanoseconds, milliseconds, 1, or custom like span/min) |
| color | Display color |
| agg_func | Aggregation function (e.g., sum, avg, max, last) |
| display | Column display mode: value (default), sparkline, or bar. Can also be a map: {mode: bar, min: 0, max: 1} |
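The map form of display is easiest to see next to the plain string form. The following is a minimal sketch with illustrative column names; it assumes min and max set the range the bar is scaled against.

```yaml
# Sketch: column names are illustrative.
columns:
  availability:
    unit: utilization
    agg_func: avg
    display: { mode: bar, min: 0, max: 1 }  # map form with an explicit range
  cpu_util:
    unit: utilization
    display: sparkline                      # string form
```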
Column overrides (structured)
The overrides format provides more control:
overrides:
  - column: cpu_util
    properties:
      - name: unit
        value: utilization
      - name: color
        value: red
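The sketch below sets two further properties through the structured format. It assumes that the aggFunc and columnDisplay property names listed under Visual properties are also accepted in table column overrides; the column name is illustrative.

```yaml
# Sketch: assumes aggFunc and columnDisplay are valid override properties for table columns.
overrides:
  - column: availability
    properties:
      - name: unit
        value: utilization
      - name: aggFunc
        value: avg
      - name: columnDisplay
        value: { mode: bar, min: 0, max: 1 }
```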
Table grid items
table_grid_items are summary widgets displayed above the main table. They support the same grid item types as grid_rows items (gauge, text, chart, table, heatmap).
table_grid_items:
  - title: Number of hosts
    type: text
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - uniq($mem_usage, host_name) as num_host
    text: ${num_host} hosts
  - title: Memory utilization
    type: gauge
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
    columns:
      mem_util: { unit: utilization }
Grid dashboards
The grid_rows section defines a classic grid of charts organized in collapsible rows.
grid_rows:
  - title: General
    items:
      - title: CPU utilization
        metrics:
          - system_cpu_utilization as $cpu_util
        query:
          - avg($cpu_util)
      - title: RAM usage
        metrics:
          - system_memory_usage as $mem_usage
        query:
          - sum($mem_usage) group by state
        columns:
          mem_usage: { unit: bytes }
        fill_opacity: 0.1
Each row has a title and a list of items.
Grid item types
Grid items are the building blocks of both grid_rows and table_grid_items. The type field determines the kind of widget. When type is omitted, it defaults to chart.
Common fields
All grid item types share these fields:
| Field | Type | Required | Description |
|---|---|---|---|
| title | string | yes | Display heading |
| description | string | no | Explanation shown below the title |
| type | string | no | One of: chart (default), table, heatmap, gauge, text |
| width | int | no | Grid column span |
| height | int | no | Grid row span |
| x_axis | int | no | Horizontal grid position |
| y_axis | int | no | Vertical grid position |
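The layout fields can be combined to place two items side by side. This is only a sketch: the numbers are illustrative, and it assumes zero-based positions and a grid at least 12 columns wide, neither of which is stated here.

```yaml
# Sketch: sizes and positions are illustrative; zero-based positions are assumed.
grid_rows:
  - title: General
    items:
      - title: CPU utilization
        width: 6
        height: 4
        x_axis: 0
        y_axis: 0
        metrics:
          - system_cpu_utilization as $cpu_util
        query:
          - avg($cpu_util)
      - title: RAM usage
        width: 6
        height: 4
        x_axis: 6   # starts where the first item ends
        y_axis: 0
        metrics:
          - system_memory_usage as $mem_usage
        query:
          - sum($mem_usage) group by state
```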
Chart
Charts are the default grid item type. They render time-series data as line, bar, or scatter plots.
- title: CPU time
  # type: chart (default, can be omitted)
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  fill_opacity: 0.1
Chart-specific fields
| Field | Type | Description |
|---|---|---|
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| chart | string | Chart type: line (default), bar, scatter |
| fill_opacity | float | Area fill opacity, 0 to 1 |
| stack | string | Stacking mode: "" (none) or "all" |
| columns | map | Per-metric display settings with unit and color |
| legend | object | Legend configuration (see below) |
| properties | list | Visual properties (see Visual properties) |
| overrides | list | Per-timeseries overrides (see Timeseries overrides) |
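For instance, the chart and stack fields from the table can turn the line chart into a stacked bar chart. The sketch below reuses the CPU time query; the seconds unit is an assumption about how this metric is reported.

```yaml
# Sketch: the unit is assumed; adjust it to how the metric is actually reported.
- title: CPU time by state
  chart: bar   # instead of the default line
  stack: all   # stack the per-state series on top of each other
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  columns:
    cpu_time: { unit: seconds }
```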
Legend configuration
legend:
  type: table # "none", "list", or "table"
  placement: bottom # "bottom" or "right"
  values: [avg, min, max, last]
  items_per_page: 10
Table
Table grid items render query results in a tabular format.
- title: Slowest groups
  type: table
  metrics:
    - uptrace_tracing_spans as $spans
  query:
    - group by _group_id
    - group by _system
    - p50($spans)
Table-specific fields
| Field | Type | Description |
|---|---|---|
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| columns | map | Per-column display settings (same as table columns) |
| overrides | list | Per-column visual property overrides |
Heatmap
Heatmaps visualize the distribution of a single histogram metric over time.
- title: Span duration heatmap
  type: heatmap
  metric: uptrace_tracing_spans
  unit: milliseconds
Heatmap-specific fields
| Field | Type | Description |
|---|---|---|
| metric | string | Single metric name (not an alias) |
| unit | string | Display unit |
| query | list | Optional MQL query clauses for filtering |
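A heatmap restricted to a single service might look like the sketch below. It assumes that where clauses are accepted in a heatmap query and that the spans carry a service_name attribute; the value checkout is hypothetical.

```yaml
# Sketch: the service name is hypothetical.
- title: Checkout span duration
  type: heatmap
  metric: uptrace_tracing_spans   # plain metric name, no "as $alias"
  unit: milliseconds
  query:
    - where service_name = "checkout"
```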
Gauge
Gauges display a single aggregated value, optionally with value mappings for status indicators.
- title: Status
  type: gauge
  metrics:
    - httpcheck_status as $status
  query:
    - sum($status{http_status_class="2xx"})
  value_mappings:
    - op: gte
      value: 1
      text: UP
      color: green
    - op: eq
      value: 0
      text: DOWN
      color: red
    - op: any
      text: UNKNOWN
      color: gray
Gauge-specific fields
| Field | Type | Description |
|---|---|---|
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| columns | map | Per-column settings with unit and agg_func |
| overrides | list | Per-column visual property overrides |
| value_mappings | list | Maps values to labels and colors |
Value mappings
Value mappings are evaluated in order. The first matching rule wins.
| Field | Type | Description |
|---|---|---|
| op | string | Comparison operator: any, eq, lt, lte, gt, gte |
| value | number | Threshold value (not required for any) |
| text | string | Display label |
| color | string | Display color |
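Because the first matching rule wins, overlapping thresholds have to be listed from most to least specific. The sketch below illustrates this with hypothetical thresholds and labels.

```yaml
# Sketch: thresholds and labels are illustrative.
- title: Memory pressure
  type: gauge
  metrics:
    - system_memory_usage as $mem_usage
  query:
    - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
  columns:
    mem_util: { unit: utilization }
  value_mappings:
    - op: gte        # checked first
      value: 0.9
      text: HIGH
      color: red
    - op: gte        # only reached when the value is below 0.9
      value: 0.7
      text: ELEVATED
      color: blue
    - op: any        # fallback for everything else
      text: NORMAL
      color: green
```

A value of 0.95 satisfies both gte rules, but only the first one is applied. If the two rules were swapped, the same value would be labeled ELEVATED and the HIGH rule would never match.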
Text
Text items render a Go template string with metric data, useful for summary counts and labels.
- title: Host count
  type: text
  metrics:
    - process_runtime_go_goroutines as $goroutines
  query:
    - uniq($goroutines, host_name) as num_host
  text: ${num_host} hosts
Text-specific fields
| Field | Type | Description |
|---|---|---|
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| text | string | Template string using ${column_name} placeholders |
| columns | map | Per-column settings with unit and agg_func |
| overrides | list | Per-column visual property overrides |
Visual properties
Visual properties control chart appearance. They can be set via shorthand fields (chart, fill_opacity, stack) or the properties list.
properties:
  - name: chartType
    value: bar
  - name: fillOpacity
    value: 0.3
  - name: stack
    value: all
| Property | Type | Values |
|---|---|---|
| chartType | string | line, bar, scatter |
| stack | string | "" (off), all |
| connectNulls | boolean | true, false |
| lineWidth | number | Line thickness |
| fillOpacity | number | 0 to 1 |
| symbolSize | number | Data point symbol size |
| symbol | string | Symbol name |
| color | string | Color name |
| colorScheme | string | Color scheme name |
| unit | string | Display unit |
| aggFunc | string | Aggregation function |
| columnDisplay | object | Map with mode (value, sparkline, or bar) and optional min and max |
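Properties without a shorthand field, such as connectNulls and lineWidth, can only be set through the properties list. The sketch below shows them on a chart item; the line width of 2 is an arbitrary illustrative value.

```yaml
# Sketch: values are illustrative.
- title: CPU utilization
  metrics:
    - system_cpu_utilization as $cpu_util
  query:
    - avg($cpu_util)
  properties:
    - name: connectNulls   # draw the line across gaps in the data
      value: true
    - name: lineWidth
      value: 2
```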
Timeseries overrides
Chart items support per-timeseries overrides that apply visual properties to specific series:
overrides:
  - matchers:
      - target: metric # "metric" or "timeseries"
        value: cpu_idle
    properties:
      - name: unit
        value: utilization
      - name: color
        value: blue
The target field specifies what to match:
- metric: matches by metric/column name
- timeseries: matches by timeseries label
Metric monitors
The metric_monitors section bundles alert monitors with the dashboard:
metric_monitors:
  - key: cpu_usage
    name: CPU usage
    metrics:
      - system_cpu_load_average_15m as $load_avg_15m
      - system_cpu_time as $cpu_time
    query:
      - avg($load_avg_15m) / uniq($cpu_time, cpu) as cpu_util
      - group by host_name
    column_unit: utilization
    max_allowed_value: 3
    num_eval_points: 10
    trend_agg_func: last
Monitor fields
| Field | Type | Required | Description |
|---|---|---|---|
| key | string | yes | Unique monitor identifier within the template |
| name | string | yes | Human-readable title |
| metrics | list | yes | MQL metric expressions |
| query | list | yes | MQL query clauses |
| status | string | no | active (default) or paused |
| column | string | no | Metric column to evaluate |
| column_unit | string | no | Display unit for the column |
| min_allowed_value | number | no | Lower threshold: values below this trigger an alert |
| max_allowed_value | number | no | Upper threshold: values above this trigger an alert |
| num_eval_points | int | no | Number of recent data points to evaluate |
| trend_agg_func | string | no | Aggregation function for trend detection (e.g., last) |
| trend_sensitivity | string | no | Sensitivity of trend detection |
| bounds_source | string | no | How anomaly bounds are calculated |
| resolution | duration | no | Data point interval |
| absent_points | string | no | How missing data is handled; set to alert to trigger an alert when data points are absent |
| time_offset | duration | no | Shifts the evaluation window |
| notify_everyone_by_email | boolean | no | Email all project members on alert |
| repeat_interval | object | no | Notification repeat interval configuration |
| flapping | object | no | Flapping detection parameters |
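The sketch below shows a monitor built around a lower threshold and a few of the optional scalar fields. The key, name, and threshold are hypothetical, and the httpcheck metric is used purely for illustration.

```yaml
# Sketch: key, name, and threshold values are hypothetical.
metric_monitors:
  - key: low_availability
    name: HTTP availability is low
    status: paused                  # bundled but not evaluated until activated
    metrics:
      - httpcheck_status as $status
    query:
      - sum($status{http_status_class="2xx"}) / sum($status) as availability
      - group by http_url
    column: availability
    column_unit: utilization
    min_allowed_value: 0.99         # alert when availability drops below 99%
    num_eval_points: 5
    resolution: 1m
    notify_everyone_by_email: true
```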
Template composition
Templates can include grid rows from other templates using include_grid_rows. This enables modular dashboard composition:
# Parent: uptrace.dotnet.10.all.yml
schema: v2
name: '.NET: All'
tags: [otel, app]
version: v25.04.20
table:
  metrics:
    - process_runtime_dotnet_gc_heap_size as $heap_size
  query:
    - group by service_name
    - sum($heap_size) as heap_size
include_grid_rows:
  - uptrace.dotnet.20.gc
  - uptrace.dotnet.30.runtime
  - uptrace.dotnet.40.thread_pool
Each entry references another template by filename (without .yml). The child template's grid_rows are appended to the parent's.
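A referenced child template could look roughly like the sketch below. The file content is hypothetical: it assumes a child carries the same header fields as the parent and contributes only grid_rows, and it reuses the heap size metric from the parent example.

```yaml
# Hypothetical child: uptrace.dotnet.20.gc.yml
schema: v2
name: '.NET: GC'
tags: [otel, app]
version: v25.04.20

grid_rows:
  - title: Garbage collection
    items:
      - title: GC heap size
        metrics:
          - process_runtime_dotnet_gc_heap_size as $heap_size
        query:
          - sum($heap_size) as heap_size
        columns:
          heap_size: { unit: bytes }
```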
Metric requirements
The require_metrics field controls when a template is activated. It appears at the top level of the template:
require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time
| Field | Type | Description |
|---|---|---|
| library | string | Instrumentation library name (optional) |
| metric | string | Metric name that must exist |
Tags
Tags categorize dashboards for filtering. Use one data source tag and 1-3 category tags.
Data source tags
- otel: OpenTelemetry metrics
- prom: Prometheus metrics
Category tags
- infra: System resources (CPU, RAM, disk, network)
- network: Network-specific metrics
- db: Database systems (PostgreSQL, MySQL, Redis)
- app: Application runtimes (Go, .NET, Java, JVM)
- k8s: Kubernetes resources
- tracing: Distributed tracing and spans
- logs: Log aggregation
- messaging: Message queues (Kafka)
- self_monitoring: Uptrace internal metrics (no data source tag needed)
tags: [otel, infra] # Host metrics
tags: [otel, app, db] # Go SQL client
tags: [otel, k8s, infra] # Kubernetes infrastructure
tags: [self_monitoring] # Uptrace self-monitoring
Template ID and filename
The template ID is derived from the filename:
- uptrace.hostmetrics.10.overview.yml becomes ID uptrace.hostmetrics.overview
- The numeric segment (e.g., 10) is stripped from the ID and used as a display priority
Validation
Templates are validated in two ways:
- YAML parsing with DisallowUnknownField catches typos and unknown keys.
- JSON Schema validation against schema.json ensures structural correctness.
To validate templates:
go run cmd/cloud/main.go dashboard validate_templates
Full example
schema: v2
name: 'HTTP Check: Endpoints'
description: Monitors HTTP endpoint availability and response times.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup
table_grid_items:
  - title: Successful checks
    type: text
    text: ${num_up} out of ${num_all}
    metrics:
      - httpcheck_status as $status
    query:
      - uniq($status{http_status_class="2xx"}) as num_all
      - uniq($status{http_status_class="2xx", _value=1}) as num_up
table:
  metrics:
    - httpcheck_status as $status
    - httpcheck_duration as $duration
  query:
    - group by http_url
    - group by host_name
    - sum($status{http_status_class="2xx"}) / sum($status) as availability
    - avg($duration)
  columns:
    availability: { unit: utilization, agg_func: avg }
grid_rows:
  - title: Gauges
    items:
      - title: Status
        type: gauge
        metrics:
          - httpcheck_status as $status
        query:
          - sum($status{http_status_class="2xx"})
        value_mappings:
          - op: gte
            value: 1
            text: UP
            color: green
          - op: eq
            value: 0
            text: DOWN
            color: red
          - op: any
            text: UNKNOWN
            color: gray
  - title: General
    items:
      - title: HTTP check result
        metrics:
          - httpcheck_status as $status
        query:
          - $status group by http_status_code
      - title: HTTP check duration
        metrics:
          - httpcheck_duration as $duration
        query:
          - avg($duration)
      - title: Span duration heatmap
        type: heatmap
        metric: uptrace_tracing_spans
        unit: milliseconds
metric_monitors:
  - key: http_check_is_down
    name: HTTP check is down
    metrics:
      - httpcheck_status as $status
    query:
      - sum($status{http_status_class="2xx"}) as status_2xx
      - group by http_url
      - group by host_name
    min_allowed_value: 1
    max_allowed_value: 1
    num_eval_points: 1
    absent_points: alert
    trend_agg_func: last