Dashboard YAML Templates

Uptrace dashboards can be defined as YAML templates. Templates are used for built-in dashboards that ship with Uptrace and for importing/exporting user-created dashboards.

When Uptrace receives new metrics, it checks available templates and automatically creates matching dashboards.

Template structure

Every template starts with a schema: v2 header followed by metadata, then the optional table_grid_items, table, grid_sections, and metric_monitors sections.

yaml

schema: v2
name: 'Hostmetrics: Overview'
description: Tracks host CPU, memory, and system load.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup
doc_link: https://example.com/docs

# Optional: minimum query interval
min_interval: 1m

# Optional: shift the query time window
time_offset: 5m

# Optional: only apply when these metrics exist
require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time

table_grid_items:
  # ... summary widgets above the table

table:
  # ... table dashboard definition

grid_sections:
  # ... grid dashboard rows

metric_monitors:
  # ... bundled alert monitors

Top-level fields

Field	Type	Required	Description
`schema`	string	yes	Must be `"v2"`
`name`	string	yes	Human-readable dashboard title
`description`	string	no	Summary of the dashboard's purpose
`tags`	list	yes	Categorization labels (see Tags)
`version`	string	yes	Revision in `vYY.MM.DD` format (e.g., `v25.04.20`)
`setup_link`	string	yes	URL to instrumentation/setup instructions
`doc_link`	string	no	URL to documentation
`min_interval`	duration	no	Minimum query interval (e.g., `1m`, `5m`)
`time_offset`	duration	no	Shifts query time window (e.g., `5m`)
`require_metrics`	list	no	Metrics that must exist before the template is applied
`table_grid_items`	list	no	Summary widgets shown above the table
`table`	object	no	Table dashboard definition
`grid_query`	string	no	MQL query for grid variables
`grid_variables`	list	no	Variable names extracted from `grid_query`
`include_grid_sections`	list	no	References to other templates (see Composition)
`grid_sections`	list	no	Grid section definitions
`metric_monitors`	list	no	Metric monitors bundled with the template

Metrics and queries

Metrics and queries appear throughout the template (tables, grid items, monitors). They follow the same pattern everywhere.

Metrics

The metrics field declares metric aliases using MQL syntax:

yaml

metrics:
  - system_cpu_utilization as $cpu_util
  - system_memory_usage as $mem_usage

Each entry takes the form metric_name as $alias. The alias (prefixed with $) is used in queries.

Query

The query field is a list of MQL clauses. Writing one clause per list entry improves readability; the clauses are joined into a single query during processing.

yaml

query:
  - group by host_name
  - avg($cpu_util) as cpu_avg
  - sum($mem_usage{state="used"}) as mem_used
  - where device !~ "loop"

Common clause patterns:

Aggregations: avg($x), sum($x), max($x), min($x), p50($x), p75($x), p90($x), p99($x)
Rate functions: perMin(sum($x)), perSec(sum($x))
Counting: histogram_count($x), uniq($x, attr_name)
Grouping: group by attr_name, group by attr1, attr2
Filtering: where attr = "value", where attr !~ "pattern"
Aliasing: sum($x) as my_alias
Inline grouping: avg($x) as y group by state
Attribute filtering: $metric{state="used"}, $metric{direction=read}
Arithmetic: sum($a) / sum($b) as ratio
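
As a sketch of how these patterns combine, the fragment below pairs grouping, a rate function, attribute filtering, and arithmetic in a single query. The metric names are reused from examples elsewhere in this document; the aliases (cpu_per_min, mem_util) and the particular combination are illustrative assumptions, not taken from a shipped template.

```yaml
metrics:
  - system_cpu_time as $cpu_time
  - system_memory_usage as $mem_usage
query:
  # Grouping: one result per host
  - group by host_name
  # Rate function wrapped around an aggregation, with an alias
  - perMin(sum($cpu_time)) as cpu_per_min
  # Attribute filtering plus arithmetic: share of memory that is not free
  - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
```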

Table dashboards

The table section defines a table view where each row leads to the grid dashboard filtered by the row's group-by attributes.

yaml

table:
  metrics:
    - system_cpu_utilization as $cpu_util
    - system_memory_usage as $mem_usage
  query:
    - group by host_name
    - avg($cpu_util) as cpu_util
    - sum($mem_usage{state="used"}) as mem_used
  columns:
    cpu_util: { unit: utilization }
    mem_used: { unit: bytes }
  variables: [deployment_environment, service_name]

Table fields

Field	Type	Required	Description
`metrics`	list	yes	MQL metric expressions
`query`	list	yes	MQL query clauses
`columns`	map	no	Per-column display settings (shorthand format)
`overrides`	list	no	Per-column visual property overrides (structured format)
`variables`	list	no	Query variable names for parameterized filtering

Column settings (shorthand)

The columns map provides a compact syntax for column formatting:

yaml

columns:
  cpu_util: { unit: utilization }
  mem_used: { unit: bytes, color: red }
  availability: { unit: utilization, agg_func: avg }
  max_latency: { unit: milliseconds, agg_func: max, display: bar }

Available column properties:

Property	Description
`unit`	Display unit (e.g., `bytes`, `utilization`, `seconds`, `nanoseconds`, `milliseconds`, `1`, or custom like `span/min`)
`color`	Display color
`agg_func`	Aggregation function (e.g., `sum`, `avg`, `max`, `last`)
`display`	Column display mode: `value` (default), `sparkline`, or `bar`. Can also be a map: `{mode: bar, min: 0, max: 1}`
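
The two forms of display can be sketched side by side. The column names below are borrowed from the shorthand example above; pairing a 0 to 1 bar range with a utilization column is an assumption made for illustration.

```yaml
columns:
  # Map form: bar display with an explicit range
  availability:
    unit: utilization
    agg_func: avg
    display: { mode: bar, min: 0, max: 1 }
  # String form: only the display mode is given
  cpu_util: { unit: utilization, display: sparkline }
```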

Column overrides (structured)

The overrides format provides more control:

yaml

overrides:
  - column: cpu_util
    properties:
      - name: unit
        value: utilization
      - name: color
        value: red
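
For comparison, the override above appears to express the same settings as the shorthand entry below. This equivalence is inferred from the two formats described in this section and has not been checked against the schema.

```yaml
columns:
  # Assumed shorthand counterpart of the structured override for cpu_util
  cpu_util: { unit: utilization, color: red }
```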

Table grid items

table_grid_items are summary widgets displayed above the main table. They support the same grid item types as grid_sections items (gauge, text, chart, table, heatmap).

yaml

table_grid_items:
  - title: Number of hosts
    type: text
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - uniq($mem_usage, host_name) as num_host
    text: ${num_host} hosts

  - title: Memory utilization
    type: gauge
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
    columns:
      mem_util: { unit: utilization }

Grid dashboards

The grid_sections section defines a classic grid of charts organized in collapsible rows.

yaml

grid_sections:
  - title: General
    items:
      - title: CPU utilization
        metrics:
          - system_cpu_utilization as $cpu_util
        query:
          - avg($cpu_util)

      - title: RAM usage
        metrics:
          - system_memory_usage as $mem_usage
        query:
          - sum($mem_usage) group by state
        columns:
          mem_usage: { unit: bytes }
        fill_opacity: 0.1

Each entry in grid_sections is one such row, with a title and a list of items.

Grid item types

Grid items are the building blocks of both grid_sections and table_grid_items. The type field determines the kind of widget. When type is omitted, it defaults to chart.

Common fields

All grid item types share these fields:

Field	Type	Required	Description
`title`	string	yes	Display heading
`description`	string	no	Explanation shown below the title
`type`	string	no	One of: `chart` (default), `table`, `heatmap`, `gauge`, `text`
`width`	int	no	Grid column span
`height`	int	no	Grid section span
`x_axis`	int	no	Horizontal grid position
`y_axis`	int	no	Vertical grid position

Chart

Charts are the default grid item type. They render time-series data as line, bar, or scatter plots.

yaml

- title: CPU time
  # type: chart (default, can be omitted)
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  fill_opacity: 0.1

Chart-specific fields

Field	Type	Description
`metrics`	list	MQL metric expressions
`query`	list	MQL query clauses
`chart`	string	Chart type: `line` (default), `bar`, `scatter`
`fill_opacity`	float	Area fill opacity, 0 to 1
`stack`	string	Stacking mode: `""` (none) or `"all"`
`columns`	map	Per-metric display settings with `unit` and `color`
`legend`	object	Legend configuration (see below)
`properties`	list	Visual properties (see Visual properties)
`overrides`	list	Per-timeseries overrides (see Timeseries overrides)

Legend configuration

yaml

legend:
  type: table     # "none", "list", or "table"
  placement: bottom # "bottom" or "right"
  values: [avg, min, max, last]
  items_per_page: 10
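
To show the chart-specific fields and the legend in context, here is a minimal sketch of a complete chart item. It extends the RAM usage example from the grid dashboard section; the chosen legend settings and the mem_usage alias are illustrative assumptions.

```yaml
- title: RAM usage
  metrics:
    - system_memory_usage as $mem_usage
  query:
    - sum($mem_usage) as mem_usage group by state
  chart: line          # line (default), bar, or scatter
  fill_opacity: 0.1    # light area fill under each line
  columns:
    mem_usage: { unit: bytes }
  legend:
    type: list         # "none", "list", or "table"
    placement: right   # "bottom" or "right"
    values: [avg, last]
```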

Table

Table grid items render query results in a tabular format.

yaml

- title: Slowest groups
  type: table
  metrics:
    - uptrace_tracing_spans as $spans
  query:
    - group by _group_id
    - group by _system
    - p50($spans)

Table-specific fields

Field	Type	Description
`metrics`	list	MQL metric expressions
`query`	list	MQL query clauses
`columns`	map	Per-column display settings (same as table columns)
`overrides`	list	Per-column visual property overrides

Heatmap

Heatmaps visualize the distribution of a single histogram metric over time.

yaml

- title: Span duration heatmap
  type: heatmap
  metric: uptrace_tracing_spans
  unit: milliseconds

Heatmap-specific fields

Field	Type	Description
`metric`	string	Single metric name (not an alias)
`unit`	string	Display unit
`query`	list	Optional MQL query clauses for filtering
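
The optional query field might be used to narrow a heatmap to a subset of the data, as in the sketch below. The where clause follows the filtering pattern described under Metrics and queries; the attribute value "checkout" is hypothetical, and whether service_name is available on this metric is an assumption.

```yaml
- title: Span duration heatmap (single service)
  type: heatmap
  metric: uptrace_tracing_spans
  unit: milliseconds
  query:
    # Hypothetical filter value, for illustration only
    - where service_name = "checkout"
```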

Gauge

Gauges display a single aggregated value, optionally with value mappings for status indicators.

yaml

- title: Status
  type: gauge
  metrics:
    - httpcheck_status as $status
  query:
    - sum($status{http_status_class="2xx"})
  value_mappings:
    - op: gte
      value: 1
      text: UP
      color: green
    - op: eq
      value: 0
      text: DOWN
      color: red
    - op: any
      text: UNKNOWN
      color: gray

Gauge-specific fields

Field	Type	Description
`metrics`	list	MQL metric expressions
`query`	list	MQL query clauses
`columns`	map	Per-column settings with `unit` and `agg_func`
`overrides`	list	Per-column visual property overrides
`value_mappings`	list	Maps values to labels and colors

Value mappings

Value mappings are evaluated in order. The first matching rule wins.

Field	Type	Description
`op`	string	Comparison operator: `any`, `eq`, `lt`, `lte`, `gt`, `gte`
`value`	number	Threshold value (not required for `any`)
`text`	string	Display label
`color`	string	Display color
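
Because the first matching rule wins, rules should be ordered from most specific to least specific. The sketch below illustrates this with two gte thresholds followed by a fallback. The thresholds, labels, and the color name yellow are assumptions, and the example presumes the mapping compares against the raw query value (a fraction between 0 and 1).

```yaml
- title: Memory utilization
  type: gauge
  metrics:
    - system_memory_usage as $mem_usage
  query:
    - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
  value_mappings:
    # Checked first: values at or above 0.9
    - op: gte
      value: 0.9
      text: CRITICAL
      color: red
    # Only reached when the value is below 0.9
    - op: gte
      value: 0.7
      text: WARNING
      color: yellow
    # Fallback for every remaining value
    - op: any
      text: OK
      color: green
```

If the two gte rules were swapped, a value of 0.95 would match the 0.7 threshold first and be labeled WARNING.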

Text

Text items render a Go template string with metric data, useful for summary counts and labels.

yaml

- title: Host count
  type: text
  metrics:
    - process_runtime_go_goroutines as $goroutines
  query:
    - uniq($goroutines, host_name) as num_host
  text: ${num_host} hosts

Text-specific fields

Field	Type	Description
`metrics`	list	MQL metric expressions
`query`	list	MQL query clauses
`text`	string	Template string using `${column_name}` placeholders
`columns`	map	Per-column settings with `unit` and `agg_func`
`overrides`	list	Per-column visual property overrides

Visual properties

Visual properties control chart appearance. They can be set via shorthand fields (chart, fill_opacity, stack) or the properties list.

yaml

properties:
  - name: chartType
    value: bar
  - name: fillOpacity
    value: 0.3
  - name: stack
    value: all

| Property | Type | Values or description |
| --- | --- | --- |
| chartType | string | line, bar, scatter |
| stack | string | "" (off), all |
| connectNulls | boolean | true, false |
| lineWidth | number | Line thickness |
| fillOpacity | number | 0 to 1 |
| symbolSize | number | Data point symbol size |
| symbol | string | Symbol name |
| color | string | Color name |
| colorScheme | string | Color scheme name |
| unit | string | Display unit |
| aggFunc | string | Aggregation function |
| columnDisplay | object | `{mode: "value" \| "sparkline" \| "bar", min?, max?}` |
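
As a rough illustration of the shorthand route, the chart item below sets the same three values as the properties list above through the chart, fill_opacity, and stack fields. The one-to-one mapping to chartType, fillOpacity, and stack is inferred from the field names and should be treated as an assumption.

```yaml
- title: CPU time
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  # Shorthand fields, assumed to correspond to chartType, fillOpacity, and stack
  chart: bar
  fill_opacity: 0.3
  stack: all
```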

Timeseries overrides

Chart items support per-timeseries overrides that apply visual properties to specific series:

yaml

overrides:
  - matchers:
      - target: metric   # "metric" or "timeseries"
        value: cpu_idle
    properties:
      - name: unit
        value: utilization
      - name: color
        value: blue

The target field specifies what to match:

metric -- matches by metric/column name
timeseries -- matches by timeseries label

Metric monitors

The metric_monitors section bundles alert monitors with the dashboard:

yaml

metric_monitors:
  - key: cpu_usage
    name: CPU usage
    metrics:
      - system_cpu_load_average_15m as $load_avg_15m
      - system_cpu_time as $cpu_time
    query:
      - avg($load_avg_15m) / uniq($cpu_time, cpu) as cpu_util
      - group by host_name
    column:
      unit: utilization
    detector:
      type: manual
      max_value: 3
    num_eval_points: 10
    trend_agg_func: last

Monitor fields

Field	Type	Required	Description
`key`	string	yes	Unique monitor identifier within the template
`name`	string	yes	Human-readable title
`metrics`	list	yes	MQL metric expressions
`query`	list	yes	MQL query clauses
`status`	string	no	`active` (default) or `paused`
`column`	object	no	Column to evaluate: `{name: "col", unit: "ms"}`
`detector`	object	no	Detector config (see below)
`num_eval_points`	int	no	Number of recent data points to evaluate
`trend_agg_func`	string	no	Aggregation function for trend detection (e.g., `last`)
`trend_sensitivity`	string	no	Sensitivity of trend detection
`resolution`	duration	no	Data point interval
`absent_points`	string	no	Missing data handling: `alert` to alert on missing data
`time_offset`	duration	no	Shifts the evaluation window
`notify_everyone_by_email`	boolean	no	Email all project members on alert
`repeat_interval`	object	no	Notification repeat interval configuration

Manual detector

yaml

detector:
  type: manual
  min_value: 100       # lower threshold (at least one of min/max required)
  max_value: 1000      # upper threshold
  recovery:            # optional: conditions to resolve the alert
    min_value: 90
    max_value: 1100

Auto detector

yaml

detector:
  type: auto
  tolerance: medium       # low, medium, or high
  training_period: 24h    # historical data window
  min_dev_fraction: 0.2   # optional: min deviation as fraction of expected value
  min_dev_absolute: 10    # optional: min absolute deviation
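
A monitor that uses the auto detector could look like the sketch below. It uses only fields from the monitor table above; the key, name, and query are hypothetical, and starting in the paused state is just one possible choice.

```yaml
metric_monitors:
  - key: cpu_time_anomaly        # hypothetical identifier
    name: CPU time anomaly
    status: paused               # active (default) or paused
    metrics:
      - system_cpu_time as $cpu_time
    query:
      - perMin(sum($cpu_time)) as cpu_time
      - group by host_name
    column:
      name: cpu_time             # column to evaluate
    detector:
      type: auto
      tolerance: medium
      training_period: 24h
    absent_points: alert         # alert when data is missing
    notify_everyone_by_email: true
```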

Template composition

Templates can include grid sections from other templates using include_grid_sections. This enables modular dashboard composition:

yaml

# Parent: uptrace.dotnet.10.all.yml
schema: v2
name: '.NET: All'
tags: [otel, app]
version: v25.04.20

table:
  metrics:
    - process_runtime_dotnet_gc_heap_size as $heap_size
  query:
    - group by service_name
    - sum($heap_size) as heap_size

include_grid_sections:
  - uptrace.dotnet.20.gc
  - uptrace.dotnet.30.runtime
  - uptrace.dotnet.40.thread_pool

Each entry references another template by filename (without .yml). The child template's grid_sections are appended to the parent's.
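
A child template referenced this way might look like the following sketch. The file contents are assumed for illustration: only the filename comes from the parent example, and the section title, item, and alias are hypothetical.

```yaml
# Child: uptrace.dotnet.20.gc.yml (contents assumed for illustration)
schema: v2
name: '.NET: GC'
tags: [otel, app]
version: v25.04.20
setup_link: https://example.com/setup

grid_sections:
  - title: Garbage collection
    items:
      - title: GC heap size
        metrics:
          - process_runtime_dotnet_gc_heap_size as $heap_size
        query:
          - sum($heap_size) as heap_size
        columns:
          heap_size: { unit: bytes }
```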

Metric requirements

The require_metrics field controls when a template is activated. It appears at the top level of the template:

yaml

require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time

Field	Type	Description
`library`	string	Instrumentation library name (optional)
`metric`	string	Metric name that must exist

Template ID and filename

The template ID is derived from the filename:

uptrace.hostmetrics.10.overview.yml becomes ID uptrace.hostmetrics.overview
The numeric segment (e.g., 10) is stripped from the ID and used as the display priority

Validation

Templates are validated in two ways:

YAML parsing with DisallowUnknownField catches typos and unknown keys.
JSON Schema validation against schema.json ensures structural correctness.

To validate templates:

bash

go run cmd/cloud/main.go dashboard validate_templates

Full example

yaml

schema: v2
name: 'HTTP Check: Endpoints'
description: Monitors HTTP endpoint availability and response times.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup

table_grid_items:
  - title: Successful checks
    type: text
    text: ${num_up} out of ${num_all}
    metrics:
      - httpcheck_status as $status
    query:
      - uniq($status{http_status_class="2xx"}) as num_all
      - uniq($status{http_status_class="2xx", _value=1}) as num_up

table:
  metrics:
    - httpcheck_status as $status
    - httpcheck_duration as $duration
  query:
    - group by http_url
    - group by host_name
    - sum($status{http_status_class="2xx"}) / sum($status) as availability
    - avg($duration)
  columns:
    availability: { unit: utilization, agg_func: avg }

grid_sections:
  - title: Gauges
    items:
      - title: Status
        type: gauge
        metrics:
          - httpcheck_status as $status
        query:
          - sum($status{http_status_class="2xx"})
        value_mappings:
          - op: gte
            value: 1
            text: UP
            color: green
          - op: eq
            value: 0
            text: DOWN
            color: red
          - op: any
            text: UNKNOWN
            color: gray

  - title: General
    items:
      - title: HTTP check result
        metrics:
          - httpcheck_status as $status
        query:
          - $status group by http_status_code

      - title: HTTP check duration
        metrics:
          - httpcheck_duration as $duration
        query:
          - avg($duration)

      - title: Span duration heatmap
        type: heatmap
        metric: uptrace_tracing_spans
        unit: milliseconds

metric_monitors:
  - key: http_check_is_down
    name: HTTP check is down
    metrics:
      - httpcheck_status as $status
    query:
      - sum($status{http_status_class="2xx"}) as status_2xx
      - group by http_url
      - group by host_name
    detector:
      type: manual
      min_value: 1
      max_value: 1
    num_eval_points: 1
    absent_points: alert
    trend_agg_func: last