Dashboard YAML Templates

Uptrace dashboards can be defined as YAML templates. Templates are used for built-in dashboards that ship with Uptrace and for importing/exporting user-created dashboards.

When Uptrace receives new metrics, it checks available templates and automatically creates matching dashboards.

Template structure

Every template starts with a schema: v2 header followed by metadata, then the optional table_grid_items, table, grid_rows, and metric_monitors sections.

yaml
schema: v2
name: 'Hostmetrics: Overview'
description: Tracks host CPU, memory, and system load.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup
doc_link: https://example.com/docs

# Optional: minimum query interval
min_interval: 1m

# Optional: shift the query time window
time_offset: 5m

# Optional: only apply when these metrics exist
require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time

table_grid_items:
  # ... summary widgets above the table

table:
  # ... table dashboard definition

grid_rows:
  # ... grid dashboard rows

metric_monitors:
  # ... bundled alert monitors

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema | string | yes | Must be "v2" |
| name | string | yes | Human-readable dashboard title |
| description | string | no | Summary of the dashboard's purpose |
| tags | list | yes | Categorization labels (see Tags) |
| version | string | yes | Revision in vYY.MM.DD format (e.g., v25.04.20) |
| setup_link | string | yes | URL to instrumentation/setup instructions |
| doc_link | string | no | URL to documentation |
| min_interval | duration | no | Minimum query interval (e.g., 1m, 5m) |
| time_offset | duration | no | Shifts query time window (e.g., 5m) |
| require_metrics | list | no | Metrics that must exist before the template is applied |
| table_grid_items | list | no | Summary widgets shown above the table |
| table | object | no | Table dashboard definition |
| grid_query | string | no | MQL query for grid variables |
| grid_variables | list | no | Variable names extracted from grid_query |
| include_grid_rows | list | no | References to other templates (see Composition) |
| grid_rows | list | no | Grid row definitions |
| metric_monitors | list | no | Metric monitors bundled with the template |

Metrics and queries

Metrics and queries appear throughout the template (tables, grid items, monitors). They follow the same pattern everywhere.

Metrics

The metrics field declares metric aliases using MQL syntax:

yaml
metrics:
  - system_cpu_utilization as $cpu_util
  - system_memory_usage as $mem_usage

Each entry takes the form metric_name as $alias. The alias (prefixed with $) is used in queries.

Query

The query field is a list of MQL clauses. Splitting the query across list entries improves readability; the clauses are joined into a single query during processing.

yaml
query:
  - group by host_name
  - avg($cpu_util) as cpu_avg
  - sum($mem_usage{state="used"}) as mem_used
  - where device !~ "loop"

Common clause patterns:

  • Aggregations: avg($x), sum($x), max($x), min($x), p50($x), p75($x), p90($x), p99($x)
  • Rate functions: perMin(sum($x)), perSec(sum($x))
  • Counting: histogram_count($x), uniq($x, attr_name)
  • Grouping: group by attr_name, group by attr1, attr2
  • Filtering: where attr = "value", where attr !~ "pattern"
  • Aliasing: sum($x) as my_alias
  • Inline grouping: avg($x) as y group by state
  • Attribute filtering: $metric{state="used"}, $metric{direction=read}
  • Arithmetic: sum($a) / sum($b) as ratio
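
As a rough illustration, several of the patterns above can be combined in one metrics/query pair. This is a sketch that reuses metric names from this page; the mem_util and cpu_time aliases and the "test" pattern are illustrative assumptions rather than part of any shipped template.

```yaml
metrics:
  - system_memory_usage as $mem_usage
  - system_cpu_time as $cpu_time
query:
  # Grouping: one result row per host
  - group by host_name
  # Attribute filtering + arithmetic + aliasing
  - sum($mem_usage{state="used"}) / sum($mem_usage) as mem_util
  # Rate function wrapped around an aggregation
  - perMin(sum($cpu_time)) as cpu_time
  # Filtering: drop hosts whose name matches the pattern (hypothetical value)
  - where host_name !~ "test"
```

Each list entry is one clause, so clauses can be added or removed independently without rewriting the rest of the query.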

Table dashboards

The table section defines a table view where each row leads to the grid dashboard filtered by the row's group-by attributes.

yaml
table:
  metrics:
    - system_cpu_utilization as $cpu_util
    - system_memory_usage as $mem_usage
  query:
    - group by host_name
    - avg($cpu_util) as cpu_util
    - sum($mem_usage{state="used"}) as mem_used
  columns:
    cpu_util: { unit: utilization }
    mem_used: { unit: bytes }
  variables: [deployment_environment, service_name]

Table fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| metrics | list | yes | MQL metric expressions |
| query | list | yes | MQL query clauses |
| columns | map | no | Per-column display settings (shorthand format) |
| overrides | list | no | Per-column visual property overrides (structured format) |
| variables | list | no | Query variable names for parameterized filtering |

Column settings (shorthand)

The columns map provides a compact syntax for column formatting:

yaml
columns:
  cpu_util: { unit: utilization }
  mem_used: { unit: bytes, color: red }
  availability: { unit: utilization, agg_func: avg }
  max_latency: { unit: milliseconds, agg_func: max, display: bar }

Available column properties:

| Property | Description |
| --- | --- |
| unit | Display unit (e.g., bytes, utilization, seconds, nanoseconds, milliseconds, 1, or custom like span/min) |
| color | Display color |
| agg_func | Aggregation function (e.g., sum, avg, max, last) |
| display | Column display mode: value (default), sparkline, or bar. Can also be a map: {mode: bar, min: 0, max: 1} |
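
The map form of display has no example of its own, so here is a minimal sketch of both forms side by side. The column names are taken from the shorthand example above; choosing 0 and 1 as the bar bounds is an assumption that suits a ratio-valued column.

```yaml
columns:
  # String form: only the display mode is set
  cpu_util: { unit: utilization, display: sparkline }
  # Map form: display mode plus explicit bounds for the bar
  availability:
    unit: utilization
    agg_func: avg
    display: { mode: bar, min: 0, max: 1 }
```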

Column overrides (structured)

The overrides format provides more control:

yaml
overrides:
  - column: cpu_util
    properties:
      - name: unit
        value: utilization
      - name: color
        value: red

Table grid items

table_grid_items are summary widgets displayed above the main table. They support the same grid item types as grid_rows items (gauge, text, chart, table, heatmap).

yaml
table_grid_items:
  - title: Number of hosts
    type: text
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - uniq($mem_usage, host_name) as num_host
    text: ${num_host} hosts

  - title: Memory utilization
    type: gauge
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
    columns:
      mem_util: { unit: utilization }

Grid dashboards

The grid_rows section defines a classic grid of charts organized in collapsible rows.

yaml
grid_rows:
  - title: General
    items:
      - title: CPU utilization
        metrics:
          - system_cpu_utilization as $cpu_util
        query:
          - avg($cpu_util)

      - title: RAM usage
        metrics:
          - system_memory_usage as $mem_usage
        query:
          - sum($mem_usage) group by state
        columns:
          mem_usage: { unit: bytes }
        fill_opacity: 0.1

Each row has a title and a list of items.

Grid item types

Grid items are the building blocks of both grid_rows and table_grid_items. The type field determines the kind of widget. When type is omitted, it defaults to chart.

Common fields

All grid item types share these fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | yes | Display heading |
| description | string | no | Explanation shown below the title |
| type | string | no | One of: chart (default), table, heatmap, gauge, text |
| width | int | no | Grid column span |
| height | int | no | Grid row span |
| x_axis | int | no | Horizontal grid position |
| y_axis | int | no | Vertical grid position |
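
To show how the common fields fit together, the sketch below sizes and positions a chart explicitly. The numeric values are illustrative assumptions only: this page does not state the grid's dimensions or units, so treat them as placeholders to adjust.

```yaml
- title: CPU utilization
  description: Average CPU utilization across all hosts
  type: chart          # optional, chart is the default
  width: 6             # grid column span (assumed value)
  height: 4            # grid row span (assumed value)
  x_axis: 0            # horizontal grid position (assumed value)
  y_axis: 0            # vertical grid position (assumed value)
  metrics:
    - system_cpu_utilization as $cpu_util
  query:
    - avg($cpu_util) as cpu_util
  columns:
    cpu_util: { unit: utilization }
```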

Chart

Charts are the default grid item type. They render time-series data as line, bar, or scatter plots.

yaml
- title: CPU time
  # type: chart (default, can be omitted)
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  fill_opacity: 0.1

Chart-specific fields

| Field | Type | Description |
| --- | --- | --- |
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| chart | string | Chart type: line (default), bar, scatter |
| fill_opacity | float | Area fill opacity, 0 to 1 |
| stack | string | Stacking mode: "" (none) or "all" |
| columns | map | Per-metric display settings with unit and color |
| legend | object | Legend configuration (see below) |
| properties | list | Visual properties (see Visual properties) |
| overrides | list | Per-timeseries overrides (see Timeseries overrides) |
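
A minimal sketch of the chart and stack shorthand fields, which the example above does not use. It reuses the CPU time query and turns it into a stacked bar chart; the choice of seconds as the display unit is an assumption about how this metric is reported.

```yaml
- title: CPU time by state
  metrics:
    - system_cpu_time as $cpu_time
  query:
    - perMin(sum($cpu_time)) as cpu_time group by state
  chart: bar           # line is the default
  stack: all           # stack every series on top of the others
  fill_opacity: 0.3
  columns:
    cpu_time: { unit: seconds }   # assumed unit
```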

Legend configuration

yaml
legend:
  type: table     # "none", "list", or "table"
  placement: bottom # "bottom" or "right"
  values: [avg, min, max, last]
  items_per_page: 10

Table

Table grid items render query results in a tabular format.

yaml
- title: Slowest groups
  type: table
  metrics:
    - uptrace_tracing_spans as $spans
  query:
    - group by _group_id
    - group by _system
    - p50($spans)

Table-specific fields

| Field | Type | Description |
| --- | --- | --- |
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| columns | map | Per-column display settings (same as table columns) |
| overrides | list | Per-column visual property overrides |

Heatmap

Heatmaps visualize the distribution of a single histogram metric over time.

yaml
- title: Span duration heatmap
  type: heatmap
  metric: uptrace_tracing_spans
  unit: milliseconds

Heatmap-specific fields

| Field | Type | Description |
| --- | --- | --- |
| metric | string | Single metric name (not an alias) |
| unit | string | Display unit |
| query | list | Optional MQL query clauses for filtering |
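
The optional query field is not shown in the heatmap example above. A sketch of how a filter might look follows, assuming the metric carries a service_name attribute; "checkout" is a hypothetical service name.

```yaml
- title: Span duration heatmap (single service)
  type: heatmap
  metric: uptrace_tracing_spans   # plain metric name, no "as $alias"
  unit: milliseconds
  query:
    # Restrict the heatmap to one service (hypothetical value)
    - where service_name = "checkout"
```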

Gauge

Gauges display a single aggregated value, optionally with value mappings for status indicators.

yaml
- title: Status
  type: gauge
  metrics:
    - httpcheck_status as $status
  query:
    - sum($status{http_status_class="2xx"})
  value_mappings:
    - op: gte
      value: 1
      text: UP
      color: green
    - op: eq
      value: 0
      text: DOWN
      color: red
    - op: any
      text: UNKNOWN
      color: gray

Gauge-specific fields

| Field | Type | Description |
| --- | --- | --- |
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| columns | map | Per-column settings with unit and agg_func |
| overrides | list | Per-column visual property overrides |
| value_mappings | list | Maps values to labels and colors |

Value mappings

Value mappings are evaluated in order. The first matching rule wins.

| Field | Type | Description |
| --- | --- | --- |
| op | string | Comparison operator: any, eq, lt, lte, gt, gte |
| value | number | Threshold value (not required for any) |
| text | string | Display label |
| color | string | Display color |
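
Because the first matching rule wins, overlapping thresholds need to be listed from most specific to least specific. The sketch below uses hypothetical thresholds and labels for a ratio-valued gauge to show the ordering.

```yaml
value_mappings:
  # 0.95 stops here: it is >= 0.9
  - op: gte
    value: 0.9
    text: HIGH
    color: red
  # 0.8 falls through the first rule and stops here
  - op: gte
    value: 0.7
    text: ELEVATED
    color: blue
  # Everything else, including 0.2, lands on the catch-all
  - op: any
    text: OK
    color: green
```

If the first two rules were swapped, 0.95 would match gte 0.7 first and be labeled ELEVATED, so the HIGH rule would never fire.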

Text

Text items render a Go template string with metric data, useful for summary counts and labels.

yaml
- title: Host count
  type: text
  metrics:
    - process_runtime_go_goroutines as $goroutines
  query:
    - uniq($goroutines, host_name) as num_host
  text: ${num_host} hosts

Text-specific fields

| Field | Type | Description |
| --- | --- | --- |
| metrics | list | MQL metric expressions |
| query | list | MQL query clauses |
| text | string | Template string using ${column_name} placeholders |
| columns | map | Per-column settings with unit and agg_func |
| overrides | list | Per-column visual property overrides |

Visual properties

Visual properties control chart appearance. They can be set via shorthand fields (chart, fill_opacity, stack) or the properties list.

yaml
properties:
  - name: chartType
    value: bar
  - name: fillOpacity
    value: 0.3
  - name: stack
    value: all

| Property | Type | Values |
| --------------- | ------- | ------------------------ |
| chartType | string | line, bar, scatter |
| stack | string | "" (off), all |
| connectNulls | boolean | true, false |
| lineWidth | number | Line thickness |
| fillOpacity | number | 0 to 1 |
| symbolSize | number | Data point symbol size |
| symbol | string | Symbol name |
| color | string | Color name |
| colorScheme | string | Color scheme name |
| unit | string | Display unit |
| aggFunc | string | Aggregation function |
| columnDisplay | object | {mode: "value" \| "sparkline" \| "bar", min?, max?} |
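
Some properties in the table, such as connectNulls and lineWidth, have no shorthand field listed on this page, so the properties list appears to be the way to set them. A minimal sketch on a chart item, with illustrative values:

```yaml
- title: CPU utilization
  metrics:
    - system_cpu_utilization as $cpu_util
  query:
    - avg($cpu_util) as cpu_util
  properties:
    - name: connectNulls
      value: true      # draw the line across gaps in the data
    - name: lineWidth
      value: 2         # assumed thickness value
    - name: unit
      value: utilization
```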

Timeseries overrides

Chart items support per-timeseries overrides that apply visual properties to specific series:

yaml
overrides:
  - matchers:
      - target: metric   # "metric" or "timeseries"
        value: cpu_idle
    properties:
      - name: unit
        value: utilization
      - name: color
        value: blue

The target field specifies what to match:

  • metric -- matches by metric/column name
  • timeseries -- matches by timeseries label
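
In context, a metric matcher refers to a column name produced by the query. The sketch below ties two overrides to two aliases on one chart; the idle and user state values are assumptions about the attribute values this metric reports.

```yaml
- title: CPU utilization by state
  metrics:
    - system_cpu_utilization as $cpu_util
  query:
    - avg($cpu_util{state="idle"}) as cpu_idle
    - avg($cpu_util{state="user"}) as cpu_user
  overrides:
    # Matches the cpu_idle column defined in the query
    - matchers:
        - target: metric
          value: cpu_idle
      properties:
        - name: color
          value: gray
    # Matches the cpu_user column and makes it stand out
    - matchers:
        - target: metric
          value: cpu_user
      properties:
        - name: color
          value: red
        - name: lineWidth
          value: 2
```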

Metric monitors

The metric_monitors section bundles alert monitors with the dashboard:

yaml
metric_monitors:
  - key: cpu_usage
    name: CPU usage
    metrics:
      - system_cpu_load_average_15m as $load_avg_15m
      - system_cpu_time as $cpu_time
    query:
      - avg($load_avg_15m) / uniq($cpu_time, cpu) as cpu_util
      - group by host_name
    column_unit: utilization
    max_allowed_value: 3
    num_eval_points: 10
    trend_agg_func: last

Monitor fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key | string | yes | Unique monitor identifier within the template |
| name | string | yes | Human-readable title |
| metrics | list | yes | MQL metric expressions |
| query | list | yes | MQL query clauses |
| status | string | no | active (default) or paused |
| column | string | no | Metric column to evaluate |
| column_unit | string | no | Display unit for the column |
| min_allowed_value | number | no | Lower threshold -- values below this trigger an alert |
| max_allowed_value | number | no | Upper threshold -- values above this trigger an alert |
| num_eval_points | int | no | Number of recent data points to evaluate |
| trend_agg_func | string | no | Aggregation function for trend detection (e.g., last) |
| trend_sensitivity | string | no | Sensitivity of trend detection |
| bounds_source | string | no | How anomaly bounds are calculated |
| resolution | duration | no | Data point interval |
| absent_points | string | no | How missing data is handled; set to alert to trigger an alert when data points are missing |
| time_offset | duration | no | Shifts the evaluation window |
| notify_everyone_by_email | boolean | no | Email all project members on alert |
| repeat_interval | object | no | Notification repeat interval configuration |
| flapping | object | no | Flapping detection parameters |
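
Several optional fields from the table are not used in the example above. The following sketch shows a second monitor that sets status, column, and absent_points; the key, thresholds, and the decision to ship it paused are illustrative assumptions.

```yaml
metric_monitors:
  - key: memory_utilization
    name: Memory utilization
    status: paused                 # bundled but not evaluated until activated
    metrics:
      - system_memory_usage as $mem_usage
    query:
      - sum($mem_usage{state!="free"}) / sum($mem_usage) as mem_util
      - group by host_name
    column: mem_util               # the column the thresholds apply to
    column_unit: utilization
    max_allowed_value: 0.9         # assumed threshold: alert above 90%
    num_eval_points: 5
    absent_points: alert           # also alert when data is missing
    notify_everyone_by_email: true
```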

Template composition

Templates can include grid rows from other templates using include_grid_rows. This enables modular dashboard composition:

yaml
# Parent: uptrace.dotnet.10.all.yml
schema: v2
name: '.NET: All'
tags: [otel, app]
version: v25.04.20

table:
  metrics:
    - process_runtime_dotnet_gc_heap_size as $heap_size
  query:
    - group by service_name
    - sum($heap_size) as heap_size

include_grid_rows:
  - uptrace.dotnet.20.gc
  - uptrace.dotnet.30.runtime
  - uptrace.dotnet.40.thread_pool

Each entry references another template by filename (without .yml). The child template's grid_rows are appended to the parent's.
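
For completeness, a child template referenced above might look like the sketch below. Its contents are hypothetical: only the filename comes from the parent example, and the row, item, and setup link are placeholders.

```yaml
# Child: uptrace.dotnet.20.gc.yml (hypothetical contents)
schema: v2
name: '.NET: GC'
tags: [otel, app]
version: v25.04.20
setup_link: https://example.com/setup

# These rows are what the parent pulls in via include_grid_rows
grid_rows:
  - title: Garbage collection
    items:
      - title: GC heap size
        metrics:
          - process_runtime_dotnet_gc_heap_size as $heap_size
        query:
          - sum($heap_size) as heap_size group by service_name
        columns:
          heap_size: { unit: bytes }
```

Keeping each group of rows in its own file lets several parent templates reuse the same rows without copying them.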

Metric requirements

The require_metrics field controls when a template is activated. It appears at the top level of the template:

yaml
require_metrics:
  - metric: system_cpu_utilization
  - library: io.opentelemetry.instrumentation.system
    metric: system_cpu_time

| Field | Type | Description |
| --- | --- | --- |
| library | string | Instrumentation library name (optional) |
| metric | string | Metric name that must exist |

Tags

Tags categorize dashboards for filtering. Use one data source tag and 1-3 category tags.

Data source tags

  • otel -- OpenTelemetry metrics
  • prom -- Prometheus metrics

Category tags

  • infra -- System resources (CPU, RAM, disk, network)
  • network -- Network-specific metrics
  • db -- Database systems (PostgreSQL, MySQL, Redis)
  • app -- Application runtimes (Go, .NET, Java, JVM)
  • k8s -- Kubernetes resources
  • tracing -- Distributed tracing and spans
  • logs -- Log aggregation
  • messaging -- Message queues (Kafka)
  • self_monitoring -- Uptrace internal metrics (no data source tag needed)

yaml
tags: [otel, infra]         # Host metrics
tags: [otel, app, db]       # Go SQL client
tags: [otel, k8s, infra]    # Kubernetes infrastructure
tags: [self_monitoring]      # Uptrace self-monitoring

Template ID and filename

The template ID is derived from the filename:

  • uptrace.hostmetrics.10.overview.yml becomes ID uptrace.hostmetrics.overview
  • The numeric segment (10 in this example) is stripped from the ID and used as the display priority

Validation

Templates are validated in two ways:

  1. YAML parsing with DisallowUnknownField catches typos and unknown keys.
  2. JSON Schema validation against schema.json ensures structural correctness.

To validate templates:

bash
go run cmd/cloud/main.go dashboard validate_templates

Full example

yaml
schema: v2
name: 'HTTP Check: Endpoints'
description: Monitors HTTP endpoint availability and response times.
tags: [otel, infra]
version: v25.04.20
setup_link: https://example.com/setup

table_grid_items:
  - title: Successful checks
    type: text
    text: ${num_up} out of ${num_all}
    metrics:
      - httpcheck_status as $status
    query:
      - uniq($status{http_status_class="2xx"}) as num_all
      - uniq($status{http_status_class="2xx", _value=1}) as num_up

table:
  metrics:
    - httpcheck_status as $status
    - httpcheck_duration as $duration
  query:
    - group by http_url
    - group by host_name
    - sum($status{http_status_class="2xx"}) / sum($status) as availability
    - avg($duration)
  columns:
    availability: { unit: utilization, agg_func: avg }

grid_rows:
  - title: Gauges
    items:
      - title: Status
        type: gauge
        metrics:
          - httpcheck_status as $status
        query:
          - sum($status{http_status_class="2xx"})
        value_mappings:
          - op: gte
            value: 1
            text: UP
            color: green
          - op: eq
            value: 0
            text: DOWN
            color: red
          - op: any
            text: UNKNOWN
            color: gray

  - title: General
    items:
      - title: HTTP check result
        metrics:
          - httpcheck_status as $status
        query:
          - $status group by http_status_code

      - title: HTTP check duration
        metrics:
          - httpcheck_duration as $duration
        query:
          - avg($duration)

      - title: Span duration heatmap
        type: heatmap
        metric: uptrace_tracing_spans
        unit: milliseconds

metric_monitors:
  - key: http_check_is_down
    name: HTTP check is down
    metrics:
      - httpcheck_status as $status
    query:
      - sum($status{http_status_class="2xx"}) as status_2xx
      - group by http_url
      - group by host_name
    min_allowed_value: 1
    max_allowed_value: 1
    num_eval_points: 1
    absent_points: alert
    trend_agg_func: last