Error monitors

An error monitor watches incoming log records and fires an alert when a new matching pattern appears. Unlike metric monitors, which evaluate aggregates over time, error monitors react to individual events. That makes them well suited to catching exceptions, fatal errors, and unexpected patterns the moment they occur.

Uptrace automatically creates a default error monitor for logs with ERROR and FATAL severity. You can customize its filters, add grouping dimensions, or create additional monitors for other patterns.

How it works

When a log record matches the monitor's query, Uptrace checks whether an alert already exists for that pattern. Patterns are determined by the same log grouping logic as the Logs explorer, so similar messages share one alert. If no alert exists, Uptrace creates one and sends a notification. If the same pattern fires again, Uptrace applies the notification frequency rules to decide whether to send a reminder.

When the pattern stops appearing, the alert is automatically closed and a recovery notification is sent.

Examples

All errors and fatal logs:

```yaml
monitors:
  - name: Notify on all errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
```

Errors matching a specific message:

```yaml
monitors:
  - name: Notify on "timeout" errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where _display_name contains "timeout"
```

Exceptions only:

```yaml
monitors:
  - name: Exceptions
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where exception_type exists
```

All environments except dev:

```yaml
monitors:
  - name: Notify on all errors except in "dev" environment
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - group by deployment_environment
      - where _system in ("log:error", "log:fatal")
      - where deployment_environment != "dev"
```

Query attributes

| Attribute | Description |
| --- | --- |
| _group_id | Groups similar log records together. Use group by _group_id to create one alert per unique error pattern. |
| _display_name | The log message or exception message. Use contains to match a substring. |
| _system | The signal type: log:error, log:fatal, log:warn, etc. |
| exception_type | Set on log records that carry an exception. Use exists to filter exception-only logs. |
| deployment_environment | The environment attribute, typically prod, staging, or dev. |
| service_name | The service that emitted the log. |

Any indexed span attribute can be used in where and group by clauses.
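
As an illustration of combining these attributes, the sketch below scopes an error monitor to two services and creates a separate alert per service and error pattern. The service names checkout and payments are hypothetical placeholders, and the example assumes service_name is indexed in your project. The clauses reuse only the operators that appear in the examples above.

```yaml
monitors:
  # Hypothetical monitor: the service names are placeholders for illustration.
  - name: Errors in checkout and payments
    type: error
    notify_everyone_by_email: true
    query:
      # One alert per error pattern, split further by service.
      - group by _group_id
      - group by service_name
      - where _system in ("log:error", "log:fatal")
      # Assumes service_name is an indexed attribute on these log records.
      - where service_name in ("checkout", "payments")
```

Grouping by service_name in addition to _group_id follows the same approach as the deployment_environment example: the same error message raised by two services produces two alerts rather than one.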

Alert names

For metric monitors, Uptrace generates alert names using the monitor name and timeseries name, for example, "Disk usage: myhost+mydisk".

For error monitors, Uptrace generates alert names using the error (log) message, for example, "ERROR *fmt.wrapError: writeError failed".

You can customize alert names by specifying a Go template string as the monitor name when creating a monitor. For example, {{ .Attrs.deployment_environment_name }}: {{ .DisplayName }} prefixes the alert name with the deployment environment attribute.

You can use the following variables in templates:

| Variable | Type | Description |
| --- | --- | --- |
| {{ .DisplayName }} | string | Same as _display_name when querying spans and logs. |
| {{ .Attrs }} | map[string]any | All available attributes, for example, {{ .Attrs.service_name }}. |
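
To show how these variables might be used in a YAML-defined monitor, here is a minimal sketch. It assumes the name field of a YAML monitor accepts the same Go template string described above, which this page does not confirm explicitly. The single quotes are needed because a YAML value that begins with { would otherwise be parsed as a flow mapping.

```yaml
monitors:
  # Assumption: the YAML name field accepts a Go template, as described for monitor names.
  # Single-quoted so YAML does not read the leading "{{" as the start of a flow mapping.
  - name: '{{ .Attrs.service_name }}: {{ .DisplayName }}'
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      # Grouping by service_name so each alert belongs to exactly one service.
      - group by service_name
      - where _system in ("log:error", "log:fatal")
```

The query groups by service_name so that the attribute referenced in the template has a single value for every alert. Whether attributes outside the grouping are also available to the template is not stated here, so treat that as something to verify in your own project.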