Error monitors

An error monitor watches incoming log records and fires an alert when a new matching pattern appears. Unlike metric monitors, which evaluate aggregates over time, error monitors react to individual events, which makes them well suited to catching exceptions, fatal errors, and unexpected patterns the moment they occur.

Uptrace automatically creates a default error monitor for logs with ERROR and FATAL severity. You can customize its filters, add grouping dimensions, or create additional monitors for other patterns.

How it works

When a log record matches the monitor's query, Uptrace checks whether an alert already exists for that pattern. Patterns are determined by the same log grouping logic as the Logs explorer, so similar messages share one alert. If no alert exists, Uptrace creates one and sends a notification. If the same pattern fires again, Uptrace applies the notification frequency rules to decide whether to send a reminder.

When the pattern stops appearing, Uptrace automatically closes the alert and sends a recovery notification.

Examples

All errors and fatal logs:

```yaml
monitors:
  - name: Notify on all errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
```

Errors matching a specific message:

```yaml
monitors:
  - name: Notify on "timeout" errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where _display_name contains "timeout"
```

Exceptions only:

```yaml
monitors:
  - name: Exceptions
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where exception_type exists
```

All environments except dev:

```yaml
monitors:
  - name: Notify on all errors except in "dev" environment
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - group by deployment_environment
      - where _system in ("log:error", "log:fatal")
      - where deployment_environment != "dev"
```

Query attributes

Attribute	Description
`_group_id`	Groups similar log records together. Use `group by _group_id` to create one alert per unique error pattern.
`_display_name`	The log message or exception message. Use `contains` to match a substring.
`_system`	The signal type: `log:error`, `log:fatal`, `log:warn`, etc.
`exception_type`	Set on log records that carry an exception. Use `exists` to filter exception-only logs.
`deployment_environment`	The environment attribute, typically `prod`, `staging`, or `dev`.
`service_name`	The service that emitted the log.

Any indexed span attribute can be used in `where` and `group by` clauses.
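
As a sketch of how these attributes combine, the monitor below limits alerts to two services and adds `service_name` as a grouping dimension, which should yield a separate alert per service for each error pattern. The monitor name and the service names `checkout` and `payments` are hypothetical. The operators are the ones already used in the examples above.

```yaml
monitors:
  # Hypothetical monitor name and service names, for illustration only.
  - name: Errors in checkout and payments
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - group by service_name
      - where _system in ("log:error", "log:fatal")
      - where service_name in ("checkout", "payments")
```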

Alert names

For metric monitors, Uptrace generates alert names using the monitor name and timeseries name, for example, "Disk usage: myhost+mydisk".

For error monitors, Uptrace generates alert names using the error (log) message, for example, "ERROR *fmt.wrapError: writeError failed".

You can customize alert names by specifying a Go template string as the monitor name when creating a monitor. For example, `{{ .Attrs.deployment_environment_name }}: {{ .DisplayName }}` prefixes the alert name with the deployment environment attribute.

You can use the following variables in templates:

Variable	Type	Description
`{{ .DisplayName }}`	string	Same as `_display_name` when querying spans and logs.
`{{ .Attrs }}`	`map[string]any`	All available attributes, for example, `{{ .Attrs.service_name }}`.
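
If monitors are defined in YAML as in the examples above, the template would presumably go in the `name` field. This is an assumption, since this section only describes setting the template when creating a monitor. A minimal sketch using the template from this section follows. The value is quoted because a YAML plain scalar that starts with `{` would otherwise be parsed as a flow mapping.

```yaml
monitors:
  # Assumption: the `name` field accepts the same Go template string.
  # Single quotes keep YAML from reading the leading `{{` as a flow mapping.
  - name: '{{ .Attrs.deployment_environment_name }}: {{ .DisplayName }}'
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
```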