Error monitors
An error monitor watches incoming log records and fires an alert when a new matching pattern appears. Unlike metric monitors, which evaluate aggregates over time, error monitors react to individual events. This makes them well suited to catching exceptions, fatal errors, and unexpected patterns the moment they occur.
Uptrace automatically creates a default error monitor for logs with ERROR and FATAL severity. You can customize its filters, add grouping dimensions, or create additional monitors for other patterns.
How it works
When a log record matches the monitor's query, Uptrace checks whether an alert already exists for that pattern. Grouping uses the same log grouping logic as the Logs explorer — similar messages share one alert. If no alert exists, it creates one and sends a notification. If the same pattern fires again, Uptrace uses the notification frequency rules to decide whether to send a reminder.
When the pattern stops appearing, the alert is automatically closed and a recovery notification is sent.
Examples
All errors and fatal logs:
monitors:
  - name: Notify on all errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
Errors matching a specific message:
monitors:
  - name: Notify on "timeout" errors
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where _display_name contains "timeout"
Exceptions only:
monitors:
  - name: Exceptions
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - where _system in ("log:error", "log:fatal")
      - where exception_type exists
All environments except dev:
monitors:
  - name: Notify on all errors except in "dev" environment
    type: error
    notify_everyone_by_email: true
    query:
      - group by _group_id
      - group by deployment_environment
      - where _system in ("log:error", "log:fatal")
      - where deployment_environment != "dev"
Query attributes
| Attribute | Description |
|---|---|
| _group_id | Groups similar log records together. Use group by _group_id to create one alert per unique error pattern. |
| _display_name | The log message or exception message. Use contains to match a substring. |
| _system | The signal type: log:error, log:fatal, log:warn, etc. |
| exception_type | Set on log records that carry an exception. Use exists to filter exception-only logs. |
| deployment_environment | The environment attribute, typically prod, staging, or dev. |
| service_name | The service that emitted the log. |
Any indexed span attribute can be used in where and group by clauses.
Alert names
For metric monitors, Uptrace generates alert names using the monitor name and timeseries name, for example, "Disk usage: myhost+mydisk".
For error monitors, Uptrace generates alert names using the error (log) message, for example, "ERROR *fmt.wrapError: writeError failed".
You can customize alert names by using a Go template string as the monitor name when you create a monitor. For example, {{ .Attrs.deployment_environment_name }}: {{ .DisplayName }} prefixes the alert name with the value of the deployment environment attribute.
You can use the following variables in templates:
| Variable | Type | Description |
|---|---|---|
| {{ .DisplayName }} | string | Same as _display_name when querying spans and logs. |
| {{ .Attrs }} | map[string]any | All available attributes, for example, {{ .Attrs.service_name }}. |
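
To see how such a name template expands, it can help to render one locally. The sketch below uses Go's standard `text/template` package with a data struct that mirrors the two variables in the table. The `AlertData` type and `RenderAlertName` function are hypothetical names for this example, and the use of `text/template` with default options is an assumption; Uptrace's own rendering may differ.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// AlertData mirrors the template variables listed above (assumed shape).
type AlertData struct {
	DisplayName string
	Attrs       map[string]any
}

// RenderAlertName expands a monitor-name template for one alert.
// It is an illustrative helper, not part of any Uptrace API.
func RenderAlertName(nameTemplate, displayName string, attrs map[string]any) (string, error) {
	tmpl, err := template.New("alert-name").Parse(nameTemplate)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var sb strings.Builder
	data := AlertData{DisplayName: displayName, Attrs: attrs}
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return sb.String(), nil
}

func main() {
	name, err := RenderAlertName(
		"{{ .Attrs.deployment_environment_name }}: {{ .DisplayName }}",
		"ERROR *fmt.wrapError: writeError failed",
		map[string]any{
			"deployment_environment_name": "prod", // sample value
			"service_name":                "checkout",
		},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints: prod: ERROR *fmt.wrapError: writeError failed
}
```

With `text/template` defaults, an attribute that is missing from the map renders as `<no value>`, so it is worth choosing attributes that are present on every matching log record.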