Grouping rules

Grouping rules let you change how Uptrace groups logs and exceptions together. Each rule is a Grok-style pattern: literal words plus typed placeholders that extract variable parts (numbers, IPs, identifiers, etc.) into attributes and, when marked with #, include them in the grouping fingerprint.

For example, you can configure Uptrace to create a separate error group for each unknown PostgreSQL column:

text
# Error messages
ERROR: column "event.created_at" does not exist (SQLSTATE=42703)
ERROR: column "updated_at" does not exist (SQLSTATE=42703)
ERROR: column "name" does not exist (SQLSTATE=42703)

# Pattern
%{LOG_LEVEL:log_severity} column %{QUOTED:#column} does not exist %{ATTR:sqlstate}

Patterns

A pattern is a sequence of matchers separated by whitespace. Each matcher is either a literal word or a typed placeholder.

Literals

Plain words match tokens exactly:

text
error connecting to database

Typed placeholders

Typed placeholders use Grok syntax %{TYPE} to match variable tokens by type:

text
%{NUMBER}
%{IP}
%{LOG_LEVEL}

Capture name

Add a capture name after a colon to extract the matched value into an attribute:

text
%{NUMBER:status_code}
%{IP:remote_addr}

Prefix the capture name with # to also include the matched value in the grouping fingerprint hash. This is how you create a separate group per unique value:

text
%{IDENT:#function_name}

Without a capture name, use the fingerprint option instead:

text
%{IDENT,fingerprint}

Primary constraint

The parenthesized argument %{TYPE(arg)} constrains which tokens match. Its meaning depends on the type:

  • Value constraint (most types): match only tokens with this exact value.
    text
    %{LOG_LEVEL(ERROR)}
    %{LOG_LEVEL(ERROR):level}
    
  • Key constraint (ATTR): match only key=value tokens whose key equals the argument. The key auto-captures, so %{ATTR(status)} is equivalent to %{ATTR(status):status}. Use _ to suppress auto-capture.
    text
    %{ATTR(status)}
    %{ATTR(user_id):var}
    %{ATTR(status):_}
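
The key constraint and auto-capture rules for ATTR can be sketched in Go. This is only an illustration of the behavior described above, not Uptrace's implementation; matchAttr and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// matchAttr is a hypothetical helper that mimics %{ATTR(key):capture}.
// It reports whether token is a key=value pair with the wanted key and
// returns the attribute name the value would be stored under.
// An empty capture means auto-capture under the key; "_" suppresses capture.
func matchAttr(token, wantKey, capture string) (name, value string, ok bool) {
	key, val, found := strings.Cut(token, "=")
	if !found || key != wantKey {
		return "", "", false
	}
	switch capture {
	case "":
		return key, val, true // auto-capture, as in %{ATTR(status)}
	case "_":
		return "", val, true // matched, but nothing is captured
	default:
		return capture, val, true // explicit name, as in %{ATTR(user_id):var}
	}
}

func main() {
	fmt.Println(matchAttr("status=200", "status", ""))
	fmt.Println(matchAttr("user_id=42", "user_id", "var"))
	fmt.Println(matchAttr("status=200", "status", "_"))
	fmt.Println(matchAttr("code=200", "status", ""))
}
```

The last call fails to match because the token's key is code, not status.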
    

Options

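Options follow the capture name (or the type, when there is no capture name), separated by commas. Most options are key=value pairs; fingerprint is a bare flag with no value: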

text
%{NUMBER:duration,unit=seconds}

| Option | Description |
| --- | --- |
| unit | Attach a unit to numeric values for normalization |
| fingerprint | Include the matched value in the grouping fingerprint hash (no value) |
| extract | Backtick-quoted Go regex applied to a %{QUOTED} token; named groups become captures |

Supported units (used with unit=...):

  • Time: nanoseconds, microseconds (us), milliseconds (ms), seconds (s)
  • Storage: bytes (by), kilobytes (kb), megabytes (mb), gigabytes (gb), terabytes (tb)
  • Other: percents (%), count, celsius, volts, amperes, joules, grams

Regex extraction (extract option)

On %{QUOTED} matchers you can supply a Go regex in backticks. Named groups in the regex become additional captures at runtime:

text
ERROR %{QUOTED:msg,extract=`(?P<name>\w+) is (?P<age>\d+)`}

Given the log line ERROR "Alice is 25", this rule captures msg=Alice is 25, name=Alice, and age=25.

If the regex does not match the quoted token, the rule still fires and the outer :capture is still recorded, but no extra attributes are emitted. Optional named groups that do not participate in a match are not emitted as empty attributes. If a named group collides with the matcher's own :capture or another attribute key, a numeric suffix is appended (e.g., the second value becomes name1).

Optional matchers

Append ? to make a matcher optional — the pattern still matches if the token is absent:

text
error code %{NUMBER:code}? occurred

This matches both "error code 500 occurred" and "error code occurred".

Groups and alternatives

Parentheses define a group of alternatives separated by |:

text
(%{LOG_LEVEL:level}|%{WORD:level}) %{WORD:msg}

Groups themselves can be optional:

text
(%{LOG_LEVEL})?

Repeat matchers (%{ANY}+ / %{ANY}*)

Append + or * to %{ANY} to match multiple tokens of any type:

  • %{ANY}+ matches one or more tokens.
  • %{ANY}* matches zero or more tokens.

Repeat matchers work anywhere in a pattern:

text
# Matches "error connect failed with timeout"
error %{WORD:action} %{ANY:details}+

# Matches "foo x y z bar"
foo %{ANY:mid}+ bar

# Matches "myFunc failed with status 500"
%{IDENT:function} failed %{ANY}+

Repeat is only valid on %{ANY}. It cannot be combined with a value constraint or a unit option.
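
To make the optional (?) and repeat (+, *) semantics concrete, here is a minimal backtracking matcher in Go. It is a sketch under stated assumptions, not how Uptrace is implemented: messages are split on whitespace, a pattern must consume every token, and the matcher type and its helper functions are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// matcher is a hypothetical, simplified stand-in for one pattern element.
type matcher struct {
	accept   func(token string) bool // reports whether a single token fits
	optional bool                    // set by ? and *
	repeat   bool                    // set by + and *
}

func lit(word string) matcher {
	return matcher{accept: func(t string) bool { return t == word }}
}

func anyToken() matcher {
	return matcher{accept: func(string) bool { return true }}
}

func number() matcher {
	return matcher{accept: func(t string) bool {
		_, err := strconv.ParseFloat(t, 64)
		return err == nil
	}}
}

func optional(m matcher) matcher   { m.optional = true; return m }
func oneOrMore(m matcher) matcher  { m.repeat = true; return m }
func zeroOrMore(m matcher) matcher { m.optional, m.repeat = true, true; return m }

// match reports whether the pattern consumes all tokens, backtracking over
// optional and repeated matchers.
func match(pattern []matcher, tokens []string) bool {
	if len(pattern) == 0 {
		return len(tokens) == 0
	}
	m, rest := pattern[0], pattern[1:]
	if m.optional && match(rest, tokens) {
		return true // the matcher was skipped
	}
	if len(tokens) == 0 || !m.accept(tokens[0]) {
		return false
	}
	if m.repeat {
		// One token is consumed; any further repetitions are optional.
		again := append([]matcher{zeroOrMore(m)}, rest...)
		return match(again, tokens[1:])
	}
	return match(rest, tokens[1:])
}

func matchLine(pattern []matcher, line string) bool {
	return match(pattern, strings.Fields(line))
}

func main() {
	// error code %{NUMBER:code}? occurred
	p1 := []matcher{lit("error"), lit("code"), optional(number()), lit("occurred")}
	fmt.Println(matchLine(p1, "error code 500 occurred"), matchLine(p1, "error code occurred"))

	// foo %{ANY:mid}+ bar
	p2 := []matcher{lit("foo"), oneOrMore(anyToken()), lit("bar")}
	fmt.Println(matchLine(p2, "foo x y z bar"), matchLine(p2, "foo bar"))
}
```

In this sketch "foo bar" does not match the second pattern because + requires at least one token between foo and bar; swapping in zeroOrMore would accept it.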

Available types

Some types are virtual — they expand to several concrete types.

| Virtual | Expands to |
| --- | --- |
| ANY | Any single token, regardless of type |
| NUMBER | INT, FLOAT, BYTE_SIZE, TRACE_ID_HEX |
| IP | IPV4, IPV6 |
| IDENT | WORD, IDENT |
| TIMESTAMP | ISO8601_DATE, UNIX_DATE, HTTP_DATE, SYSLOG_DATE, DATETIME |

Text

| Type | Description | Example |
| --- | --- | --- |
| WORD | A single alphabetical word | error, database |
| IDENT | An identifier | user_id, MyClass, obj.attr |
| QUOTED | A quoted string | "hello world", 'foo' |
| UNKNOWN | An unclassified segment (lexer fallback when no other type matches) | ??!!, \x1b[0m |
| ANY | Any single token. With +: one or more tokens. With *: zero or more tokens. | 42, GET, foo |

Numeric

| Type | Description | Example |
| --- | --- | --- |
| NUMBER | Any numeric (INT, FLOAT, BYTE_SIZE, TRACE_ID_HEX) | 42, 3.14, 10KB |
| INT | Integer | 200, -17 |
| FLOAT | Floating-point number | 3.14, -0.5, 1e9 |
| BYTE_SIZE | Byte size with unit | 10KB, 2.5MiB, 512B |
| TRACE_ID_HEX | 32-character hex string | 5d41402abc4b2a76b9719d911017c592 (MD5) |

Network

| Type | Description | Example |
| --- | --- | --- |
| IP | IPV4 or IPV6 (virtual) | 127.0.0.1, ::1 |
| IPV4 | IPV4 address | 192.168.1.1 |
| IPV6 | IPV6 address | 2001:db8::1 |
| HOST_PORT | host:port combination | 10.0.0.1:443 |
| MAC | MAC address | 00:1A:2B:3C:4D:5E |
| EMAIL | Email address | admin@example.com |
| URI | Full URI | https://example.com/api/v1 |

Temporal

| Type | Description | Example |
| --- | --- | --- |
| TIMESTAMP | Any timestamp format (virtual) | 2024-01-15T14:30:00Z |
| ISO8601_DATE | ISO8601 / RFC3339 | 2024-01-15T14:30:00Z |
| UNIX_DATE | Unix date | Mon Jan 2 15:04:05 MST 2006 |
| HTTP_DATE | HTTP log date | 21/Nov/2024:14:20:00 +0000 |
| SYSLOG_DATE | Syslog timestamp | Jan 2 15:04:05 |
| DATETIME | Date and time | 2024-01-15 14:30:00 |
| DATE | Date only | 2024-01-15 |
| TIME | Time of day | 14:30:00 |
| MONTH_NAME | Month name | Jan, February |
| WEEKDAY | Day-of-week name | Mon, Tuesday |

Structured

| Type | Description | Example |
| --- | --- | --- |
| JSON | JSON object or array | {"foo": "bar"}, [1, 2, 3] |
| ATTR | key=value attribute | status=200, user_id=42 |

System

| Type | Description | Example |
| --- | --- | --- |
| LOG_LEVEL | Log severity level | INFO, WARN, ERROR |
| HTTP_METHOD | HTTP method | GET, POST, DELETE |
| HTTP_VERSION | HTTP protocol version | HTTP/1.1, HTTP/2.0 |
| HTTP_STATUS | HTTP status code with reason phrase | 200 OK, 404 Not Found |
| URI_PATH | File or URL path | /api/users, /var/log/syslog |
| UUID | UUID string | 88da75f6-a07e-40b3-8c62-f2b28c505ff2 |
| HASHTAG | Hashtag | #deploy, #uptrace |
| HTML_TAG | HTML tag | <html>, </div>, <br/> |

Type aliases

For compatibility with existing Grok corpora, several types have alternative names:

| Alias | Canonical |
| --- | --- |
| STRING | QUOTED |
| NUM | NUMBER |
| INTEGER | INT |
| TIMESTAMP_ISO8601 | ISO8601_DATE |
| SYSLOGTIMESTAMP | SYSLOG_DATE |
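
One way to read the alias and virtual-type tables together: an alias is first replaced by its canonical name, and a virtual type then expands to its concrete types. The Go sketch below encodes just the tables from this page; resolve is a hypothetical helper, and the two-step order is an assumption made for illustration.

```go
package main

import "fmt"

// aliases maps alternative type names to canonical ones (from the alias table).
var aliases = map[string]string{
	"STRING":            "QUOTED",
	"NUM":               "NUMBER",
	"INTEGER":           "INT",
	"TIMESTAMP_ISO8601": "ISO8601_DATE",
	"SYSLOGTIMESTAMP":   "SYSLOG_DATE",
}

// virtual maps virtual types to the concrete types they expand to.
// ANY is left out because it accepts every token type.
var virtual = map[string][]string{
	"NUMBER":    {"INT", "FLOAT", "BYTE_SIZE", "TRACE_ID_HEX"},
	"IP":        {"IPV4", "IPV6"},
	"IDENT":     {"WORD", "IDENT"},
	"TIMESTAMP": {"ISO8601_DATE", "UNIX_DATE", "HTTP_DATE", "SYSLOG_DATE", "DATETIME"},
}

// resolve returns the concrete types a placeholder type name stands for.
func resolve(name string) []string {
	if canonical, ok := aliases[name]; ok {
		name = canonical
	}
	if concrete, ok := virtual[name]; ok {
		return concrete
	}
	return []string{name}
}

func main() {
	fmt.Println(resolve("NUM"))    // alias of a virtual type
	fmt.Println(resolve("STRING")) // alias of a concrete type
	fmt.Println(resolve("UUID"))   // already concrete
}
```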

Fingerprints

Uptrace groups similar logs and exceptions by hashing certain parts of the message. By default it only hashes literal words; use # (or the fingerprint option) to include captured values in the hash.

For example:

text
unknown column: %{WORD:#column}

The pattern above creates a separate group for each column, which is useful for alerting:

text
# Group 1
unknown column: foo
unknown column: foo

# Group 2
unknown column: bar
unknown column: bar
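
The effect of # on grouping can be illustrated with a small Go sketch. The hash function (FNV-1a), the capture type, and the fingerprint helper are all assumptions made for this example; the page does not specify how Uptrace computes the hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// capture is a hypothetical representation of a captured value.
// hashed is true when the capture uses the # prefix or the fingerprint option.
type capture struct {
	value  string
	hashed bool
}

// fingerprint hashes the literal words of a pattern plus any captured values
// marked for inclusion. FNV-1a is an arbitrary choice for this sketch.
func fingerprint(literals []string, captures []capture) uint64 {
	h := fnv.New64a()
	for _, word := range literals {
		h.Write([]byte(word))
		h.Write([]byte{0}) // separator so adjacent parts cannot run together
	}
	for _, c := range captures {
		if c.hashed {
			h.Write([]byte(c.value))
			h.Write([]byte{0})
		}
	}
	return h.Sum64()
}

func main() {
	literals := []string{"unknown", "column:"}

	// %{WORD:column}: the value is captured but not hashed, so foo and bar share a group.
	plainFoo := fingerprint(literals, []capture{{value: "foo"}})
	plainBar := fingerprint(literals, []capture{{value: "bar"}})
	fmt.Println("same group without #:", plainFoo == plainBar)

	// %{WORD:#column}: the value is hashed, so foo and bar land in separate groups.
	hashFoo := fingerprint(literals, []capture{{value: "foo", hashed: true}})
	hashBar := fingerprint(literals, []capture{{value: "bar", hashed: true}})
	fmt.Println("same group with #:", hashFoo == hashBar)
}
```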

You can also set the grouping.fingerprint attribute when creating logs and exceptions, which overrides the automatically derived fingerprint:

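go
import (
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/trace"
)

// ctx is the context.Context that carries the active span.
span := trace.SpanFromContext(ctx)

// Record the exception and override the fingerprint derived from the message.
span.AddEvent("exception", trace.WithAttributes(
    attribute.String("exception.type", "*exec.ExitError"),
    attribute.String("exception.message", "exit status 1"),
    attribute.String("grouping.fingerprint", "exec.ExitError"),
))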

Examples

Go-style error messages:

text
# Messages
strconv.ParseInt failed
SendEmail failed
mypkg.MyFunc failed

# Pattern
%{IDENT:#code_function} failed

PostgreSQL unknown column errors:

text
# Error messages
ERROR: column "event.created_at" does not exist (SQLSTATE=42703)
ERROR: column "updated_at" does not exist (SQLSTATE=42703)
ERROR: column "name" does not exist (SQLSTATE=42703)

# Pattern
%{LOG_LEVEL:log_severity} column %{QUOTED:#column} does not exist %{ATTR:sqlstate}

A single grouping rule may declare multiple patterns — any of them matching is enough:

text
can't find item %{NUMBER:item_id}
can not find item %{NUMBER:item_id}
%{NUMBER:item_id} not found

Conclusion

Grouping rules work best with structured logs and are not a replacement for the log parsers provided by OpenTelemetry Logs and Vector.