Querying Spans and Logs

Uptrace provides a powerful querying language that enables you to filter, group, and aggregate your observability data effectively. The query language supports filters (where _status_code = "error"), grouping (group by _group_id), and aggregates (p50(_dur_ms)).

Prerequisites for Effective Querying

To write useful and performant queries, you need to pre-process raw data to ensure it has a well-defined structure. You can achieve this by:

Recording contextual information in span attributes and events
Using structured logging for logs

For searching over spans and logs (as opposed to complex querying), Uptrace supports a more concise syntax described in the searching documentation.

Query Language Components

Identifiers

Identifiers are unquoted strings that reference span fields, attributes, and extract values from JSON. Examples include _name, display_name, and _dur_ms.

Important: Span fields start with an underscore to distinguish them from attributes.

Built-in Span Fields

The following table lists all built-in span fields available for querying:

Span Field	Description
`_id`	Unique span identifier
`_parent_id`	Parent span identifier
`_trace_id`	Trace identifier
`_name`	Span operation name
`_event_name`	Event name (for span events)
`_kind`	Span kind (client, server, etc.)
`_dur_ms`	Span duration in milliseconds
`_time`	Span start time
`_status_code`	Span status code (ok, error, etc.)
`_status_message`	Span status message

Attribute Names

Attribute names are normalized by replacing dots with underscores. Common examples include:

Attribute Name	Description
`display_name`	display.name attribute
`service_name`	OpenTelemetry `service.name` attribute

Data Types

Uptrace supports the following attribute types, each with specific comparison operators:

Attribute Type	Supported Comparison Operators
`str`	`=`, `in`, `like`, `contains`, `~` (regexp), `exists`
`int` and `float`	`=`, `<`, `<=`, `>`, `>=`, `exists`
`bool`	`=`, `!=`
`[]str`	`contains`, `exists`

Type Specification

While type discovery is automatic, you can explicitly specify types when needed:

sql

foo::string | bar::int | baz::float

For attributes with mixed types across spans (not recommended), you can query multiple types:

sql

foo::string | foo::int

Note: Uptrace uses type information as a hint for reading columnar data, not for type conversion.

Units

Specify attribute units by adding the unit name as a suffix to improve data interpretation and readability:

Attribute	Unit
`http_read_bytes`	`bytes`
`request_dur_ms_seconds`	`seconds`
`elapsed_milliseconds`	`milliseconds`
`memory_utilization`	`utilization`

If you cannot modify the attribute name, use aliases in queries:

sql

sum(heap_size) as heap_size_bytes

String Literals

Strings can be defined using single quotes, double quotes, or backticks:

sql

"I'm a string\n"           -- Double quotes with escape sequences
'I\'m a string\n'          -- Single quotes with escape sequences
`^some-prefix-(\w+)$`      -- Backticks (no escape sequences, useful for regex)

Query Clauses

Filtering with WHERE

Use where clauses to filter spans and events by their attributes. The following examples demonstrate common filtering patterns:

Filter Example	Description
`where _status_code = "error"`	Filter spans with error status (case-sensitive)
`where display_name like "hello%"`	Filter span names starting with "hello" (case-insensitive)
`where display_name like "%hello"`	Filter span names ending with "hello" (case-insensitive)
`where display_name contains "hello"`	Filter span names containing "hello" (case-insensitive)
`where display_name contains "foo\|bar"`	Same as `display_name contains "foo" OR display_name contains "bar"`
`where _dur_ms > 1ms`	Duration greater than 1 millisecond (equivalent to `_dur_ms > 1000`)
`where http_request_content_length > 1kb`	Content length greater than 1 kilobyte (equivalent to `http_request_content_length > 1024`)
`where foo exists`	Filter spans that have the `foo` attribute

Supported Units in Filters

Time units: μs (microseconds), ms (milliseconds), s (seconds)
Size units: kb (kilobytes), mb (megabytes), gb (gigabytes), tb (terabytes)

Filtering Results with HAVING

Use having to filter aggregated query results after grouping:

sql

group by service_name | having p50(_dur_ms) > 100ms

Searching

You can use search to filter query results using a simpler syntax:

sql

search foo|bar -hello

This approach is more concise than complex where clauses for basic text matching.

Grouping with GROUP BY

Group spans using group by clauses, similar to SQL. The following examples show common grouping patterns:

Example	Description
`group by _group_id`	Groups similar spans together
`group by host_name`	Groups spans by the `host_name` attribute
`group by service_name, service_version`	Groups spans by combination of `service_name` and `service_version`
`group by lower(attribute)`	Groups by lowercase value of `attribute`
extract(host_name, `^server-(\w+)$`) as host	Extracts and groups by a fragment from `host_name`

You can use any transformation function in the group by clause to modify values before grouping.

Functions

Aggregate Functions

Aggregate functions perform calculations on sets of values and return single values. They are commonly used with grouping to summarize data.

Function	Example	Description
`count`	`count()`	Number of matched spans/logs/events
`any`	`any(_name)`	Any (random) span name
`anyLast`	`anyLast(_name)`	Any last span name
`avg`	`avg(_dur_ms)`	Average span duration
`min`, `max`	`max(_dur_ms)`	Minimum/maximum span duration
`sum`	`sum(http_request_content_length)`	Total sum of values
`p50`, `p75`, `p90`, `p99`	`p50(_dur_ms)`	Span duration percentiles
`top3`, `top10`	`top3(code_function)`	Top N most popular values
`uniq`	`uniq(http_client_ip)`	Number of unique values
`apdex`	`apdex(500ms, 3s)`	Apdex score

Conditional Aggregates

Uptrace supports ClickHouse if combinator on aggregate functions for conditional calculations:

countIf(_status_code = "error") - Count spans with error status
p50If(_dur_ms, service_name = "service1") - P50 duration for specific service

Virtual Columns

Uptrace provides shortcuts for common aggregations to simplify queries:

Virtual Column	Equivalent Expression
`_error_rate`	`countIf(_status_code = "error") / count()`

Transform Functions

Transform functions accept a value and return a new value for each matched span/log/event. These functions are useful for data manipulation and formatting.

Function	Example	Description
`lower`	`lower(log_severity)`	Converts string to lowercase
`upper`	`upper(log_severity)`	Converts string to uppercase
`perMin`	`perMin(count())`	Divides value by the number of minutes in the time interval
`perSec`	`perSec(count())`	Divides value by the number of seconds in the time interval
`trimPrefix(str, prefix)`	`trimPrefix(str, "prefix")`	Removes the specified leading prefix string
`trimSuffix(str, suffix)`	`trimSuffix(str, "suffix")`	Removes the specified trailing suffix string
`extract(haystack, pattern)`	`extract(host_name,` ^uptrace-prod-(\w+)$`)`	Extracts a fragment using regular expression
`replace(haystack, substring, replacement)`	`replace(host_name, 'uptrace-prod-', '')`	Replaces all occurrences of substring
`replaceRegexp(haystack, pattern, replacement)`	replaceRegexp(host, `^`, 'prefix ')	Replaces all occurrences matching regular expression pattern
`arrayJoin`	`arrayJoin(db_sql_tables)`	See ClickHouse arrayJoin
`parseInt64`	`parseInt64(str_with_int)`	Parses string as 64-bit integer
`parseFloat64`	`parseFloat64(str_with_float)`	Parses string as 64-bit float
`parseDateTime`	`parseDateTime(str_with_time)`	Parses string as date with time

Time Rounding Functions

Time rounding functions are particularly useful for time-series analysis and creating time-based aggregations:

Function	Description
`toStartOfDay`	Rounds down to start of day
`toStartOfHour`	Rounds down to start of hour
`toStartOfMinute`	Rounds down to start of minute
`toStartOfSecond`	Rounds down to start of second
`toStartOfFiveMinutes`	Rounds down to start of 5-minute interval
`toStartOfTenMinutes`	Rounds down to start of 10-minute interval
`toStartOfFifteenMinutes`	Rounds down to start of 15-minute interval

Complete Query Examples

You can combine filters, grouping, and aggregates to create powerful queries for comprehensive observability analysis.

Daily Unique Visitors (Excluding Bots)

sql

where user_agent_is_bot not exists | uniq(client_address) | group by toStartOfDay(_time)

This query identifies unique visitors per day while excluding bot traffic.

Error Rate by Service

sql

group by service_name | having count() > 100 | select service_name, _error_rate

This query calculates error rates for services with significant traffic (more than 100 requests).

Top 10 Slowest Operations

sql

where _dur_ms > 100ms | group by _name | top10(_name)

This query identifies the slowest operations based on 99th percentile duration.

Query Performance Optimization

If your queries are taking too long to complete, apply these optimization strategies to improve performance:

Narrow the Time Range

Select shorter time periods when possible (e.g., "Last 1 hour" instead of "Last 24 hours")
Use specific time ranges rather than open-ended queries
Consider the data volume when selecting time ranges

Use System Filters

Select specific systems when analyzing particular types of requests
Example: Select httpserver:all system for HTTP request analysis
This reduces the data scope significantly

Add Group Filters

Further narrow scope with _group_id filters
Example: where _group_id = 123456789
Group filters are highly efficient for targeted analysis

Leverage Indexed Attributes

Use OpenTelemetry semantic conventions for attribute names
Uptrace optimizes certain attributes better than others
Indexed attributes provide faster query performance

Query Structure Best Practices

Apply filters before grouping when possible to reduce data volume
Use specific filters rather than broad pattern matching
Limit the number of groups in group by clauses
Consider using having clauses to filter aggregated results

Common Query Patterns

Error Analysis

sql

-- Find error patterns by service
where _status_code = "error" | group by service_name, _name |
select service_name, _name, count(), p50(_dur_ms)

Performance Monitoring

sql

-- Monitor slow requests across services
where _dur_ms > 1s | group by service_name |
select service_name, count(), p95(_dur_ms), p99(_dur_ms)

Traffic Analysis

sql

-- Analyze request volume over time
group by toStartOfHour(_time), service_name |
select _time, service_name, count() as request_count