Querying Spans and Logs

Uptrace provides a powerful querying language that enables you to filter, group, and aggregate your observability data effectively. The query language supports filters (where _status_code = "error"), grouping (group by _group_id), and aggregates (p50(_dur_ms)).

Prerequisites for Effective Querying

To write useful and performant queries, you need to pre-process raw data to ensure it has a well-defined structure. You can achieve this by:

For searching over spans and logs (as opposed to complex querying), Uptrace supports a more concise syntax described in the searching documentation.

Query Language Components

Identifiers

Identifiers are unquoted strings that reference span fields, attributes, and extract values from JSON. Examples include _name, display_name, and _dur_ms.

Important: Span fields start with an underscore to distinguish them from attributes.

Built-in Span Fields

The following table lists all built-in span fields available for querying:

Span FieldDescription
_idUnique span identifier
_parent_idParent span identifier
_trace_idTrace identifier
_nameSpan operation name
_event_nameEvent name (for span events)
_kindSpan kind (client, server, etc.)
_dur_msSpan duration in milliseconds
_timeSpan start time
_status_codeSpan status code (ok, error, etc.)
_status_messageSpan status message

Attribute Names

Attribute names are normalized by replacing dots with underscores. Common examples include:

Attribute NameDescription
display_namedisplay.name attribute
service_nameOpenTelemetry service.name attribute

Data Types

Uptrace supports the following attribute types, each with specific comparison operators:

Attribute TypeSupported Comparison Operators
str=, in, like, contains, ~ (regexp), exists
int and float=, <, <=, >, >=, exists
bool=, !=
[]strcontains, exists

Type Specification

While type discovery is automatic, you can explicitly specify types when needed:

sql
foo::string | bar::int | baz::float

For attributes with mixed types across spans (not recommended), you can query multiple types:

sql
foo::string | foo::int

Note: Uptrace uses type information as a hint for reading columnar data, not for type conversion.

Units

Specify attribute units by adding the unit name as a suffix to improve data interpretation and readability:

AttributeUnit
http_read_bytesbytes
request_dur_ms_secondsseconds
elapsed_millisecondsmilliseconds
memory_utilizationutilization

If you cannot modify the attribute name, use aliases in queries:

sql
sum(heap_size) as heap_size_bytes

String Literals

Strings can be defined using single quotes, double quotes, or backticks:

sql
"I'm a string\n"           -- Double quotes with escape sequences
'I\'m a string\n'          -- Single quotes with escape sequences
`^some-prefix-(\w+)$`      -- Backticks (no escape sequences, useful for regex)

Query Clauses

Filtering with WHERE

Use where clauses to filter spans and events by their attributes. The following examples demonstrate common filtering patterns:

Filter ExampleDescription
where _status_code = "error"Filter spans with error status (case-sensitive)
where display_name like "hello%"Filter span names starting with "hello" (case-insensitive)
where display_name like "%hello"Filter span names ending with "hello" (case-insensitive)
where display_name contains "hello"Filter span names containing "hello" (case-insensitive)
where display_name contains "foo|bar"Same as display_name contains "foo" OR display_name contains "bar"
where _dur_ms > 1msDuration greater than 1 millisecond (equivalent to _dur_ms > 1000)
where http_request_content_length > 1kbContent length greater than 1 kilobyte (equivalent to http_request_content_length > 1024)
where foo existsFilter spans that have the foo attribute

Supported Units in Filters

  • Time units: μs (microseconds), ms (milliseconds), s (seconds)
  • Size units: kb (kilobytes), mb (megabytes), gb (gigabytes), tb (terabytes)

Filtering Results with HAVING

Use having to filter aggregated query results after grouping:

sql
group by service_name | having p50(_dur_ms) > 100ms

Searching

You can use search to filter query results using a simpler syntax:

sql
search foo|bar -hello

This approach is more concise than complex where clauses for basic text matching.

Grouping with GROUP BY

Group spans using group by clauses, similar to SQL. The following examples show common grouping patterns:

ExampleDescription
group by _group_idGroups similar spans together
group by host_nameGroups spans by the host_name attribute
group by service_name, service_versionGroups spans by combination of service_name and service_version
group by lower(attribute)Groups by lowercase value of attribute
extract(host_name, `^server-(\w+)$`) as hostExtracts and groups by a fragment from host_name

You can use any transformation function in the group by clause to modify values before grouping.

Functions

Aggregate Functions

Aggregate functions perform calculations on sets of values and return single values. They are commonly used with grouping to summarize data.

FunctionExampleDescription
countcount()Number of matched spans/logs/events
anyany(_name)Any (random) span name
anyLastanyLast(_name)Any last span name
avgavg(_dur_ms)Average span duration
min, maxmax(_dur_ms)Minimum/maximum span duration
sumsum(http_request_content_length)Total sum of values
p50, p75, p90, p99p50(_dur_ms)Span duration percentiles
top3, top10top3(code_function)Top N most popular values
uniquniq(http_client_ip)Number of unique values
apdexapdex(500ms, 3s)Apdex score

Conditional Aggregates

Uptrace supports ClickHouse if combinator on aggregate functions for conditional calculations:

  • countIf(_status_code = "error") - Count spans with error status
  • p50If(_dur_ms, service_name = "service1") - P50 duration for specific service

Virtual Columns

Uptrace provides shortcuts for common aggregations to simplify queries:

Virtual ColumnEquivalent Expression
_error_ratecountIf(_status_code = "error") / count()

Transform Functions

Transform functions accept a value and return a new value for each matched span/log/event. These functions are useful for data manipulation and formatting.

FunctionExampleDescription
lowerlower(log_severity)Converts string to lowercase
upperupper(log_severity)Converts string to uppercase
perMinperMin(count())Divides value by the number of minutes in the time interval
perSecperSec(count())Divides value by the number of seconds in the time interval
trimPrefix(str, prefix)trimPrefix(str, "prefix")Removes the specified leading prefix string
trimSuffix(str, suffix)trimSuffix(str, "suffix")Removes the specified trailing suffix string
extract(haystack, pattern)extract(host_name, ^uptrace-prod-(\w+)$)Extracts a fragment using regular expression
replace(haystack, substring, replacement)replace(host_name, 'uptrace-prod-', '')Replaces all occurrences of substring
replaceRegexp(haystack, pattern, replacement)replaceRegexp(host, `^`, 'prefix ')Replaces all occurrences matching regular expression pattern
arrayJoinarrayJoin(db_sql_tables)See ClickHouse arrayJoin
parseInt64parseInt64(str_with_int)Parses string as 64-bit integer
parseFloat64parseFloat64(str_with_float)Parses string as 64-bit float
parseDateTimeparseDateTime(str_with_time)Parses string as date with time

Time Rounding Functions

Time rounding functions are particularly useful for time-series analysis and creating time-based aggregations:

FunctionDescription
toStartOfDayRounds down to start of day
toStartOfHourRounds down to start of hour
toStartOfMinuteRounds down to start of minute
toStartOfSecondRounds down to start of second
toStartOfFiveMinutesRounds down to start of 5-minute interval
toStartOfTenMinutesRounds down to start of 10-minute interval
toStartOfFifteenMinutesRounds down to start of 15-minute interval

Complete Query Examples

You can combine filters, grouping, and aggregates to create powerful queries for comprehensive observability analysis.

Daily Unique Visitors (Excluding Bots)

sql
where user_agent_is_bot not exists | uniq(client_address) | group by toStartOfDay(_time)

This query identifies unique visitors per day while excluding bot traffic.

Error Rate by Service

sql
group by service_name | having count() > 100 | select service_name, _error_rate

This query calculates error rates for services with significant traffic (more than 100 requests).

Top 10 Slowest Operations

sql
where _dur_ms > 100ms | group by _name | top10(_name)

This query identifies the slowest operations based on 99th percentile duration.

Query Performance Optimization

If your queries are taking too long to complete, apply these optimization strategies to improve performance:

Narrow the Time Range

  • Select shorter time periods when possible (e.g., "Last 1 hour" instead of "Last 24 hours")
  • Use specific time ranges rather than open-ended queries
  • Consider the data volume when selecting time ranges

Use System Filters

  • Select specific systems when analyzing particular types of requests
  • Example: Select httpserver:all system for HTTP request analysis
  • This reduces the data scope significantly

Add Group Filters

  • Further narrow scope with _group_id filters
  • Example: where _group_id = 123456789
  • Group filters are highly efficient for targeted analysis

Leverage Indexed Attributes

Query Structure Best Practices

  • Apply filters before grouping when possible to reduce data volume
  • Use specific filters rather than broad pattern matching
  • Limit the number of groups in group by clauses
  • Consider using having clauses to filter aggregated results

Common Query Patterns

Error Analysis

sql
-- Find error patterns by service
where _status_code = "error" | group by service_name, _name |
select service_name, _name, count(), p50(_dur_ms)

Performance Monitoring

sql
-- Monitor slow requests across services
where _dur_ms > 1s | group by service_name |
select service_name, count(), p95(_dur_ms), p99(_dur_ms)

Traffic Analysis

sql
-- Analyze request volume over time
group by toStartOfHour(_time), service_name |
select _time, service_name, count() as request_count

See Also