Querying Traces

While the span query language operates on individual spans and logs, trace queries allow you to select entire traces based on the spans, logs, and events they contain. This is useful for finding traces that match complex criteria across multiple services and operations.

For example, you can find all traces that contain a slow database query, have an error in a specific service, or involve a particular combination of microservices.

Query Structure

A trace query consists of multiple rows, where each row targets a different part of the trace. The first row is always the root row, and subsequent rows define conditions on child spans, logs, or events.

```text
root: <query for the root span>
<system> as <alias>: <query for matching spans>
```

Each row accepts the full span query language including filters, aggregations, and groupings. A typical root row query looks like:

```text
root: perMin(count()) | quantiles(_dur_ms) | group by _group_id
```

Component	Description
`root`	Selects the root span of the trace. Use `<empty>` to match all root spans.
`<system>`	A span system like `db:postgresql`, `rpc:all`, or `log:error`.
`as <alias>`	An optional alias for readability.
`<query>`	Full query with where, group by, and aggregate clauses.
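
To make the row structure concrete, here is a minimal Python sketch that splits a row into its system, alias, and query parts. The `parse_row` function and the `Row` type are hypothetical names used only for illustration; they are not part of any client library. The sketch assumes the row head ends at the first colon that is followed by a space.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    system: str            # "root" or a span system such as "db:all"
    alias: Optional[str]   # optional alias given after "as"
    query: str             # the span query for this row


def parse_row(line: str) -> Row:
    """Split '<system> as <alias>: <query>' into its parts (illustrative only)."""
    # Systems contain a colon with no trailing space ("db:all"), so the first
    # ": " is assumed to mark the end of the row head.
    head, sep, query = line.partition(": ")
    if not sep:
        raise ValueError(f"not a trace query row: {line!r}")
    system, _, alias = head.partition(" as ")
    return Row(system.strip(), alias.strip() or None, query.strip())


for line in (
    'root: where _dur_ms >= 3000',
    'db:all as db: where _status_code = "error"',
):
    print(parse_row(line))
```

The root row has no alias, so `alias` is `None` for it, while the second row yields the system `db:all` with the alias `db`.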

Filtering by System

Each non-root row targets spans belonging to a specific system. You can use exact systems or wildcard systems ending in `:all`:

System	Matches
`db:postgresql`	PostgreSQL database spans only
`db:mysql`	MySQL database spans only
`db:all`	All database spans (PostgreSQL, MySQL, etc.)
`rpc:grpc`	gRPC spans only
`rpc:all`	All RPC spans
`http:service1`	HTTP spans for `service1`
`httpserver:all`	All HTTP server spans
`messaging:kafka`	Kafka messaging spans only
`messaging:all`	All messaging spans
`log:error`	Error-level logs
`log:warn`	Warning-level logs
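
The matching behavior in the table can be modeled with a short Python sketch. The `system_matches` helper is hypothetical, and the rule it implements (an `:all` suffix matches every system that shares the same prefix) is inferred from the table rather than taken from the implementation.

```python
def system_matches(pattern: str, system: str) -> bool:
    """Return True if a span's system matches a row's system pattern."""
    kind, _, name = pattern.partition(":")
    if name == "all":
        # Assumption: "db:all" matches every system of the form "db:<name>".
        return system.startswith(kind + ":")
    # Exact systems such as "db:postgresql" match only themselves.
    return system == pattern


span_systems = ["db:postgresql", "db:mysql", "rpc:grpc", "log:error"]
print([s for s in span_systems if system_matches("db:all", s)])
# ['db:postgresql', 'db:mysql']
print([s for s in span_systems if system_matches("rpc:grpc", s)])
# ['rpc:grpc']
```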

Examples

Traces with the Number of Database Queries

Return traces along with how many database queries each trace contains:

```text
root: perMin(count()) | quantiles(_dur_ms) | _error_rate | group by _group_id
db:all as db: count()
```

Traces with Multiple Database Queries

Find traces that contain at least 2 PostgreSQL queries:

```text
root: perMin(count()) | quantiles(_dur_ms) | _error_rate | group by _group_id
db:postgresql as pg: having count() >= 2
```

This is useful for identifying N+1 query problems or traces with excessive database calls.
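
As a rough mental model, `having count() >= 2` keeps only the traces whose matching spans number at least two. The Python sketch below reproduces that selection over hypothetical in-memory data; the `(trace_id, system)` tuples and the `traces_with_min_spans` helper are assumptions made for illustration, not how the backend stores or evaluates spans.

```python
from collections import Counter

# Hypothetical spans as (trace_id, system) pairs.
spans = [
    ("t1", "httpserver:api"), ("t1", "db:postgresql"), ("t1", "db:postgresql"),
    ("t2", "httpserver:api"), ("t2", "db:postgresql"),
    ("t3", "httpserver:api"), ("t3", "db:mysql"), ("t3", "db:mysql"),
]


def traces_with_min_spans(spans, system, minimum):
    """Return the trace IDs that have at least `minimum` spans of `system`."""
    counts = Counter(trace_id for trace_id, span_system in spans
                     if span_system == system)
    return sorted(trace_id for trace_id, n in counts.items() if n >= minimum)


print(traces_with_min_spans(spans, "db:postgresql", 2))  # ['t1']
```

Trace `t2` has only one PostgreSQL span and `t3` has only MySQL spans, so neither is selected.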

Traces with Slow Service Calls

Find traces where a specific service takes 1 second or longer to respond:

```text
root: perMin(count()) | quantiles(_dur_ms) | _error_rate | group by _group_id
rpc:all as rpc: where service_name = "foo" | where _kind = "server" | where _dur_ms >= 1000
```

Traces with Error Logs

Find traces that contain error logs mentioning "timeout":

```text
root: perMin(count()) | quantiles(_dur_ms) | _error_rate | group by _group_id
log:error as err: where _display_name contains "timeout"
```

Traces with Errors in a Specific Service

Find traces where a particular service returned an error:

```text
root: perMin(count()) | quantiles(_dur_ms) | _error_rate | group by _group_id
rpc:all as rpc: where service_name = "payment-service" | where _status_code = "error"
```

Traces with Slow Root Spans

Find traces where the root span takes 5 seconds or longer:

```text
root: where _dur_ms >= 5000
```

Combining Multiple Conditions

Find traces that are slow and contain database errors:

```text
root: where _dur_ms >= 3000
db:all as db: where _status_code = "error"
```

This selects traces where the root span lasts at least 3 seconds and at least one database span has an error status. A trace must satisfy both rows to be returned.
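
The way the two rows combine can be sketched in Python as a conjunction: the root condition must hold and at least one database span must match. The trace layout and the `is_slow_with_db_error` helper are hypothetical and exist only to illustrate the selection logic.

```python
def is_slow_with_db_error(trace: dict) -> bool:
    """Both rows must match: a slow root span AND at least one failed db span."""
    root_is_slow = trace["root"]["_dur_ms"] >= 3000
    has_db_error = any(
        span["system"].startswith("db:") and span["_status_code"] == "error"
        for span in trace["spans"]
    )
    return root_is_slow and has_db_error


# Hypothetical traces; the field layout is an assumption made for this sketch.
traces = {
    "slow-with-db-error": {
        "root": {"_dur_ms": 4200},
        "spans": [{"system": "db:postgresql", "_status_code": "error"}],
    },
    "slow-but-healthy": {
        "root": {"_dur_ms": 4200},
        "spans": [{"system": "db:postgresql", "_status_code": "ok"}],
    },
    "fast-with-db-error": {
        "root": {"_dur_ms": 120},
        "spans": [{"system": "db:mysql", "_status_code": "error"}],
    },
}

selected = [name for name, trace in traces.items() if is_slow_with_db_error(trace)]
print(selected)  # ['slow-with-db-error']
```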

Available Clauses

Each row, whether root or child, supports the full span query language, including filters, aggregations, and groupings:

Clause	Example	Description
`where`	`where service_name = "foo"`	Filter spans by attribute values
`where`	`where _dur_ms >= 1000`	Filter by span duration
`where`	`where _status_code = "error"`	Filter by status code
`where`	`where _kind = "server"`	Filter by span kind
`where`	`where _display_name contains "text"`	Filter by display name
`group by`	`group by _group_id`	Group spans by attribute
`group by`	`group by service_name`	Group by service
Aggregates	`count()`, `perMin(count())`	Count and rate aggregations
Aggregates	`p50(_dur_ms)`, `quantiles(_dur_ms)`	Duration percentiles
`having`	`having count() >= 2`	Filter by aggregated span count
`having`	`having p99(_dur_ms) >= 500`	Filter by percentile duration

Multiple clauses can be chained with the pipe (`|`) operator:

```text
rpc:all as rpc: where service_name = "foo" | where _kind = "server" | where _dur_ms >= 1000
```
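
One way to read a chain of `where` clauses is as a series of filters that a span must pass one after another. The Python sketch below models that reading, assuming chained `where` clauses combine with AND semantics, as the slow service call example suggests. The `clauses` list and the `span_matches` helper are hypothetical.

```python
import operator

# Each tuple mirrors one `where` clause from the row above.
clauses = [
    ("service_name", operator.eq, "foo"),
    ("_kind", operator.eq, "server"),
    ("_dur_ms", operator.ge, 1000),
]


def span_matches(span: dict) -> bool:
    """A span passes only if every chained clause holds (assumed AND semantics)."""
    return all(attr in span and op(span[attr], value)
               for attr, op, value in clauses)


spans = [
    {"service_name": "foo", "_kind": "server", "_dur_ms": 1500},
    {"service_name": "foo", "_kind": "client", "_dur_ms": 1500},
    {"service_name": "bar", "_kind": "server", "_dur_ms": 2000},
    {"service_name": "foo", "_kind": "server", "_dur_ms": 400},
]
print([span_matches(span) for span in spans])  # [True, False, False, False]
```

Only the first span satisfies all three clauses; each of the others fails exactly one of them.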