Free and Open Source Distributed Tracing Tools
Tracing tools help you manage, monitor, and assess the performance of your cloud infrastructure, services, and applications, so that your customers get the best digital experience.
The best tracing tools can help you eliminate performance bottlenecks and recover from incidents faster. Use this guide to pick the right one for you.
What is a distributed tracing tool?
Distributed tracing tools let you see how a request progresses through different services and systems, how long each operation takes, and which logs and errors occur along the way.
In a distributed environment, tracing tools also help you understand the relationships and interactions between microservices. They give insight into how a particular microservice is performing and how that service affects other microservices.
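To make the idea of a request moving through services more concrete, here is a minimal sketch of the data a tracing tool typically works with: timed spans linked by parent IDs into one trace. The `Span` class, its field names, and the sample services are assumptions made for illustration; they are not the data model of any tool listed in this guide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One timed operation inside a trace (all field names are illustrative)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    service: str
    name: str
    start_ms: int
    end_ms: int
    error: Optional[str] = None

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def render_trace(spans: list[Span]) -> list[str]:
    """Return one indented line per span, with children listed under their parent."""
    children: dict[Optional[str], list[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: list[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        # Visit siblings in the order they started.
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            status = f" ERROR: {span.error}" if span.error else ""
            lines.append(
                f"{'  ' * depth}{span.service} {span.name} {span.duration_ms}ms{status}"
            )
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Hypothetical request: gateway -> orders -> (postgres, payments)
trace = [
    Span("t1", "a", None, "gateway", "GET /checkout", 0, 120),
    Span("t1", "b", "a", "orders", "create_order", 10, 110),
    Span("t1", "c", "b", "postgres", "INSERT orders", 15, 35),
    Span("t1", "d", "b", "payments", "charge", 40, 105, error="card declined"),
]

for line in render_trace(trace):
    print(line)
```

The indented output shows which service called which, how long each step took, and where the error occurred. A tracing UI presents the same information graphically.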
Why do you need a distributed tracing tool?
Get a centralized view. Tracing provides a single view of your distributed microservices. Your team can more easily understand how an application is built and how services interact with each other.
Fix bottlenecks faster. With performance data for all your services in one place, you can quickly pinpoint failures and identify performance bottlenecks.
Recover from incidents faster. A great tracing tool can notify you when the site is down or a performance anomaly is detected.
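As a rough illustration of how span timings can point at a bottleneck, the sketch below groups durations by service and flags services whose 99th-percentile latency exceeds a budget. The function names, the nearest-rank percentile method, the sample data, and the 200 ms budget are all assumptions chosen for the example; real tools use their own aggregation and alerting rules.

```python
import math
from collections import defaultdict


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def slow_services(
    durations_ms: list[tuple[str, float]], p99_budget_ms: float
) -> dict[str, float]:
    """Return each service whose p99 latency is above the (assumed) budget."""
    by_service: dict[str, list[float]] = defaultdict(list)
    for service, duration in durations_ms:
        by_service[service].append(duration)

    flagged: dict[str, float] = {}
    for service, values in by_service.items():
        p99 = percentile(values, 99)
        if p99 > p99_budget_ms:
            flagged[service] = p99
    return flagged


# Hypothetical (service, span duration in ms) pairs collected from traces.
samples = [
    ("orders", 20), ("orders", 25), ("orders", 30),
    ("payments", 40), ("payments", 45), ("payments", 900),
]

print(slow_services(samples, p99_budget_ms=200))
```

Percentiles are used here instead of averages because a single slow request, like the 900 ms one above, can hide behind a healthy-looking mean.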
Open source tracing tools
Uptrace
Uptrace is an OpenTelemetry tracing tool that monitors performance, errors, and logs. Main features include an intuitive query builder, rich dashboards, percentiles, and user and project management.
Tech stack:
- Backend: Go
- Frontend: Vue.js
- Instrumentation: OpenTelemetry / OTLP
- Storage: ClickHouse with S3
Pros:
- Supports both tracing and metrics
- Rich UI with charts
- Advanced filtering capabilities
- Simple setup with ClickHouse being the only dependency
- OpenTelemetry support including pre-configured distros
Cons:
- ClickHouse is the only supported DBMS
SigNoz
SigNoz is an open-source application performance monitoring (APM) tool. It helps developers monitor their applications and troubleshoot problems.
SigNoz provides a unified UI for metrics and traces so that there is no need to switch between different tools such as Jaeger and Prometheus.
Tech stack:
- Backend: Go
- Frontend: React
- Instrumentation: OpenTelemetry / OTLP
- Storage: ClickHouse
Pros:
- Native OpenTelemetry support
- Rich UI with charts
- Metrics support using Prometheus as a backend and custom UI
- Trace visualization using flame graphs and Gantt charts
- Filters based on tags, status codes, service names, operation, etc.
- Alarms
Jaeger
Jaeger is a distributed tracing platform created by Uber Technologies. It can be used for monitoring microservices-based distributed systems.
Tech stack:
- Backend: Go
- Frontend: React
- Instrumentation: OpenTelemetry / OTLP
- Storage: Cassandra, Elasticsearch; ClickHouse using a plugin
Pros:
- Stable and well-known project
- Adaptive sampling
- Support for multiple DBMS via plugins
- Sponsored by CNCF
Cons:
- No charts / percentiles
- Limited filtering capabilities
- Not all plugins are maintained and usable
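Sampling, listed among Jaeger's pros, decides which traces are kept so that storage and overhead stay manageable. The sketch below shows only fixed-rate sampling keyed on the trace ID, a simpler technique than adaptive sampling; it is not Jaeger's algorithm. The function name and the use of the low 64 bits of a 32-character hex trace ID are assumptions for illustration.

```python
import random


def should_sample(trace_id: str, rate: float) -> bool:
    """Keep a trace if the low 64 bits of its hex ID fall in the lowest `rate` share of the ID space."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    # Deriving the decision from the trace ID means every service that sees
    # the same trace reaches the same decision without coordination.
    value = int(trace_id[-16:], 16)
    return value < rate * 2**64


# Hypothetical usage: sample roughly 10% of randomly generated trace IDs.
rng = random.Random(42)
trace_ids = [f"{rng.getrandbits(128):032x}" for _ in range(10_000)]
kept = sum(should_sample(tid, 0.10) for tid in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

An adaptive sampler would additionally adjust `rate` per service or endpoint based on observed traffic, which this sketch does not attempt.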
Sentry
Sentry tracks your software performance, measures metrics like throughput and latency, and displays the impact of errors across multiple systems.
Tech stack:
- Backend: Python
- Frontend: React
- Instrumentation: Sentry SDK
- Storage: Kafka, Redis, PostgreSQL, ClickHouse
Pros:
- Excellent error monitoring
- Quality SDKs for Go, Python, Ruby, .NET, and PHP
- Friendly UI
Cons:
- Complex setup
- No OpenTelemetry support
- The UI is built around error monitoring
SkyWalking
SkyWalking is an open source APM system that provides monitoring, tracing, and diagnostic capabilities for distributed systems in cloud native architectures.
Tech stack:
- Backend: Java
- Frontend: Vue.js
- Instrumentation: SkyWalking
- Storage: Elasticsearch, MySQL, TiDB, InfluxDB, and more
Pros:
- Rich UI with charts
- Good metrics support (including dashboards)
- Alarms
- Support for multiple DBMS
Cons:
- Complex setup
- Complex and overloaded UI
- Confusing tracing UI
- OpenTelemetry support requires OpenTelemetry Collector
Zipkin
Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in service architectures. Features include both the collection and lookup of this data.
Zipkin's UI is minimalistic, but you can replace it with Grafana or Kibana configured to work with a Zipkin data source.
Tech stack:
- Backend: Java
- Frontend: React
- Instrumentation: Zipkin span model; OpenTelemetry via adapter
- Storage: MySQL, Cassandra, or Elasticsearch
Pros:
- Stable and well-known project
- Support for multiple DBMS
Cons:
- No active development
- Limited UI and filtering capabilities
- OpenTelemetry support requires an adapter
- No ClickHouse support
Grafana Tempo
Grafana Tempo is an open source, easy-to-use, and high-scale distributed tracing backend. Tempo is cost-efficient, requiring only object storage to operate, and is deeply integrated with Grafana, Prometheus, and Loki. Tempo can ingest common open source tracing protocols, including Jaeger, Zipkin, and OpenTelemetry.
Tech stack:
- Backend: Go
- Frontend: React
- Instrumentation: OpenTelemetry / OTLP
- Storage: Object storage
Pros:
- Integration with Grafana metrics dashboard
- OpenTelemetry support
Cons:
- The UI is built around metrics and feels awkward for everything else
- Limited filtering capabilities
Paid cloud tracing tools
If you are looking for a paid tracing tool in the cloud, see our guide to Datadog competitors and alternatives.