Go Context timeouts can be harmful

You probably should avoid ctx.WithTimeout or ctx.WithDeadline with code that makes network calls. Here is why.

Golang context timeout

Using context for cancellation

Typically, context.Context is used to cancel operations like this:

package main

import (
	"context"
	"fmt"
	"time"
)

func main() {
	ctx := context.Background()
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	select {
	case <-ctx.Done():
		fmt.Println(ctx.Err())
		fmt.Println("cancelling...")
	}
}

Later, you can use such context with, for example, Redis client:

import "github.com/go-redis/redis/v8"

rdb := redis.NewClient(...)

ctx := context.Background()
ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
defer cancel()

val, err := rdb.Get(ctx, "redis-key").Result()

At first glance, the code above works fine. But what happens when rdb.Get operation exceeds the timeout?

Context deadline exceeded

When context is cancelled, go-redis and most other database clients (including database/sql) must do the following:

  1. Close the connection, because it can't be safely reused.
  2. Open a new connection.
  3. Perform TLS handshake using the new connection.
  4. Optionally, pass some authentication checks, for example, using Redis AUTH command.

Effectively, your application does not use the connection pool any more which makes each operation slower and increases the chance of exceeding the timeout again. The result can be disastrous.

Technically, this problem is not caused by context.Context and using small deadlines with net.Conn can cause similar issues. But because context.Context imposes a single timeout on all operations that use the context, each individual operation has a random timeout which depends on timings of previous operations.

What to do instead?

Your first option is to use fixed net.Conn deadlines:

var cn net.Conn
cn.SetDeadline(time.Now().Add(3 * time.Second))

With go-redisopen in new window, you can use ReadTimeout and WriteTimeout options which control net.Conn deadlines:

rdb := redis.NewClient(&redis.Options{
    ReadTimeout:  3 * time.Second,
    WriteTimeout: 3 * time.Second,
})

Alternatively, you can also use a separate context timeout for each operation:

ctx := context.Background()
op1(ctx.WithTimeout(ctx, time.Second))
op2(ctx.WithTimeout(ctx, time.Second))

You should also avoid timeouts smaller than 1 second, because they have the same problem. If you must deliver a SLA no matter what, you can make sure to generate a response in time but let the operation to continue in background:

func handler(w http.ResponseWriter, req *http.Request) {
	// Process asynchronously in a goroutine.
	ch := process(req)

	select {
	case res := <-ch:
		// success
	case <-time.After(time.Second):
		// unknown result
	}
}

Also see Context deadline exceeded.

Last Updated:

Uptrace is an open source APM and DataDog alternative that supports OpenTelemetry traces, metrics, and logs. You can use it to monitor apps and set up alerts to receive notifications via email, Slack, Telegram, and more.

Uptrace can process billions of spans on a single server, allowing you to monitor your software at 10x less cost. View on GitHub →

Uptrace Demo