Go Context timeouts can be harmful

March 31, 2022


You should probably avoid context.WithTimeout and context.WithDeadline in code that makes network calls. Here is why.

Using context for cancellation

Typically, context.Context is used to cancel operations like this:

package main

import (
    "context"
    "fmt"
    "time"
)

func main() {
    ctx := context.Background()
    ctx, cancel := context.WithTimeout(ctx, time.Second)
    defer cancel()

    select {
    case <-ctx.Done():
        fmt.Println(ctx.Err())
        fmt.Println("cancelling...")
    }
}

Later, you can use such a context with, for example, a Redis client:

import "github.com/go-redis/redis/v8"

rdb := redis.NewClient(...)

ctx := context.Background()
ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
defer cancel()

val, err := rdb.Get(ctx, "redis-key").Result()

At first glance, the code above works fine. But what happens when the rdb.Get operation exceeds the timeout?

Context deadline exceeded

When the context is cancelled, go-redis and most other database clients (including database/sql) must do the following:

Close the connection, because it can't be safely reused.
Open a new connection.
Perform a TLS handshake on the new connection.
Optionally, pass some authentication checks, for example, using the Redis AUTH command.

Effectively, your application stops benefiting from the connection pool, which makes each operation slower and increases the chance of exceeding the timeout again. The result can be disastrous.
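To see why a connection cannot be safely reused after a timeout, consider the following minimal sketch. It is not Redis and not go-redis: it assumes a toy line-based protocol over net.Pipe, and serve, get, and demo are hypothetical names. The first request times out on the client, but its reply is still on its way, so the next request on the same connection reads the wrong answer.

```go
package main

import (
    "bufio"
    "errors"
    "fmt"
    "net"
    "os"
    "strings"
    "time"
)

// serve answers "GET <key>\n" requests strictly in order.
// Keys starting with "slow" take 150ms to answer.
func serve(conn net.Conn) {
    requests := make(chan string, 16)

    // Writer: replies to queued requests one at a time.
    go func() {
        for key := range requests {
            if strings.HasPrefix(key, "slow") {
                time.Sleep(150 * time.Millisecond)
            }
            if _, err := fmt.Fprintf(conn, "value-of-%s\n", key); err != nil {
                return
            }
        }
    }()

    // Reader: keeps accepting requests while the writer is busy.
    r := bufio.NewReader(conn)
    for {
        line, err := r.ReadString('\n')
        if err != nil {
            close(requests)
            return
        }
        requests <- strings.TrimSpace(strings.TrimPrefix(line, "GET "))
    }
}

// get sends one request and waits up to timeout for one reply line.
func get(conn net.Conn, r *bufio.Reader, key string, timeout time.Duration) (string, error) {
    if err := conn.SetDeadline(time.Now().Add(timeout)); err != nil {
        return "", err
    }
    if _, err := fmt.Fprintf(conn, "GET %s\n", key); err != nil {
        return "", err
    }
    line, err := r.ReadString('\n')
    if err != nil {
        return "", err
    }
    return strings.TrimSpace(line), nil
}

// demo reuses one connection after a timeout and returns the error of the
// first request plus whatever the second request received.
func demo() (firstErr error, second string, err error) {
    client, server := net.Pipe()
    defer client.Close()
    go serve(server)

    r := bufio.NewReader(client)
    _, firstErr = get(client, r, "slow-key", 50*time.Millisecond)
    second, err = get(client, r, "fast-key", time.Second)
    return firstErr, second, err
}

func main() {
    firstErr, second, err := demo()
    fmt.Println("first request timed out:", errors.Is(firstErr, os.ErrDeadlineExceeded)) // true
    fmt.Println("reply to GET fast-key:", second, err)                                   // value-of-slow-key <nil>
}
```

The only safe recovery in this sketch is to close the connection and dial a new one, which is the expensive path described above.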

Technically, this problem is not caused by context.Context itself: small net.Conn deadlines can cause similar issues. But because a context imposes a single timeout on all operations that use it, each individual operation effectively gets an unpredictable timeout that depends on how long the previous operations took.
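The shrinking budget is easy to observe. The following sketch uses a hypothetical helper, budgets, and simulates network calls with timers; it records how much time is left on a shared context when each operation starts.

```go
package main

import (
    "context"
    "fmt"
    "time"
)

// budgets runs n simulated operations under one shared context timeout and
// records how much time each operation had left when it started.
func budgets(total, opDuration time.Duration, n int) []time.Duration {
    ctx, cancel := context.WithTimeout(context.Background(), total)
    defer cancel()

    left := make([]time.Duration, 0, n)
    for i := 0; i < n; i++ {
        deadline, _ := ctx.Deadline()
        left = append(left, time.Until(deadline))

        // Simulated network call: takes opDuration unless the context expires first.
        select {
        case <-time.After(opDuration):
        case <-ctx.Done():
        }
    }
    return left
}

func main() {
    for i, d := range budgets(300*time.Millisecond, 100*time.Millisecond, 3) {
        fmt.Printf("op%d starts with about %v left\n", i+1, d.Round(10*time.Millisecond))
    }
}
```

With these numbers the third operation starts with roughly a third of the budget the first one had, even though all three calls are identical.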

What to do instead?

Your first option is to use fixed net.Conn deadlines:

var cn net.Conn
cn.SetDeadline(time.Now().Add(3 * time.Second))
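One way to apply a fixed deadline to every operation is to wrap the connection so that each Read and Write resets its own deadline. This is only a sketch: deadlineConn and readTimesOut are hypothetical names, and net.Pipe stands in for a real network connection.

```go
package main

import (
    "errors"
    "fmt"
    "net"
    "os"
    "time"
)

// deadlineConn gives every Read and Write its own fixed timeout instead of
// one deadline shared by all operations on the connection.
type deadlineConn struct {
    net.Conn
    timeout time.Duration
}

func (c *deadlineConn) Read(p []byte) (int, error) {
    if err := c.Conn.SetReadDeadline(time.Now().Add(c.timeout)); err != nil {
        return 0, err
    }
    return c.Conn.Read(p)
}

func (c *deadlineConn) Write(p []byte) (int, error) {
    if err := c.Conn.SetWriteDeadline(time.Now().Add(c.timeout)); err != nil {
        return 0, err
    }
    return c.Conn.Write(p)
}

// readTimesOut reads from a peer that never writes and reports whether the
// read ended with a deadline error.
func readTimesOut(timeout time.Duration) bool {
    client, server := net.Pipe()
    defer client.Close()
    defer server.Close()

    cn := &deadlineConn{Conn: client, timeout: timeout}
    _, err := cn.Read(make([]byte, 1))
    return errors.Is(err, os.ErrDeadlineExceeded)
}

func main() {
    fmt.Println(readTimesOut(50 * time.Millisecond)) // true
}
```

Because the deadline is recomputed on every call, a slow earlier operation does not eat into the time available to later ones.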

With go-redis, you can use the ReadTimeout and WriteTimeout options, which control net.Conn deadlines:

rdb := redis.NewClient(&redis.Options{
    ReadTimeout:  3 * time.Second,
    WriteTimeout: 3 * time.Second,
})

Alternatively, you can use a separate context timeout for each operation:

ctx := context.Background()
ctx1, cancel1 := context.WithTimeout(ctx, time.Second)
defer cancel1()
op1(ctx1)
ctx2, cancel2 := context.WithTimeout(ctx, time.Second)
defer cancel2()
op2(ctx2)
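A small helper keeps the per-operation pattern tidy and makes sure every cancel function is called. This is a sketch under assumptions: withTimeout, sleepOp, and runBoth are hypothetical names, and the operations are simulated with timers rather than real network calls.

```go
package main

import (
    "context"
    "errors"
    "fmt"
    "time"
)

// withTimeout runs op under its own timeout derived from parent and releases
// the timer as soon as op returns.
func withTimeout(parent context.Context, d time.Duration, op func(context.Context) error) error {
    ctx, cancel := context.WithTimeout(parent, d)
    defer cancel()
    return op(ctx)
}

// sleepOp simulates a network call that takes d unless its context expires first.
func sleepOp(d time.Duration) func(context.Context) error {
    return func(ctx context.Context) error {
        select {
        case <-time.After(d):
            return nil
        case <-ctx.Done():
            return ctx.Err()
        }
    }
}

// runBoth runs two simulated operations one after another, each with a fresh timeout.
func runBoth(timeout, d1, d2 time.Duration) (err1, err2 error) {
    ctx := context.Background()
    err1 = withTimeout(ctx, timeout, sleepOp(d1))
    err2 = withTimeout(ctx, timeout, sleepOp(d2))
    return err1, err2
}

func main() {
    // Each operation takes 120ms and gets its own 200ms timeout.
    // Under one shared 200ms context the second call would run out of time.
    err1, err2 := runBoth(200*time.Millisecond, 120*time.Millisecond, 120*time.Millisecond)
    fmt.Println(err1, err2) // <nil> <nil>

    // A slow first operation fails on its own without affecting the second.
    err1, err2 = runBoth(50*time.Millisecond, 200*time.Millisecond, 10*time.Millisecond)
    fmt.Println(errors.Is(err1, context.DeadlineExceeded), err2) // true <nil>
}
```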

You should also avoid timeouts smaller than 1 second, because they have the same problem. If you must meet an SLA no matter what, you can make sure to generate a response in time but let the operation continue in the background:

func handler(w http.ResponseWriter, req *http.Request) {
    // Process asynchronously in a goroutine.
    ch := process(req)

    select {
    case res := <-ch:
        // success: use the result, for example by writing it to the response
        fmt.Fprintln(w, res)
    case <-time.After(time.Second):
        // unknown result
    }
}
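For reference, here is a fuller, runnable sketch of the same idea. The details are assumptions rather than part of the snippet above: process takes a simulated work duration instead of the request, the handler answers 202 Accepted on timeout, and newHandler and statusFor are hypothetical names used to exercise it with httptest.

```go
package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
    "time"
)

// process is a stand-in for the real work. It runs in its own goroutine and
// delivers the result on a buffered channel, so the goroutine can finish even
// if nobody is waiting for the result any more.
func process(work time.Duration) <-chan string {
    ch := make(chan string, 1)
    go func() {
        time.Sleep(work) // simulated slow operation
        ch <- "done"
    }()
    return ch
}

// newHandler returns a handler that responds within limit no matter how long
// the work takes.
func newHandler(work, limit time.Duration) http.HandlerFunc {
    return func(w http.ResponseWriter, req *http.Request) {
        timer := time.NewTimer(limit)
        defer timer.Stop()

        select {
        case res := <-process(work):
            fmt.Fprintln(w, res)
        case <-timer.C:
            // The work keeps running in the background; its outcome is unknown.
            w.WriteHeader(http.StatusAccepted)
        }
    }
}

// statusFor calls the handler once and returns the HTTP status it produced.
func statusFor(work, limit time.Duration) int {
    rec := httptest.NewRecorder()
    newHandler(work, limit)(rec, httptest.NewRequest(http.MethodGet, "/", nil))
    return rec.Code
}

func main() {
    fmt.Println(statusFor(10*time.Millisecond, 200*time.Millisecond)) // 200
    fmt.Println(statusFor(300*time.Millisecond, 50*time.Millisecond)) // 202
}
```

The buffered channel matters here: with an unbuffered channel, the background goroutine would block forever on send once the handler has given up waiting.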
