Go Context timeouts can be harmful
You should probably avoid context.WithTimeout and context.WithDeadline in code that makes network calls. Here is why.
Using context for cancellation
Typically, context.Context is used to cancel operations like this:
package main

import (
	"context"
	"fmt"
	"time"
)

func main() {
	ctx := context.Background()
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	select {
	case <-ctx.Done():
		fmt.Println(ctx.Err())
		fmt.Println("cancelling...")
	}
}
Later, you can pass such a context to, for example, a Redis client:
import "github.com/go-redis/redis/v8"
rdb := redis.NewClient(...)
ctx := context.Background()
ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
defer cancel()
val, err := rdb.Get(ctx, "redis-key").Result()
At first glance, the code above works fine. But what happens when the rdb.Get operation exceeds the timeout?
Context deadline exceeded
When the context is cancelled, go-redis and most other database clients (including database/sql) must do the following:
- Close the connection, because it can't be safely reused.
- Open a new connection.
- Perform TLS handshake using the new connection.
- Optionally, pass some authentication checks, for example, using the Redis AUTH command.
Effectively, your application no longer uses the connection pool, which makes each operation slower and increases the chance of exceeding the timeout again. The result can be disastrous.
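The effect on a connection pool can be illustrated with a toy model. The sketch below is not how go-redis is implemented: the pool type, serve, roundTrip, and dialsFor are hypothetical names, net.Pipe stands in for a real TCP connection, and the timeout is applied as a connection deadline. The pool is also not safe for concurrent use. It only counts how often a new connection has to be dialed.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// pool is a tiny, hypothetical connection pool that counts its dials.
type pool struct {
	idle  []net.Conn
	dials int
}

// get reuses an idle connection or dials a new one.
func (p *pool) get() net.Conn {
	if n := len(p.idle); n > 0 {
		cn := p.idle[n-1]
		p.idle = p.idle[:n-1]
		return cn
	}
	p.dials++
	client, server := net.Pipe() // in-memory stand-in for a TCP connection
	go serve(server)
	return client
}

// put keeps a healthy connection and closes one that failed, because a
// late reply could still arrive on it and be mistaken for the next one.
func (p *pool) put(cn net.Conn, err error) {
	if err != nil {
		cn.Close()
		return
	}
	p.idle = append(p.idle, cn)
}

// serve echoes one byte at a time after a simulated 20ms of server latency.
func serve(cn net.Conn) {
	defer cn.Close()
	buf := make([]byte, 1)
	for {
		if _, err := cn.Read(buf); err != nil {
			return
		}
		time.Sleep(20 * time.Millisecond)
		if _, err := cn.Write(buf); err != nil {
			return
		}
	}
}

// roundTrip sends one byte and waits for the echo, all within timeout.
func roundTrip(cn net.Conn, timeout time.Duration) error {
	if err := cn.SetDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	if _, err := cn.Write([]byte{'x'}); err != nil {
		return err
	}
	_, err := cn.Read(make([]byte, 1))
	return err
}

// dialsFor runs ops sequential round trips and reports how many dials
// the pool needed.
func dialsFor(timeout time.Duration, ops int) int {
	p := &pool{}
	for i := 0; i < ops; i++ {
		cn := p.get()
		p.put(cn, roundTrip(cn, timeout))
	}
	for _, cn := range p.idle {
		cn.Close()
	}
	return p.dials
}

func main() {
	fmt.Println("generous timeout:", dialsFor(time.Second, 5), "dial(s)")
	fmt.Println("tight timeout:   ", dialsFor(5*time.Millisecond, 5), "dial(s)")
}
```

With a timeout well above the simulated latency, one connection should serve every operation. With a timeout below it, every operation fails and forces a fresh dial, so the pool gives no benefit.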
Technically, this problem is not caused by context.Context itself: using small deadlines with net.Conn can cause similar issues. But because context.Context imposes a single timeout on all operations that use the context, each individual operation effectively gets an unpredictable timeout that depends on how long the previous operations took.
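To make the shrinking budget concrete, here is a minimal sketch that measures how much time is left on a shared context before and after a simulated slow call. The helper names remaining and sharedBudgets are made up for this illustration, and time.Sleep stands in for a network call.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// remaining reports how much of the context's deadline is left.
// It returns 0 when the context has no deadline.
func remaining(ctx context.Context) time.Duration {
	deadline, ok := ctx.Deadline()
	if !ok {
		return 0
	}
	return time.Until(deadline)
}

// sharedBudgets runs a simulated operation under one shared timeout and
// returns the budget seen before it started and the budget left for the
// operation that would follow it.
func sharedBudgets(total, firstOpCost time.Duration) (first, second time.Duration) {
	ctx, cancel := context.WithTimeout(context.Background(), total)
	defer cancel()

	first = remaining(ctx)
	time.Sleep(firstOpCost) // stand-in for a slow network call
	second = remaining(ctx)
	return first, second
}

func main() {
	first, second := sharedBudgets(200*time.Millisecond, 50*time.Millisecond)
	fmt.Println("budget for op1:", first.Round(10*time.Millisecond))
	fmt.Println("budget for op2:", second.Round(10*time.Millisecond))
}
```

The second operation never sees the full timeout. Its budget is whatever the first one happened to leave behind.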
What to do instead?
Your first option is to use fixed net.Conn deadlines:
var cn net.Conn
cn.SetDeadline(time.Now().Add(3 * time.Second))
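As a rough, self-contained illustration of fixed deadlines, the sketch below resets the read deadline before every call, so each read gets the same full timeout no matter how long earlier reads took. The helpers readByte and silentPeerTimeouts are hypothetical, and net.Pipe stands in for a real network connection whose peer never answers.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// readByte reads one byte with a fresh, fixed deadline for this call only.
func readByte(cn net.Conn, timeout time.Duration) error {
	if err := cn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	_, err := cn.Read(make([]byte, 1))
	return err
}

// silentPeerTimeouts performs n reads against a peer that never writes and
// counts how many of them ended with a deadline error.
func silentPeerTimeouts(n int, timeout time.Duration) int {
	client, server := net.Pipe() // in-memory stand-in for a TCP connection
	defer client.Close()
	defer server.Close()

	timeouts := 0
	for i := 0; i < n; i++ {
		if err := readByte(client, timeout); errors.Is(err, os.ErrDeadlineExceeded) {
			timeouts++
		}
	}
	return timeouts
}

func main() {
	fmt.Println("timed-out reads:", silentPeerTimeouts(3, 50*time.Millisecond))
}
```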
With go-redis, you can use the ReadTimeout and WriteTimeout options, which control net.Conn deadlines:
rdb := redis.NewClient(&redis.Options{
	ReadTimeout:  3 * time.Second,
	WriteTimeout: 3 * time.Second,
})
Alternatively, you can use a separate context timeout for each operation:
ctx1, cancel1 := context.WithTimeout(context.Background(), time.Second)
defer cancel1()
op1(ctx1)
ctx2, cancel2 := context.WithTimeout(context.Background(), time.Second)
defer cancel2()
op2(ctx2)
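Because context.WithTimeout returns both a context and a cancel function, the per-operation pattern is easier to reuse behind a small helper. The following is one possible sketch: withTimeout, sleepOp, and demo are hypothetical names, and sleepOp merely simulates a call that respects cancellation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// withTimeout runs op with its own timeout derived from parent and releases
// the timer as soon as op returns.
func withTimeout(parent context.Context, d time.Duration, op func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, d)
	defer cancel()
	return op(ctx)
}

// sleepOp simulates a call that takes d and honours cancellation.
func sleepOp(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		timer := time.NewTimer(d)
		defer timer.Stop()
		select {
		case <-timer.C:
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

// demo runs two operations back to back, then one that is too slow.
func demo() (err1, err2, errSlow error) {
	ctx := context.Background()
	// Each call takes 150ms and gets its own 250ms budget, so both can
	// succeed even though together they take longer than 250ms.
	err1 = withTimeout(ctx, 250*time.Millisecond, sleepOp(150*time.Millisecond))
	err2 = withTimeout(ctx, 250*time.Millisecond, sleepOp(150*time.Millisecond))
	// A call that overruns its own budget still fails.
	errSlow = withTimeout(ctx, 50*time.Millisecond, sleepOp(time.Second))
	return err1, err2, errSlow
}

func main() {
	err1, err2, errSlow := demo()
	fmt.Println("op1:", err1)
	fmt.Println("op2:", err2)
	fmt.Println("slow op timed out:", errors.Is(errSlow, context.DeadlineExceeded))
}
```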
You should also avoid timeouts shorter than 1 second, because they have the same problem. If you must meet an SLA no matter what, you can make sure a response is generated in time but let the operation continue in the background:
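func handler(w http.ResponseWriter, req *http.Request) {
	// Process asynchronously in a goroutine.
	ch := process(req)

	select {
	case res := <-ch:
		// Success: the result arrived in time.
		fmt.Fprint(w, res)
	case <-time.After(time.Second):
		// Unknown result: respond now and let the operation finish in background.
	}
}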
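The handler leaves process undefined. One way it could look is sketched below, with the HTTP parts removed so the example runs on its own. The signatures of process, slow, and waitFor are assumptions made for this illustration. The important detail is the buffered channel: it lets the goroutine deliver its result and exit even after the caller has stopped waiting.

```go
package main

import (
	"fmt"
	"time"
)

// process starts work in a goroutine and returns a channel for its result.
// The channel has a buffer of one so the send never blocks, even if nobody
// is receiving any more.
func process(work func() string) <-chan string {
	ch := make(chan string, 1)
	go func() { ch <- work() }()
	return ch
}

// slow returns a unit of work that takes d and then yields v.
func slow(d time.Duration, v string) func() string {
	return func() string {
		time.Sleep(d)
		return v
	}
}

// waitFor returns the result if it arrives within d, and ok=false otherwise.
func waitFor(ch <-chan string, d time.Duration) (res string, ok bool) {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case res = <-ch:
		return res, true
	case <-timer.C:
		return "", false
	}
}

func main() {
	ch := process(slow(200*time.Millisecond, "done"))

	// The caller gives up after 50ms, as the handler would to meet its SLA.
	_, ok := waitFor(ch, 50*time.Millisecond)
	fmt.Println("answered in time:", ok)

	// The work still completes in the background.
	res, ok := waitFor(ch, time.Second)
	fmt.Println("finished later:", res, ok)
}
```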
Also see Context deadline exceeded.