Description
Expected Behavior
When restarting the Redis server, due to the default of MaxRetries: 3, I'd expect go-redis to attempt to reconnect to the server and retry the query.
Current Behavior
If there are enough old connections in the connection pool, all retry attempts are done on broken connections and an error is returned without even attempting to reconnect.
Possible Solution
Some options:
- Perform a ping on the connection before reusing it (adds latency if the connection is fine; see the sketch after this list)
- Don't count retries on connections from the pool towards the retry limit (might lead to a huge number of retries for a single query)
- Clear the pool on errors (might be overly pessimistic)
- Always use a fresh connection for retries
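Purely to make the first option concrete, here is a rough sketch of where such a liveness probe would sit and what it costs; the pool and pooledConn interfaces below are hypothetical stand-ins, not go-redis internals.

package sketch

import "context"

// Hypothetical, minimal view of a connection pool; not the go-redis API.
type pooledConn interface {
	Ping(ctx context.Context) error
}

type pool interface {
	// Get returns a connection and reports whether it was reused from the pool.
	Get(ctx context.Context) (cn pooledConn, reused bool, err error)
	Remove(cn pooledConn)
}

// getHealthy sketches option 1: probe reused connections with a cheap ping
// before handing them out and discard any that fail the probe. The price is
// one extra round trip per command even when the connection is fine.
func getHealthy(ctx context.Context, p pool) (pooledConn, error) {
	for {
		cn, reused, err := p.Get(ctx)
		if err != nil {
			return nil, err
		}
		if !reused {
			return cn, nil // freshly dialed connections need no probe
		}
		if pingErr := cn.Ping(ctx); pingErr != nil {
			p.Remove(cn) // stale (e.g. killed by a restart): drop it and try the next one
			continue
		}
		return cn, nil
	}
}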
Steps to Reproduce
The following code reproduces the issue. Instead of restarting the server, it sets a dialer that simply closes each connection after 5 seconds.
package main

import (
	"context"
	"log"
	"net"
	"time"

	"github.com/go-redis/redis/v8"
)

func main() {
	ctx := context.Background()

	// Redis client with a dialer that kills connections after 5 seconds (to simulate a server restart).
	client := redis.NewClient(&redis.Options{
		Dialer: func(ctx context.Context, network, addr string) (net.Conn, error) {
			log.Print("Dialer called")
			conn, err := net.Dial(network, addr)
			if err == nil {
				go func() {
					time.Sleep(5 * time.Second)
					conn.Close()
				}()
			}
			return conn, err
		},
	})

	// Function to ping the server and log errors if they occur.
	ping := func() {
		_, err := client.Ping(ctx).Result()
		if err != nil {
			log.Print(err)
		}
	}

	// Perform some pings to fill the connection pool.
	for i := 0; i < 10; i++ {
		go ping()
	}

	// Wait for connections to die.
	time.Sleep(10 * time.Second)

	// Perform another ping and log pool stats before and after.
	log.Printf("%#v", client.PoolStats())
	ping()
	log.Printf("%#v", client.PoolStats())
}
Example output (note that the dialer is not called for the last ping and the hit count in the pool stats increases by 4, i.e. the initial attempt and all 3 retries were done using stale connections from the pool):
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:40:50 Dialer called
2021/04/23 13:41:00 &redis.PoolStats{Hits:0x0, Misses:0xa, Timeouts:0x0, TotalConns:0xa, IdleConns:0xa, StaleConns:0x0}
2021/04/23 13:41:00 set tcp 127.0.0.1:45526: use of closed network connection
2021/04/23 13:41:00 &redis.PoolStats{Hits:0x4, Misses:0xa, Timeouts:0x0, TotalConns:0x6, IdleConns:0x6, StaleConns:0x0}
You get similar behavior when you restart Redis at the right time instead of using this dialer. In this case, the error will be EOF instead.
When adding MaxRetries: 20 to the redis.Options, the last 3 lines of the output look like this instead (note that there were 10 hits and then 1 miss that called the dialer):
2021/04/23 13:48:46 &redis.PoolStats{Hits:0x0, Misses:0xa, Timeouts:0x0, TotalConns:0xa, IdleConns:0xa, StaleConns:0x0}
2021/04/23 13:48:49 Dialer called
2021/04/23 13:48:49 &redis.PoolStats{Hits:0xa, Misses:0xb, Timeouts:0x0, TotalConns:0x1, IdleConns:0x1, StaleConns:0x0}
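For reference, that variant only changes the client construction in the reproducer; a minimal sketch, assuming the connection-killing dialer closure from above is stored in a dialer variable:

client := redis.NewClient(&redis.Options{
	Dialer:     dialer, // same connection-killing dialer as in the reproducer
	MaxRetries: 20,     // more retries than there are stale connections in the pool
})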
Context (Environment)
I want my application to gracefully handle Redis restarts and had hoped that this was handled automatically by go-redis due to the default of MaxRetries: 3.
Activity
vmihailenco commented on Apr 23, 2021
All these options look worse than the current behavior. Even with large pools like 100 connections, we are talking only about 100/3 = 33 failed commands.
Al2Klimov commented on Apr 23, 2021
Could you please explain why exactly you don't prefer them, especially the first one?
Why not check whether a connection is sane before using it "in production"?
vmihailenco commented on Apr 23, 2021
Because that means executing a ping command for each command we want to process. That means 2x latency and 2x the number of commands.
Al2Klimov commented on Apr 23, 2021
And the last suggested solution?
If and only if something went wrong and we have to retry, do it with a new connection to avoid the risk of this issue?
vmihailenco commented on Apr 25, 2021
That is the most reasonable option, if you can provide arguments why we should add some special logic to handle retries. As I said, we are talking about 100/3 = 33 failed commands. In practice even that is an exaggeration, because there is a small delay before a command is retried. So it is unlikely that the command will be retried using bad connections.
julianbrost commented on Apr 26, 2021
Maybe I don't get the purpose of the retry functionality in go-redis. I'd expect it to handle situations like a Redis restart where all existing connections fail simultaneously, so that I don't have to handle this in my application code. The current implementation leads to errors returned from go-redis in situations where I would not expect them.
That's only the case if you perform lots of concurrent actions on the pool. In situations where only a few goroutines interact with Redis, it is quite likely that you receive an error from go-redis even though an attempt on a fresh connection would have worked perfectly fine. Reducing the pool size is not an option if the amount of concurrent Redis queries changes over time.
Al2Klimov commented on Apr 26, 2021
Well... unlikely != impossible. So I'd like to reduce 33, or even 3, to 0.
Pro: on Redis restart, commands are not just unlikely, but definitely not retried using bad connections. This increases app stability. Also, apps don't have to use pool size + x for max retries, and Redis restart recoveries don't take much longer.
Contra: you possibly re-dial unnecessarily, but if one connection breaks, the others likely do as well (Redis restart or similar), so this should not hurt much.
Shall I make a PR?
vmihailenco commented on May 5, 2021
This is what database/sql does - https://github.com/golang/go/blob/master/src/database/sql/sql.go#L1808-L1816. You will have to
It is going to be non-trivial, but feel free to give it a try :)
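For context, the retry loop in the linked database/sql code boils down to roughly the following pattern (a simplified paraphrase, not the actual implementation): try a cached/pooled connection a bounded number of times, and if every attempt fails with driver.ErrBadConn, make one last attempt that forces a brand-new connection.

package sketch

import (
	"database/sql/driver"
	"errors"
)

const maxBadConnRetries = 2

// retry paraphrases the database/sql pattern: op is told whether it may reuse
// a cached connection or must dial a new one.
func retry(op func(forceNewConn bool) error) error {
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		err = op(false) // reuse a cached/pooled connection if one is available
		if !errors.Is(err, driver.ErrBadConn) {
			return err
		}
	}
	// Every cached attempt hit a bad connection: last resort on a fresh connection.
	return op(true)
}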
Al2Klimov commented on May 5, 2021
And the much simpler suggestion (#1737 (comment)) is not an option for you?
vmihailenco commented on May 5, 2021
database/sql does a similar thing, but only as an extra last retry. The implementation cost will not be any different, and it will match the database/sql behavior, which is a nice thing.
Al2Klimov commented on May 5, 2021
Problems aren’t stop signs, but signposts (for me). :)
Al2Klimov commented on May 5, 2021
Hm... the command processing path is kinda complex (cmdable -> process -> _process -> withConn -> With* -> ...).
Where would you like to see the check of whether the just-returned error is X/Y/Z, along with returning ErrBadConn?
I'd rather not just shoot in the dark by myself, as you may say afterwards: err... no.
Or: do we even need that and ErrBadConn itself? Can't we just say retryable error -> bad conn?
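To make that question concrete, a classification along those lines might look something like the helper below; the error set is taken from the symptoms in this issue (EOF, "use of closed network connection") and is an assumption, not what go-redis does internally.

package sketch

import (
	"errors"
	"io"
	"net"
	"syscall"
)

// isBadConn is a hypothetical classifier: errors that strongly suggest the
// pooled connection itself is dead, so any retry should use a fresh connection.
func isBadConn(err error) bool {
	return errors.Is(err, io.EOF) || // server closed the connection (e.g. restart)
		errors.Is(err, net.ErrClosed) || // "use of closed network connection"
		errors.Is(err, syscall.ECONNRESET) ||
		errors.Is(err, syscall.EPIPE)
}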
ryanschneider commented on Jun 25, 2021
FWIW I can very reliably reproduce this issue on AWS ElastiCache using a small Redis instance w/ primary and replica in non-clustered mode with the default parameter group. Importantly, this sets close-on-replica-write to true. I then have multiple goroutines reading and writing from the Redis server and initiate a failover. Invariably this leads to read-only commands like HKEYS failing with an EOF error, presumably since they are reusing connections that were closed due to the close-on-replica-write setting but left in the pool.
ryanschneider commented on Jun 25, 2021
What about adding a user-supplied OnReuseConnection hook, like OnConnect, that is called before a connection is reused from the pool?
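For comparison, the hook that exists today is OnConnect, which only fires when a new connection is dialed; that gap is exactly what the proposed OnReuseConnection hook would fill. A minimal sketch of the existing v8 hook, reusing the imports from the reproducer above:

client := redis.NewClient(&redis.Options{
	// OnConnect runs once per newly dialed connection, not for connections
	// handed back out of the pool.
	OnConnect: func(ctx context.Context, cn *redis.Conn) error {
		log.Print("new connection established")
		return cn.Ping(ctx).Err() // reject the new connection if it is not usable
	},
})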
config.Redis#NewClient(): work-around redis/go-redis#1737
github-actions commented on Aug 5, 2025
This issue has been automatically marked as stale due to inactivity.
It will be closed in 30 days if no further activity occurs.
If you believe this issue is still relevant, please add a comment to keep it open.
ndyakov commented on Aug 6, 2025
The connection health is checked; we have to verify if this is sufficient. @julianbrost if you are still experiencing this with 9.12.0, let us know.
julianbrost commented on Aug 19, 2025
Just tried my reproducer with v9.12.1 and now the dialer is called again so a fresh connection is created. Output now looks like this:
So looks like it's fixed, thank you!