Connection health checks completely ineffective for TLS connections

This issue reports a bug in the health check logic for TLS connections, specifically in the `connCheck` function in `internal/pool/conn_check.go` ([here](https://github.com/redis/go-redis/blob/f752b9a9d5cc158381c2ffe5b13c531037426b39/internal/pool/conn_check.go#L19)). When the server disconnects clients, the bug causes the retry count to be exhausted unintentionally, and commands ultimately fail.

`go-redis` does not have this problem with non-TLS connections.

## Expected Behavior

Connection health checks are intended to work as follows. When [a connection is picked from the pool, a health check is made](https://github.com/redis/go-redis/blob/f752b9a9d5cc158381c2ffe5b13c531037426b39/internal/pool/pool.go#L259-L299). Bad connections are thrown away immediately, and the code keeps picking until a good connection is found. If all pooled connections are bad, a new connection is made. Throwing away a bad connection does not consume a retry. Only an error that occurs while sending a command on a picked connection [consumes a retry](https://github.com/redis/go-redis/blob/f752b9a9d5cc158381c2ffe5b13c531037426b39/redis.go#L400-L446).
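The expected behavior can be illustrated with a small model. This is only a sketch: `conn`, `pool`, `healthCheck`, and `get` are hypothetical stand-ins, not the types or functions in `internal/pool`, and the idle list is treated as a simple LIFO stack.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for a pooled connection.
type conn struct {
	id     int
	closed bool // true once the server has dropped the connection
}

// healthCheck plays the role of connCheck: nil means the connection is usable.
func healthCheck(c *conn) error {
	if c.closed {
		return errors.New("connection closed by peer")
	}
	return nil
}

// pool is a simplified idle-connection stack (LIFO).
type pool struct {
	idle   []*conn
	dialed int // number of fresh connections created
}

// get pops idle connections until one passes the health check.
// Bad connections are discarded here, so the caller never sees them
// and no command retry is spent on them.
func (p *pool) get() *conn {
	for len(p.idle) > 0 {
		c := p.idle[len(p.idle)-1]
		p.idle = p.idle[:len(p.idle)-1]
		if healthCheck(c) == nil {
			return c
		}
	}
	// All pooled connections were bad: dial a new one.
	p.dialed++
	return &conn{id: 1000 + p.dialed}
}

func main() {
	p := &pool{idle: []*conn{{id: 1}, {id: 2, closed: true}, {id: 3, closed: true}}}
	c := p.get()
	fmt.Println("picked connection:", c.id, "new dials:", p.dialed)
}
```

In this model the two dead connections are dropped while picking, and the caller receives the healthy one without any command having failed.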

## Current Behavior

When using TLS, the argument passed to the `connCheck` [function](https://github.com/redis/go-redis/blob/f752b9a9d5cc158381c2ffe5b13c531037426b39/internal/pool/conn_check.go#L19) is of type `*tls.Conn`. The `tls.Conn` [type](https://pkg.go.dev/crypto/tls#Conn) does not implement the `syscall.Conn` [interface](https://pkg.go.dev/syscall#Conn). As a result, the type assertion [here](https://github.com/redis/go-redis/blob/f752b9a9d5cc158381c2ffe5b13c531037426b39/internal/pool/conn_check.go#L19) always returns `ok` as `false`, so the health check is bypassed entirely for TLS connections. Bad connections in the pool are then used to send commands, which results in errors, and every bad connection consumes a retry.
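The failing type assertion can be reproduced with the standard library alone. In the sketch below, `implementsSyscallConn` and `sampleConns` are hypothetical helper names, and the connections are never used for I/O; only their types matter.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"syscall"
)

// implementsSyscallConn reports whether conn satisfies syscall.Conn,
// which is the kind of assertion the health check depends on.
func implementsSyscallConn(conn net.Conn) bool {
	_, ok := conn.(syscall.Conn)
	return ok
}

// sampleConns returns a plain TCP connection value and a TLS client
// wrapped around it. tls.Client only wraps the connection; the
// handshake is deferred until the first read or write, so no network
// I/O happens here.
func sampleConns() (plain, secured net.Conn) {
	plain = &net.TCPConn{}
	secured = tls.Client(plain, &tls.Config{})
	return plain, secured
}

func main() {
	plain, secured := sampleConns()
	fmt.Println("*net.TCPConn implements syscall.Conn:", implementsSyscallConn(plain))
	fmt.Println("*tls.Conn implements syscall.Conn:", implementsSyscallConn(secured))
}
```

The first assertion succeeds and the second does not, which matches the difference in behavior between non-TLS and TLS clients described in this issue.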

## Possible Solution
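One possible direction, offered only as a sketch and not as a confirmed fix, is to look through the `*tls.Conn` to the connection it wraps before making the `syscall.Conn` assertion. `unwrapForCheck` and `newTLSOverTCP` are hypothetical names; `(*tls.Conn).NetConn` is a standard library method available since Go 1.18.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"syscall"
)

// unwrapForCheck is a hypothetical helper, not part of go-redis. It returns
// the syscall.Conn that a health check could probe, looking through a
// *tls.Conn to the connection it wraps.
func unwrapForCheck(conn net.Conn) (syscall.Conn, bool) {
	if tlsConn, ok := conn.(*tls.Conn); ok {
		// NetConn (Go 1.18+) returns the wrapped connection.
		conn = tlsConn.NetConn()
	}
	sysConn, ok := conn.(syscall.Conn)
	return sysConn, ok
}

// newTLSOverTCP builds a *tls.Conn around a zero-value TCP connection.
// No handshake or network I/O happens here.
func newTLSOverTCP() net.Conn {
	return tls.Client(&net.TCPConn{}, &tls.Config{})
}

func main() {
	_, ok := unwrapForCheck(newTLSOverTCP())
	fmt.Println("TLS connection can be probed after unwrapping:", ok)
}
```

A caveat with this approach: the `crypto/tls` documentation warns that reading from or writing to the underlying connection directly corrupts the TLS session, so any probe on the unwrapped connection would need to avoid consuming data from a connection it then reports as healthy. Pending TLS-level records on an otherwise healthy connection could also be mistaken for unexpected data.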



## Steps to Reproduce


1. Set up a client to use TLS, with a pool size of 20 and a retry count of 4. The issue is exposed whenever the retry count is lower than the pool size.
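```go
// Assumes imports: "crypto/tls", "time", and "github.com/redis/go-redis/v9".
rdb := redis.NewClusterClient(&redis.ClusterOptions{
	Addrs:        []string{""}, // cluster address(es)
	Password:     "",
	PoolSize:     20,
	PoolFIFO:     false,
	MinIdleConns: 10,

	// Fewer retries than pooled connections exposes the issue.
	MaxRetries:      4,
	MinRetryBackoff: 8 * time.Millisecond,
	MaxRetryBackoff: 512 * time.Millisecond,

	TLSConfig: &tls.Config{
		InsecureSkipVerify: true,
		ServerName:         "your domain",
	},
})
```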

2. Run `client kill type normal` on Redis to kill all existing connections at once.
3. Observe command failures on the client side.
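Why the retry count matters relative to the pool size can be shown with a toy simulation. It rests on assumptions that are not taken from the go-redis source: a command is attempted `maxRetries + 1` times, each failed attempt uses up one dead pooled connection, and a working health check discards all dead connections while picking. `runCommand` is a hypothetical name.

```go
package main

import "fmt"

// runCommand simulates one command against a pool holding `stale` dead
// connections. It returns whether the command succeeded and how many
// attempts were spent.
func runCommand(stale, maxRetries int, healthCheck bool) (ok bool, attempts int) {
	for attempt := 0; attempt <= maxRetries; attempt++ {
		attempts++
		if healthCheck {
			stale = 0 // dead connections are discarded while picking
		}
		if stale == 0 {
			return true, attempts // a healthy or freshly dialed connection
		}
		stale-- // a dead connection was used; the command errors and is retried
	}
	return false, attempts
}

func main() {
	ok, n := runCommand(20, 4, false)
	fmt.Println("health check bypassed, 20 dead conns:", ok, n)

	ok, n = runCommand(20, 4, true)
	fmt.Println("health check working, 20 dead conns:", ok, n)

	ok, n = runCommand(3, 4, false)
	fmt.Println("health check bypassed, 3 dead conns:", ok, n)
}
```

Under these assumptions, 20 dead connections and 4 retries exhaust all attempts when the check is bypassed, while a pool with fewer dead connections than attempts eventually reaches a fresh connection.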

## Context (Environment)

Many cloud services that host Redis offer managed replacement of instances, during which connections on the old instance are killed in a batch. Because of this bug, such replacements cause command failures for TLS clusters, but not for non-TLS clusters.
