Skip to content

conncheck for badconn? #1821

@hidu

Description

@hidu
Contributor

Issue tracker is used for reporting bugs and discussing new features. Please use
stackoverflow for supporting issues.

when redis server restart, the conns in Pooler can not rw,
they well occur io.EOF error

we can check the bad conns like this:
https://github.com/go-sql-driver/mysql/blob/master/conncheck.go

Expected Behavior

Pooler.Get(context.Context) (*Conn, error)
can remove the bad conn

Current Behavior

Pooler.Get return a bad conn, it occur io.EOF error when we rw it

Possible Solution

check the conn before we use:
https://github.com/go-sql-driver/mysql/blob/master/conncheck.go

Steps to Reproduce

Context (Environment)

Detailed Description

Possible Implementation

Activity

vmihailenco

vmihailenco commented on Jul 14, 2021

@vmihailenco
Collaborator

@hidu looks pretty neat!

monkey92t

monkey92t commented on Jul 14, 2021

@monkey92t
Collaborator

After testing, it will increase the CPU time by about 5-10%. It looks good.

Are you interested in submitting a PR?

hidu

hidu commented on Jul 14, 2021

@hidu
ContributorAuthor

i'll do it

linked a pull request that will close this issueConnPool check fd for bad conns #1824on Jul 19, 2021
added a commit that references this issue on Jul 20, 2021
bingzhuo2008

bingzhuo2008 commented on Aug 5, 2021

@bingzhuo2008

With this feature, my connections are closed and re created again and again as syscall.Read always said "resource temporarily unavailable"

bingzhuo2008

bingzhuo2008 commented on Aug 5, 2021

@bingzhuo2008

I just fallbacked to previous version v8.11.0, and my connections stopped re creating. I'm running it in docker debian. Please test this code in docker before next version published, thank you ! @monkey92t @hidu

monkey92t

monkey92t commented on Aug 5, 2021

@monkey92t
Collaborator

I just fallbacked to previous version v8.11.0, and my connections stopped re creating. I'm running it in docker debian. Please test this code in docker before next version published, thank you ! @monkey92t @hidu

I ran a simple test in debian and it seems to work well (v8.11.1), can you provide the test code and debian version?

bingzhuo2008

bingzhuo2008 commented on Aug 5, 2021

@bingzhuo2008

I just fallbacked to previous version v8.11.0, and my connections stopped re creating. I'm running it in docker debian. Please test this code in docker before next version published, thank you ! @monkey92t @hidu

I ran a simple test in debian and it seems to work well (v8.11.1), can you provide the test code and debian version?

I'm using redis cluster which has 20 master nodes and 20 slave nodes, in k8s docker, Debian 4.9.144-3 (2019-02-02) x86_64 GNU/Linux(I guess it's not debian who caused the problemn, maybe it's docker?)

When the problemn happened, there are always about 200+ TIME_WAIT connections in the container beside new ESTABLISHED connections. I printed err of "syscall.Read" in conncheck.go and it said "resource temporarily unavailable".

But with v8.11.0 and the same code there is not the problemn.

import (
    gredis "github.com/go-redis/redis/v8"
)
 
type RedisCluster struct {
	client *gredis.ClusterClient
}

type RedisClusterConf struct {
	redis.Config
	Backends []string
}

func NewRedisCluster(c *RedisClusterConf) *RedisCluster {
	r := gredis.NewClusterClient(&gredis.ClusterOptions{
		Addrs:              c.Backends,
		DialTimeout:        time.Duration(c.DialTimeout),
		ReadTimeout:        time.Duration(c.ReadTimeout),
		WriteTimeout:       time.Duration(c.WriteTimeout),
		IdleTimeout:        time.Duration(c.IdleTimeout),
		PoolTimeout:        time.Duration(c.WaitTimeout),
		PoolSize:           c.Active,
		MinIdleConns:       c.Idle,
		IdleCheckFrequency: -1,
	})
	return &RedisCluster{
		client: r,
	}
}

func (r *RedisCluster) GetValue(ctx context.Context, key string) (data []byte, err error) {
	st := time.Now()
	data, err = r.client.Get(ctx, key).Bytes()
	if err == gredis.Nil {
		err = nil
	}
	if err != nil {
		var sb util.StringBuilder
		log.Errorv(ctx, log.KVString("log", sb.AppendStrings("redis get err:", err.Error()).ToString()))
	}
	return
}
monkey92t

monkey92t commented on Aug 5, 2021

@monkey92t
Collaborator

I just fallbacked to previous version v8.11.0, and my connections stopped re creating. I'm running it in docker debian. Please test this code in docker before next version published, thank you ! @monkey92t @hidu

I ran a simple test in debian and it seems to work well (v8.11.1), can you provide the test code and debian version?

I'm using redis cluster which has 20 master nodes and 20 slave nodes, in k8s docker, Debian 4.9.144-3 (2019-02-02) x86_64 GNU/Linux(I guess it's not debian who caused the problemn, maybe it's docker?)

When the problemn happened, there are always about 200+ TIME_WAIT connections in the container beside new ESTABLISHED connections. I printed err of "syscall.Read" in conncheck.go and it said "resource temporarily unavailable".

But with v8.11.0 and the same code there is not the problemn.

import (
    gredis "github.com/go-redis/redis/v8"
)
 
type RedisCluster struct {
	client *gredis.ClusterClient
}

type RedisClusterConf struct {
	redis.Config
	Backends []string
}

func NewRedisCluster(c *RedisClusterConf) *RedisCluster {
	r := gredis.NewClusterClient(&gredis.ClusterOptions{
		Addrs:              c.Backends,
		DialTimeout:        time.Duration(c.DialTimeout),
		ReadTimeout:        time.Duration(c.ReadTimeout),
		WriteTimeout:       time.Duration(c.WriteTimeout),
		IdleTimeout:        time.Duration(c.IdleTimeout),
		PoolTimeout:        time.Duration(c.WaitTimeout),
		PoolSize:           c.Active,
		MinIdleConns:       c.Idle,
		IdleCheckFrequency: -1,
	})
	return &RedisCluster{
		client: r,
	}
}

func (r *RedisCluster) GetValue(ctx context.Context, key string) (data []byte, err error) {
	st := time.Now()
	data, err = r.client.Get(ctx, key).Bytes()
	if err == gredis.Nil {
		err = nil
	}
	if err != nil {
		var sb util.StringBuilder
		log.Errorv(ctx, log.KVString("log", sb.AppendStrings("redis get err:", err.Error()).ToString()))
	}
	return
}

Can you provide the docker image tag used? For example: debian:stretch-slim

monkey92t

monkey92t commented on Aug 5, 2021

@monkey92t
Collaborator

It is difficult for me to achieve the same test environment as yours, I hope you can provide the debian mirror address.

I did not find any problems in the debian:latest version.

monkey92t

monkey92t commented on Aug 5, 2021

@monkey92t
Collaborator

I am not familiar with the debian system, and cannot find the 4.9.144-3 (2019-02-02) version you describe.

In addition, can you provide your contact information?

bingzhuo2008

bingzhuo2008 commented on Aug 6, 2021

@bingzhuo2008

I am not familiar with the debian system, and cannot find the 4.9.144-3 (2019-02-02) version you describe.

In addition, can you provide your contact information?

I have to ask other workmates for the docker image information. By the way, the redis cluster version is 4.0.12.
My e-mail is : bingzhuo2008@gmail.com
Or if you use QQ, my QQ is 45563946

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @vmihailenco@hidu@monkey92t@bingzhuo2008

      Issue actions

        conncheck for badconn? · Issue #1821 · redis/go-redis