Skip to content

Go SQL client attempts write against broken connections #529

@fewebahr

Description

@fewebahr

Issue description

Intermittently I get failed writes against MariaDB. Lines like the below show up in my logs and data that was supposed to be written is lost. I have not yet tried writing retry logic in my application.

Both my Go app and MariaDB are running in Docker containers. However, I am seeing these errors when they on the same host, connecting over a local Docker network.

Docker version: 1.10.3

Example code

func (z *Zim) saveJob(j *job, transactions ...*gorm.DB) error {

	var (
		transaction *gorm.DB
		err         error
		name, value string
		order       int
		line        string
	)

	switch len(transactions) {
	case 0:
		transaction = z.mariaDBHandle.Begin()
		err = z.saveJob(j, transaction)
		if err == nil {
			transaction.Commit()
		} else {
			transaction.Rollback()
		}
		return err
	case 1:
		transaction = transactions[0]
	default:
		panic("too many transaction handlers passed to saveJob")
	}

	if len(j.ID) == 0 {
		// create the job!

		// assign an ID
		j.ID = uuid.NewV4().String()

		err = transaction.Set("gorm:save_associations", false).Create(j).Error
		if err != nil {
			return err
		}

		// on creation, only save options
		for name, value = range j.Options {
			err = transaction.Set(
				"gorm:save_associations", false,
			).Create(&jobOption{
				ID:    uuid.NewV4().String(),
				JobID: j.ID,
				Name:  name,
				Value: value,
			}).Error
			if err != nil {
				return err
			}
		}
	} else {
		// update the job!

		err = transaction.Set("gorm:save_associations", false).Save(j).Error
		if err != nil {
			return err
		}

		// on update, save Stdout and Stderr lines
		for order, line = range j.Stdout {
			err = transaction.Set(
				"gorm:save_associations", false,
			).Create(&jobStdout{
				ID:    uuid.NewV4().String(),
				JobID: j.ID,
				Order: order,
				Line:  line,
			}).Error
			if err != nil {
				return err
			}
		}

		for order, line = range j.Stderr {
			err = transaction.Set(
				"gorm:save_associations", false,
			).Create(&jobStderr{
				ID:    uuid.NewV4().String(),
				JobID: j.ID,
				Order: order,
				Line:  line,
			}).Error
			if err != nil {
				return err
			}
		}
	}

	return nil
}

Error log

[mysql] 2016/12/14 22:56:20 packets.go:33: unexpected EOF
[mysql] 2016/12/14 22:56:20 packets.go:130: write tcp 172.29.0.16:55610->172.29.0.4:3306: write: broken pipe

Configuration

Driver version (or git SHA):

{
	"checksumSHA1": "xmTgJXWHguGKHPuZE50FIbs88L0=",
	"path": "github.com/go-sql-driver/mysql",
	"revision": "a0583e0143b1624142adab07e0e97fe106d99561",
	"revisionTime": "2016-12-01T11:50:36Z"
},

Go version: run go version in your console

$ go version
go version go1.7.3 darwin/amd64

Server version: E.g. MySQL 5.6, MariaDB 10.0.20

MariaDB: off-the-shelf container mariadb:10.1.19

Server OS: E.g. Debian 8.1 (Jessie), Windows 10

CoreOS 1122.3.0

Activity

methane

methane commented on Dec 15, 2016

@methane
Member
fewebahr

fewebahr commented on Dec 15, 2016

@fewebahr
Author

@methane thank you for the super-fast response! I've put that into my code. I'm going to let it run for a bit, and let you know how it goes!

julienschmidt

julienschmidt commented on Mar 27, 2017

@julienschmidt
Member

@RobertGrantEllis any update?

fewebahr

fewebahr commented on Mar 28, 2017

@fewebahr
Author

Apologies for the slow reply! I put this change in soon after the response from @methane, and intended to watch the production logs for a bit before closing this out. Apparently the latter fell by the wayside...

Anyways-- I set the connection max lifetime to 30 minutes, and it appears to have resolved the issue, so thank you!

I suspect that module-oriented fix could be to constantly read from each connection in the pool-- even in absence of an active query-- then, if an io.EOF error is encountered, mark the connection as closed immediately and then tear it down.

Pseudocode:

for {
  err = conn.Read(buffer)
  if err != nil {
    if err != io.EOF {
      // report error
    }
    // lock ops mutex
    // mark boolean as closed
    conn.Close()
    // release ops mutex
    break
  }
  // report buffer back to whatever query activated the connection
  // continue for iterations
}
methane

methane commented on Mar 29, 2017

@methane
Member

@RobertGrantEllis I agree that checking connection is closed or not before sending command is preferable. But (a) it requires dedicated reader goroutine, and (b) SetConnMaxLifetime() is far better than it.

And even MySQL Connector/C (official implementation of client) doesn't try detecting broken connection before sending packet.

dedicated reader goroutine.

To check connection is closed by peer, conn.Read() is required as you said.
But Go doesn't provide portable way to "non blocking Read".
So it requires dedicated goroutine to read.

Having dedicated two goroutines for read and write is common practice in Go.
And context support requires it too.

If performance overhead of dedicated reader goroutine is small enough,
I'll try to use it always including APIs doesn't support context.

But my forecast is performance overhead is not negligible.

SetConnMaxLifetime() is far better

conn.Read() can detects only clean TCP close immediately. (receiving FIN or RST packet).

But sometimes, TCP connections are broken silently.
conn.Read() may take several minutes to detect silently broken connections.

SetConnMaxLifetime() can solve almost clean TCP close (wait_timeout or other idle timeout in underlying network). And it may solve some of silent TCP breakage too.

SetConnMaxLifetime() has some nice side effects too (e.g. failover, load balancing, changing system variable by set global, closing connections when application is idle, ...).
So I recommend SetConnMaxlifetime() with few seconds ~ few minutes lifetime always.

fewebahr

fewebahr commented on Jun 2, 2017

@fewebahr
Author

@methane thanks for the very detailed explanation. I do understand your points about why SetConnMaxLifetime() is better. I think we can probably close this out (let me know if I should), but one quick suggestion: is it worthwhile to document this limitation in the godoc?

mcandre

mcandre commented on May 6, 2019

@mcandre

To clarify, what should the Go user do to mitigate errors arising from idle connections?

Can go-sql-driver/mysql do a better job detecting dead/idle connections and respond appropriately, so that the database/sql abstraction can meet its commitments on automatic retry, reconnect?

methane

methane commented on May 6, 2019

@methane
Member

Stop multi posting. I replied on other issue.

methane

methane commented on May 6, 2019

@methane
Member

I closed this issue because #934 is the best we can do.
Users must use SetMaxConnLifetime. There are no more thing we can do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @mcandre@methane@julienschmidt@fewebahr

        Issue actions

          Go SQL client attempts write against broken connections · Issue #529 · go-sql-driver/mysql