x/build/cmd/gomote: instances are timing out after 2 hours #54696

Closed
@cagedmantis

Description

There have been reports that instances have been timing out some time after two hours of use.

  • linux-amd64-alpine instances time out after 2 hours.
  • Other instances (Windows-amd64) have become unresponsive some time after 2 hours of use.

During one occurrence, we collected the following logs:

2022-08-23 13:47:59.025 EDT 2022/08/23 17:47:59 created buildlet userx-windows-amd64-2016-0 for userx (http://10.128.0.20 GCE VM: buildlet-windows-amd64-2016-rna896368)
2022-08-23 15:49:13.486 EDT 2022/08/23 19:49:13 deleting VM "buildlet-windows-amd64-2016-rna896368" in zone "us-central1-c"; delete-at expiration ...
2022-08-23 15:49:13.906 EDT 2022/08/23 19:49:13 Sent request to delete instance "buildlet-windows-amd64-2016-rna896368" in zone "us-central1-c". Operation ID, Name: 219625075713282518, operation-1661284153488-5e6eddbd6f518-f4db622d-0bbce859
2022-08-23 15:50:21.040 EDT 2022/08/23 19:50:21 Buildlet http://10.128.0.20 GCE VM: buildlet-windows-amd64-2016-rna896368 failed three heartbeats; final error: timeout waiting for headers
2022-08-23 16:51:00.891 EDT 2022/08/23 20:51:00 10.128.0.20:80: peer dead with Buildlet http://10.128.0.20 GCE VM: buildlet-windows-amd64-2016-rna896368 failed heartbeat after 20.000336477s; marking dead; err=timeout waiting for headers, waiting for headers for /exec
2022-08-23 16:53:51.295 EDT 2022/08/23 20:53:51 10.128.0.20:80: peer dead with Buildlet http://10.128.0.20 GCE VM: buildlet-windows-amd64-2016-rna896368 failed heartbeat after 20.000336477s; marking dead; err=timeout waiting for headers, waiting for headers for /exec

@golang/release @prattmic @thanm

Activity

added the Builders (x/build issues: builders, bots, dashboards) and NeedsFix (the path to resolution is known, but the work has not been done) labels on Aug 26, 2022
added this to the Backlog milestone on Aug 26, 2022

gopherbot commented on Aug 29, 2022

@gopherbot
Contributor

Change https://go.dev/cl/426015 mentions this issue: cmd/coordinator: check the session pool for buildlets

modified the milestones: Backlog, Unreleased on Sep 26, 2022

dmitshur commented on Sep 27, 2022

@dmitshur
Member

We believe this is fixed.

moved this to Done in Go Release on Sep 27, 2022
locked and limited conversation to collaborators on Sep 27, 2023