x/build/maintner: reports inconsistent world state (e.g., issue state vs issue events) during short windows of time #28226

Open
@dmitshur

Problem

A program that fetches a maintner corpus and uses its data to make decisions may make mistakes, because the corpus's view of the world is inconsistent during short windows of time. Even though the windows are short, any daemon that loops, updating the corpus and making decisions immediately afterwards, is guaranteed to hit one eventually.

The most visible high-level example of this is #21312.

Cause

This happens because there are effectively two GitHub data sources that are not synchronized:

  1. changes to GitHub state (e.g., issue N now has labels X, Y, Z)
  2. GitHub-generated events (e.g., issue N has had an "unlabeled" event)

To give a concrete example of an inconsistent state that maintner can report, consider when an issue has just been unlabeled. The first mutation received and processed by a corpus.Update call will be that the issue no longer has that label.

The mutation reporting that there has been an unlabeled event on the same issue may arrive a few seconds later. Until it does, it will appear that the issue does not have the label and has never been unlabeled (e.g., !gi.HasLabel("Documentation") && !gi.HasEvent("unlabeled") will be true). That does not match reality, if one considers reality to be a world where the unlabeled event and its effect happen simultaneously.
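
The window described above can be reproduced with a toy model. This is only a sketch: the `issue` type and its methods are hypothetical stand-ins, not maintner's actual types, and they model nothing more than the two independent data sources (label state and event history).

```go
package main

import "fmt"

// issue is a hypothetical stand-in for an issue in the corpus. It is NOT
// maintner's GitHubIssue; it only models state and events as separate data.
type issue struct {
	labels map[string]bool // current label state
	events []string        // event types received so far, e.g. "unlabeled"
}

func (i *issue) hasLabel(name string) bool { return i.labels[name] }

func (i *issue) hasEvent(eventType string) bool {
	for _, e := range i.events {
		if e == eventType {
			return true
		}
	}
	return false
}

// removeLabel models the state mutation, which arrives first (t0).
func (i *issue) removeLabel(name string) { delete(i.labels, name) }

// addEvent models the event mutation, which arrives later (t1).
func (i *issue) addEvent(eventType string) { i.events = append(i.events, eventType) }

// noLabelAndNeverUnlabeled is the predicate from the text: the label is
// absent and no "unlabeled" event has been recorded.
func noLabelAndNeverUnlabeled(i *issue, label string) bool {
	return !i.hasLabel(label) && !i.hasEvent("unlabeled")
}

func main() {
	gi := &issue{labels: map[string]bool{"Documentation": true}}
	fmt.Println("before t0:", noLabelAndNeverUnlabeled(gi, "Documentation"))

	gi.removeLabel("Documentation") // first mutation processed
	fmt.Println("between t0 and t1:", noLabelAndNeverUnlabeled(gi, "Documentation"))

	gi.addEvent("unlabeled") // second mutation processed
	fmt.Println("after t1:", noLabelAndNeverUnlabeled(gi, "Documentation"))
}
```

The predicate is false before t0, true between t0 and t1, and false again after t1. A program that reads the corpus inside the window would conclude the label was never removed by anyone.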

Details

These are two distinct mutations received and processed by the corpus.Update method:

received mutation at time t0:
github_issue: <
  owner: "golang"
  repo: "go"
  number: 28103
  updated: <
    seconds: 1539629204
  >
  remove_label: 223401461
>

... (short window during which the issue doesn't have a label,
     but the accompanying "unlabeled" event hasn't been received yet;
     aka an inconsistent world state)

received mutation at time t1:
github_issue: <
  owner: "golang"
  repo: "go"
  number: 28103
  event: <
    id: 1904921842
    event_type: "unlabeled"
    actor_id: 1924134
    created: <
      seconds: 1539629204
    >
    label: <
      name: "Builders"
    >
  >
  event: <
    id: 1904921913
    event_type: "labeled"
    actor_id: 8566911
    created: <
      seconds: 1539629206
    >
    label: <
      name: "Builders"
    >
  >
  event_status: <
    server_date: <
      seconds: 1539629209
    >
  >
>

There is more relevant information in #21312 (comment).

/cc @bradfitz

Activity

added this to the Unreleased milestone on Oct 16, 2018
added the Builders label (x/build issues: builders, bots, dashboards) on Oct 16, 2018

gopherbot commented on Oct 16, 2018

Change https://golang.org/cl/142362 mentions this issue: cmd/gopherbot: reduce gardening reaction time

changed the title from "x/build/maintner: reports inconsistent world state during short windows of time" to "x/build/maintner: reports inconsistent world state (e.g., issue state vs issue events) during short windows of time" on Oct 20, 2018

orthros commented on Oct 29, 2018

While working on other issues, I saw that GitHub introduced a "unified" timeline for events on an issue, the Timeline API. I understand that it is still in beta (since 2016) and that adopting it would be a major change, but it might help fix this issue by providing a single source of truth for a GitHubIssue.

dmitshur commented on Oct 29, 2018

@orthros Thanks for pointing that out. The Timeline API can indeed be helpful for eliminating races between issue comments, events, and PR reviews (for #21086).

Something to be mindful of is that it may not, on its own, be enough to solve the most important race: the one between the issue state (whether it's open or closed, which labels are applied) and events. That race remains unless we use the events to deduce the state, rather than querying the state separately. (But that can be done independently of using the Timeline API.)
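
As a rough illustration of deducing state from events, the sketch below replays "labeled" and "unlabeled" events in chronological order to compute the current label set. The `event` type and `labelsFromEvents` function are hypothetical, and the sketch assumes the event history for the issue is complete.

```go
package main

import (
	"fmt"
	"sort"
)

// event is a hypothetical, minimal view of a GitHub issue event.
type event struct {
	created int64  // Unix seconds
	typ     string // "labeled" or "unlabeled"; other types are ignored
	label   string
}

// labelsFromEvents replays events in chronological order and returns the
// resulting label names, sorted. Assumption: events is the full history.
func labelsFromEvents(events []event) []string {
	sorted := append([]event(nil), events...) // copy so the input is not reordered
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].created < sorted[j].created
	})

	set := make(map[string]bool)
	for _, e := range sorted {
		switch e.typ {
		case "labeled":
			set[e.label] = true
		case "unlabeled":
			delete(set, e.label)
		}
	}

	labels := make([]string, 0, len(set))
	for l := range set {
		labels = append(labels, l)
	}
	sort.Strings(labels)
	return labels
}

func main() {
	// The two events from the second mutation in the issue description.
	events := []event{
		{created: 1539629204, typ: "unlabeled", label: "Builders"},
		{created: 1539629206, typ: "labeled", label: "Builders"},
	}
	fmt.Println(labelsFromEvents(events)) // [Builders]
}
```

With this approach, state and history cannot disagree, because the state is a function of the history. The trade-off is that the reported state lags until the events arrive.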

Also, for information, the Timeline API is indeed in preview, and in my experience using it, it had some data gap edge cases where I had to fall back to querying reviews separately (e.g., see here). It may have been resolved by now, but it's worth being aware of. It seems there are 2 Timeline APIs in GitHub API v4 (PullRequestTimelineConnection and PullRequestTimelineItemsConnection, the latter being a part of a preview API), in addition to the Timeline API in GitHub API v3 (https://developer.github.com/v3/issues/timeline/).

andybons commented on Dec 22, 2018

dmitshur commented on Dec 22, 2018
