Prototype for bot that adds label to PRs with merge conficts #187

pablogsal · 2017-10-08T19:27:34Z

This PR is a prototype solution for #96.

On this PR

This PR is NOT the final solution or a final proposed solution, but this initial version of the code shows how is possible to deal with GitHub's API request limit using the GraphQL API to query all the open pull requests at once and then act accordingly.

Some stats on the current state of Python/cpython:

Only 5 requests are needed to retrieve ALL open pull request information (in this example only the number, URL, and mergeable state is requested). 450 open pull request was retrieved. 🎉
From this 450 open pull requests only 94 present merge conflicts. 🥇
As the GraphQL doesn't provide a way to add labels in one request, the REST API is needed to add the label one by one, making 94 HTTP request needed to update the labels. 😞

On the WebHook method

Another possibility that one may think is using a Webhook that triggers when a pull request is updated so we can retrieve if the pull request has merge conflicts from there and update accordingly. Although this is much cleaner it presents a major problem: the mergeable state is usually not set in the webhook request because the GitHub job that inspects it is not triggered at the moment of the webhook being delivered. This means that we would need a REST request after some seconds to check the state manually.

This is more elegant in the sense that we don't need to run cron jobs daily but a lot of REST API calls will be needed.

I can implement a final version of whatever method you prefer, but I think an extended conversation measuring the pros and cons is needed to achieve a good (and simple) solution. 😄

Mariatta · 2017-10-09T02:05:28Z

Thanks for the PR, @pablogsal . I think this is a good start, I like the approach of using GraphQL to retrieve the PRs and then making REST API calls to apply the label.

For using webhook instead of a daily cron:
I think it's unlikely for a PR to suddenly become unmergeable after it's been updated. More likely is it becomes unmergeable as a result that a different PR gets merged. So what we can do is to call the GraphQL query to find out which other PRs becomes unmergeable after each merged PR event.
This will happen more frequently than the daily cronjob.

pablogsal · 2017-10-09T13:08:14Z

@Mariatta Some preferences for the implementation? Async or flask-sync? Do you want it to defer the work to celery-based workers (as opposed to the webhook doing the work in the same request)? Should I host the webhook (I am happy to)?

Sorry for the bothering but I prefer not to assume anything so you can have the implementation you want. 😃

Mariatta · 2017-10-10T16:39:02Z

I think I would like this to follow bedevere's design, using gidgethub and aiohttp, and hosted on heroku.

@brettcannon What do you think? Can this go into bedevere, or a separate app?

From @pablogsal prototype, using GraphQL we can retrieve 450 open PRs. During the first run, it will process 94 open PRs using REST APIs. After that, the number of REST API called will be much less.

terryjreedy · 2017-10-10T17:08:42Z

As a user, two bot names other than the CI bots seems like enough, whatever the hidden implementation. Bedevere helps get PRs merged. Islington helps create backport PRs.

brettcannon · 2017-10-10T17:13:47Z

Do we know how long the querying of 450 open PRs takes? My worry is that on every merged PR we will time out on the webhook because the processing time takes longer than a single HTTP request. And if that does end up being the situation then we're going to have to set up a queuing service to queue up the requests that will take longer than an HTTP request and then an appropriate system which will fetch work from the queue, all of which ups the complexity by a lot.

@pablogsal Beyond Mariatta's suggestion on how to structure things, you can also simplify your GraphQL query by using variables if there is any part of the query that isn't static.

Mariatta · 2017-10-10T18:03:39Z

miss-islington uses celery tasks on Heroku, it's been working well. Long-running tasks like cloning CPython repo, and fetching from upstream, are deferred to background tasks.
We can set up the same if timing out is a concern :)

pablogsal · 2017-10-15T01:05:58Z

I have added a first working version of the bot and some tests. The strategy that I have followed is the following:

Instead of using celery tasks I am acknowledging the request immediately and launching the task that labels the PR in the background. If you feel that celery tasks will suit better this use case, I can prepare the changes but I have benchmarked 🐎 this solution again a testing repo on GitHub and the results are very good, as all the work is I/O based.
As discussed, GraphQL is used first to fetch the open pull requests and process them. I have considered the use of variables, but taking into account that a string with the variables will be as well included in the query string and I am constructing dynamically the query (not with values but the query itself), I feel the current approach is not a bad idea, though I will change it if you feel a different approach is definitely better in this case.
aiohttp and gidgethub are used to the rest of the queries as indicated by @Mariatta and @brettcannon.

This is probably not the final version and there is probably some further discussion needed, but I think having this first working version will help to indicate more concrete changes/behaviour.

Thank you very much for the time dedicated reviewing this PR. 🙇‍♂️

pablogsal · 2017-11-29T00:36:17Z

@Mariatta @brettcannon There is something extra I should do on my part before you can review this? Maybe I should reopen a PR in some bot's repository?

Mariatta · 2017-11-29T17:48:03Z

Thanks @pablogsal. Seems like we haven't decided where to include this code. I was thinking it can be part of bedevere, but there were concerns about hitting the API limit.

brettcannon · 2017-11-29T18:32:01Z

I'm also just swamped with other work and life stuff.

Mariatta · 2019-08-01T00:08:27Z

Sorry for such delay in reviewing this PR. I think it looks great. I would suggest adding this to bedevere. I haven't seen rate-limit error from bedevere yet, so maybe no concern there. Also, bedevere has done most of the labeling work.

pradyunsg · 2019-08-01T07:13:55Z

FWIW, pypa has similar functionality on @pypa-bot (we're using it for pip). I think it'd be great if we could switch to using the same bot as CPython for this stuff.

https://github.com/pypa/browntruck/blob/twisted/browntruck/hooks/merge_conflict.py

brettcannon · 2019-08-02T20:22:58Z

@pradyunsg especially since the pypa bot is not seemingly actively maintained (I've had a PR opened for two years to rip out custom code to rely more on gidgethub that hasn't gone anywhere).

pradyunsg · 2019-08-03T08:10:24Z

I've been meaning to look but I have no access to the deployed bot myself, and without access to the deployed bot's logs, I don't feel confident merging those changes. 🙃

Add first version of the solution

53dadd2

the-knights-who-say-ni added the CLA signed label Oct 8, 2017

Mariatta mentioned this pull request Oct 10, 2017

Add a bot to notify people when their PR has a merge conflict #96

Closed

Mariatta added the hacktoberfest label Oct 11, 2017

Add first version of the bot + tests

127522d

Mariatta added the status-stale The PR hasn't been updated in a while, and pending removal label Jul 21, 2018

pablogsal closed this Nov 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Prototype for bot that adds label to PRs with merge conficts #187

Prototype for bot that adds label to PRs with merge conficts #187

Uh oh!

pablogsal commented Oct 8, 2017 •

edited

Loading

Uh oh!

Mariatta commented Oct 9, 2017

Uh oh!

pablogsal commented Oct 9, 2017 •

edited

Loading

Uh oh!

Mariatta commented Oct 10, 2017

Uh oh!

terryjreedy commented Oct 10, 2017

Uh oh!

brettcannon commented Oct 10, 2017

Uh oh!

Mariatta commented Oct 10, 2017

Uh oh!

pablogsal commented Oct 15, 2017

Uh oh!

pablogsal commented Nov 29, 2017

Uh oh!

Mariatta commented Nov 29, 2017 •

edited

Loading

Uh oh!

brettcannon commented Nov 29, 2017

Uh oh!

Mariatta commented Aug 1, 2019

Uh oh!

pradyunsg commented Aug 1, 2019

Uh oh!

brettcannon commented Aug 2, 2019

Uh oh!

pradyunsg commented Aug 3, 2019 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Prototype for bot that adds label to PRs with merge conficts #187

Prototype for bot that adds label to PRs with merge conficts #187

Uh oh!

Conversation

pablogsal commented Oct 8, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

On this PR

On the WebHook method

Uh oh!

Mariatta commented Oct 9, 2017

Uh oh!

pablogsal commented Oct 9, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Mariatta commented Oct 10, 2017

Uh oh!

terryjreedy commented Oct 10, 2017

Uh oh!

brettcannon commented Oct 10, 2017

Uh oh!

Mariatta commented Oct 10, 2017

Uh oh!

pablogsal commented Oct 15, 2017

Uh oh!

pablogsal commented Nov 29, 2017

Uh oh!

Mariatta commented Nov 29, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

brettcannon commented Nov 29, 2017

Uh oh!

Mariatta commented Aug 1, 2019

Uh oh!

pradyunsg commented Aug 1, 2019

Uh oh!

brettcannon commented Aug 2, 2019

Uh oh!

pradyunsg commented Aug 3, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

pablogsal commented Oct 8, 2017 •

edited

Loading

pablogsal commented Oct 9, 2017 •

edited

Loading

Mariatta commented Nov 29, 2017 •

edited

Loading

pradyunsg commented Aug 3, 2019 •

edited

Loading