[Proposal] Notify runners to fetch Actions tasks instead of polling #24543

Open
wolfogre opened this issue May 5, 2023 · 5 comments
Labels
topic/gitea-actions related to the actions of Gitea type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@wolfogre
Member

wolfogre commented May 5, 2023

Background

Act runner connects to the Gitea instance via short HTTP connections, so runners have to ask for a new task again and again.

Gitea will start a transaction to find and assign a task to the runner when it requests a new one. However, we know that there may not be a task available most of the time, so Gitea has to roll back the transaction and respond with "no task yet, try again later."

That's why you see so many POST /api/actions/runner.v1.RunnerService/FetchTask entries in the router logs and [SQL] ROLLBACK [] entries in the SQL logs.

It doesn't hurt but it's annoying. And it wastes network and database resources to some extent.

Solution

Provide a mechanism for Gitea to notify runners

We need a way to notify runners from the Gitea side. Here are some possible ways:

  • gRPC Streaming. Not sure if it works with gRPC over HTTP.
  • WebSocket. Actually, it was used in earlier versions of Gitea Actions.
  • Blocking Queries. This is how Consul watches for changes.
  • Server-sent events.

I'm not sure which approach would be best. However, one thing is certain: the approach must work through reverse proxies, because there is usually an HTTP proxy such as Nginx in front of Gitea. IIRC, Nginx with its default config closes connections that have been idle for a long time.

Reuse old logic to fetch task

To be clear, Gitea will notify runners only with a simple message "there may be a new task", instead of the whole task data.

This should be easier to implement: whenever the relevant data table changes, Gitea could notify all runners. The runners will then request new tasks, just as they do now. If they don't get a new task, maybe because the label doesn't match, they can just continue to wait to be notified again.

On the other hand, runners may not immediately request new tasks when notified because they may already be at capacity, and they will request later when they are idle.

Therefore, forcefully sending tasks to runners without their request may not be feasible or may be very complicated.
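The flow described above can be sketched as follows. This is only an illustration under stated assumptions: `Notifier`, `Runner`, `OnHint`, and `OnJobDone` are hypothetical names, and the hint is modelled as an empty value on a buffered channel so that repeated notifications coalesce. The runner still decides for itself when to call FetchTask.

```go
package main

import (
	"fmt"
	"sync"
)

// Notifier fans out a content-free "there may be a new task" hint.
// Hypothetical type, not part of Gitea.
type Notifier struct {
	mu   sync.Mutex
	subs map[string]chan struct{}
}

func NewNotifier() *Notifier {
	return &Notifier{subs: make(map[string]chan struct{})}
}

// Subscribe registers a runner. The buffer of one lets hints coalesce.
func (n *Notifier) Subscribe(runnerID string) <-chan struct{} {
	n.mu.Lock()
	defer n.mu.Unlock()
	ch := make(chan struct{}, 1)
	n.subs[runnerID] = ch
	return ch
}

// NotifyAll never blocks: if a hint is already pending for a runner,
// the new one is dropped because it carries no extra information.
func (n *Notifier) NotifyAll() {
	n.mu.Lock()
	defer n.mu.Unlock()
	for _, ch := range n.subs {
		select {
		case ch <- struct{}{}:
		default:
		}
	}
}

// Runner tracks capacity and whether a hint arrived while it was busy.
type Runner struct {
	Capacity int
	Running  int
	pending  bool
}

// OnHint records a hint and reports whether to call FetchTask right now.
func (r *Runner) OnHint() bool {
	r.pending = true
	return r.tryFetch()
}

// OnJobDone frees a slot and reports whether a deferred fetch is now due.
func (r *Runner) OnJobDone() bool {
	if r.Running > 0 {
		r.Running--
	}
	return r.tryFetch()
}

func (r *Runner) tryFetch() bool {
	if !r.pending || r.Running >= r.Capacity {
		return false
	}
	r.pending = false
	return true
}

func main() {
	n := NewNotifier()
	hints := n.Subscribe("runner-1")
	n.NotifyAll()
	n.NotifyAll() // coalesces with the pending hint
	fmt.Println("pending hints:", len(hints))

	busy := &Runner{Capacity: 1, Running: 1}
	fmt.Println("fetch while busy:", busy.OnHint())
	fmt.Println("fetch after job done:", busy.OnJobDone())
}
```

Because the hint carries no task data, dropping a duplicate is harmless: the runner learns everything it needs from the regular fetch request.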

@wolfogre wolfogre added type/proposal The new feature has not been accepted yet but needs to be discussed first. type/feature Completely new functionality. Can only be merged if feature freeze is not active. topic/gitea-actions related to the actions of Gitea labels May 5, 2023
@delvh
Member

delvh commented May 5, 2023

Hmm… Yes, as long as that is entirely optional I agree that something like that can be useful.
I'd object to making it required, for example for the following use case:
You only have a runner inside your intranet, and pull tasks from a public Gitea instance.
The only thing I'm thinking about is whether it improves the situation so much that it is worth the effort.
What is the default timeout in act_runner?
5 seconds?

@wolfogre
Member Author

wolfogre commented May 5, 2023

... I'd object to making it required, ...

Sure, it's optional. Runners can skip the wait and go straight to regular polling when:

  • Gitea or the reverse proxy doesn't support it.
  • It's disabled on Gitea side.
  • It's disabled on runner side.
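As a rough sketch of that decision, assuming hypothetical names (`Conditions`, `ChooseMode`) and assuming each condition is already known to the runner as a boolean:

```go
package main

import "fmt"

// Mode is how a runner learns about new tasks.
type Mode string

const (
	ModeNotify Mode = "notify" // wait for a hint, then fetch
	ModePoll   Mode = "poll"   // regular interval polling, as today
)

// Conditions mirrors the three cases listed above. Hypothetical type.
type Conditions struct {
	Supported     bool // Gitea and any reverse proxy can carry notifications
	ServerEnabled bool // not disabled on the Gitea side
	RunnerEnabled bool // not disabled on the runner side
}

// ChooseMode falls back to polling unless every condition holds.
func ChooseMode(c Conditions) Mode {
	if c.Supported && c.ServerEnabled && c.RunnerEnabled {
		return ModeNotify
	}
	return ModePoll
}

func main() {
	fmt.Println(ChooseMode(Conditions{Supported: true, ServerEnabled: true, RunnerEnabled: true}))
	fmt.Println(ChooseMode(Conditions{Supported: true, ServerEnabled: true}))
}
```

How a runner would detect missing support is left open here; probing the server once at startup is one possibility.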

... whether it improves the situation so much that it is worth the effort.

Exactly. That's why this is a proposal issue, not a PR.

... What is the default timeout in act_runner? 5 seconds?

5 seconds for timeout, 2 seconds for interval.

@harryzcy
Contributor

harryzcy commented May 6, 2023

I feel Blocking Queries, aka long polling, would be a natural way to go. This is also how GitHub self-hosted runners do it.

The self-hosted runner uses an HTTPS long poll that opens a connection to GitHub for 50 seconds

Nginx has a timeout of 60s by default, so 50s should work with no configuration required. Maybe Gitea can attempt long polling by default, and then fall back to regular polling if it receives several 504 Gateway Timeout errors.
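A minimal sketch of that fallback rule in Go, under stated assumptions: `Poller`, `Observe`, and the threshold of three consecutive timeouts are hypothetical, and the signal used is HTTP 504 Gateway Timeout, the standard status a proxy returns when the upstream takes too long.

```go
package main

import (
	"fmt"
	"net/http"
)

// maxGatewayTimeouts is an assumed threshold, not a value from act_runner.
const maxGatewayTimeouts = 3

// Poller remembers how many long-poll requests in a row were cut off by
// a proxy. Hypothetical type.
type Poller struct {
	consecutiveTimeouts int
	longPoll            bool
}

func NewPoller() *Poller {
	return &Poller{longPoll: true}
}

// Observe records the status of the latest long-poll response and reports
// whether long polling should still be used for the next request.
func (p *Poller) Observe(status int) bool {
	if !p.longPoll {
		return false // already fell back to regular polling
	}
	if status == http.StatusGatewayTimeout {
		p.consecutiveTimeouts++
		if p.consecutiveTimeouts >= maxGatewayTimeouts {
			p.longPoll = false
		}
	} else {
		p.consecutiveTimeouts = 0 // any other answer proves the path works
	}
	return p.longPoll
}

func main() {
	p := NewPoller()
	for _, status := range []int{504, 200, 504, 504, 504} {
		fmt.Println(status, "-> keep long polling:", p.Observe(status))
	}
}
```

Counting only consecutive timeouts keeps one slow response from disabling long polling for good. Whether the fallback should ever be retried later is a design choice this sketch leaves open.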

  • Server-sent events.

I like this option too. It would be optimal when the runner is "publicly" accessible (by Gitea), and we could reuse some WebHook logic.

@wolfogre wolfogre removed the type/feature Completely new functionality. Can only be merged if feature freeze is not active. label May 9, 2023
@tonitrnel

This comment was marked as duplicate.

@qibao07

qibao07 commented May 11, 2024

Very good proposal.

This is the situation I encountered. The gitea-act node may be placing a high load on gitea's servers.
