Stop running prebuilds for inactive projects (10+ weeks) #9219

jankeromnes · 2022-04-11T09:52:14Z

Description

Stop running prebuilds for projects that did not start a workspace in the last 10+ weeks.

Related Issue(s)

Fixes #8911
Drive-by fixes prebuild rate limits again

How to test

1. Add a Project
2. Verify that Prebuilds are triggered successfully and workspaces can be started
3. Verify that lastWorkspaceStart and lastWebhookReceived are properly set in d_b_project_usage
4. Manually set lastWorkspaceStart to a date more than 10 weeks ago in d_b_project_usage
5. Trigger new Prebuilds -- verify that they now get auto-cancelled on start due to Project inactivity
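
The inactivity check exercised by the steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `isProjectInactive`, `INACTIVITY_THRESHOLD_MS`, and the decision to treat a missing timestamp as "not inactive" are assumptions made for the example. Only the `lastWorkspaceStart` field name and the 10-week threshold come from the description.

```typescript
// Hypothetical sketch of the inactivity check; names are illustrative only.
const INACTIVITY_THRESHOLD_MS = 10 * 7 * 24 * 60 * 60 * 1000; // 10 weeks

interface ProjectUsage {
    // ISO 8601 string as stored in d_b_project_usage; "" means "never recorded".
    lastWorkspaceStart: string;
}

function isProjectInactive(usage: ProjectUsage | undefined, now: Date = new Date()): boolean {
    // Assumption: without a recorded workspace start, the project is not treated as inactive.
    if (!usage || usage.lastWorkspaceStart === "") {
        return false;
    }
    const lastStart = new Date(usage.lastWorkspaceStart).getTime();
    if (Number.isNaN(lastStart)) {
        return false; // unparseable timestamp: err on the side of running prebuilds
    }
    return now.getTime() - lastStart > INACTIVITY_THRESHOLD_MS;
}
```

Passing `now` explicitly keeps the check deterministic, which mirrors the manual test of back-dating `lastWorkspaceStart`.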

Release Notes

Stop running prebuilds for projects that did not start a workspace in the last 10+ weeks

Documentation

jankeromnes · 2022-04-11T10:17:56Z

On a side note, we now have both d_b_project_info (contains cached Branches information) and d_b_project_usage (contains two timestamps).

I considered adding the timestamps to d_b_project_info instead of creating a new table, but the concerns seemed somewhat different to me. Happy to discuss/change this though!

gtsiolis · 2022-04-11T17:04:24Z

FYI: I've opened a follow-up issue (#9232) with a feature request to surface this automatic pause more clearly on the dashboard, since it could confuse users coming back to inactive projects. Feedback is welcome!

Cc @jankeromnes @easyCZ @jldec

jankeromnes · 2022-04-11T18:29:03Z

/werft run with-clean-slate-deployment

👍 started the job as gitpod-build-jx-inactive-projects.7

Fixes #8911 Fixes prebuild rate limit

jankeromnes · 2022-04-12T09:31:58Z

This is what it looks like now when a project has become inactive but still triggers new prebuilds (screenshot not preserved):

easyCZ · 2022-04-12T09:37:41Z

components/gitpod-db/src/typeorm/entity/db-project-usage.ts

+    @PrimaryColumn(TypeORM.UUID_COLUMN_TYPE)
+    projectId: string;
+
+    @Column("varchar")


What's historically the reason for using varchar to store timestamps? Is it possible to use a DATETIME?

Good question -- I don't have a strong opinion here, and was just following established practice 😊

The reason I ask is that when prototyping an ORM for Golang, Varchar timestamps make it much more complicated - this is further coupled with the somewhat non-standard usage of ISO8601 in javascript to represent a timestamp, which isn't quite RFC3339 compatible.

See https://github.com/gitpod-io/gitpod/pull/8967/files#diff-64ef712fa293d2472e903c3b14ffbaceb31a8b65b79e0f7a31316dbdcbe0c702R18

Definitely not trying to block this PR on this, mostly looking to understand historical reasoning for the varchar choice. Feel free to ignore this comment to land the change.

> The reason I ask is that when prototyping an ORM for Golang, Varchar timestamps make it much more complicated - this is further coupled with the somewhat non-standard usage of ISO8601 in javascript to represent a timestamp, which isn't quite RFC3339 compatible.

I think your concerns are valid, and there is definitely room for improvement in our DB types. 👍
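
For context, here is a small sketch of what a varchar timestamp column might hold and how it could be read back. It assumes values are written with `Date.prototype.toISOString()`, which the thread does not state explicitly; the helper names `toStoredTimestamp`, `fromStoredTimestamp`, and `isEarlier` are made up for this example.

```typescript
// Sketch only: assumes timestamps are written with Date.prototype.toISOString().
function toStoredTimestamp(date: Date): string {
    // Always UTC with millisecond precision, e.g. "2022-04-12T10:39:24.000Z".
    return date.toISOString();
}

function fromStoredTimestamp(value: string): Date | undefined {
    // "" is the column default in the migration, so treat it as "no value".
    if (value === "") {
        return undefined;
    }
    const parsed = new Date(value);
    return Number.isNaN(parsed.getTime()) ? undefined : parsed;
}

// Because this format is fixed-width and always UTC, plain string comparison
// orders values chronologically (for years 0000 to 9999).
function isEarlier(a: string, b: string): boolean {
    return a < b;
}
```

A native DATETIME column would move this parsing and ordering logic into the database, which is the trade-off raised in the question above.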

> Definitely not trying to block this PR on this, mostly looking to understand historical reasoning for the varchar choice. Feel free to ignore this comment to land the change.

Thanks! I would love to, but first I need a review. 😇

Still reviewing :)

That's awesome, thanks a bunch 🙏

I'll try to practice our new review process by assigning this Pull Request to you then (but please feel free to un-assign yourself again if that didn't make sense!)

easyCZ · 2022-04-12T10:39:24Z

One alternative approach we could consider is to load the d_b_project_usage async from existing tables (Prebuild, WorkspaceInstance) to treat it like more of a cache. Doing this would achieve the following:

- The handling code for a project doesn't need to be aware of the usage tracking - this eliminates opportunity for a bug where we forget to add the call to track
- It derives the usage from a single place, rather than multiple
- It makes the update process async which does not impact our latency on project related calls (the setting of the value).

WDYT?
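
As a rough illustration of this alternative, a background job could derive the latest workspace start per project from existing records instead of relying on inline updates. The record shape `WorkspaceStartRecord` and the function `deriveLastWorkspaceStart` are hypothetical; the real Prebuild and WorkspaceInstance tables are not shown in this thread, and the sketch works on an in-memory array rather than a database query.

```typescript
// Hypothetical record shape; the real tables have more (and differently named) fields.
interface WorkspaceStartRecord {
    projectId: string;
    startedAt: string; // ISO 8601 timestamp
}

// Derive the latest workspace start per project in a single pass, as a
// periodic job might do before writing the result into a usage cache.
function deriveLastWorkspaceStart(records: WorkspaceStartRecord[]): Map<string, string> {
    const latest = new Map<string, string>();
    for (const record of records) {
        const current = latest.get(record.projectId);
        // Compare parsed times rather than raw strings to tolerate format differences.
        if (current === undefined || Date.parse(record.startedAt) > Date.parse(current)) {
            latest.set(record.projectId, record.startedAt);
        }
    }
    return latest;
}
```

With this shape, the handling code never calls a tracking method, so a forgotten update cannot silently skew the usage data.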

easyCZ

PR looks good, and worked for me based on the test specified. Leaving a couple of questions for my own context. Approving with a hold.

/hold

Posting our Slack conversation about splitting PRs here as well, for visibility.
With regard to tiny PRs, I'd split this one into the following:

1. Add a method to call (updateProjectUsage), but don't write to the DB yet; only increment a metric
2. Create the table, leaving it unused
3. Gather data showing that this works
4. Wire them together, so that updateProjectUsage actually writes to the DB

easyCZ · 2022-04-12T10:40:59Z

components/gitpod-db/src/typeorm/entity/db-project-usage.ts

+    projectId: string;
+
+    @Column("varchar")
+    lastWebhookReceived: string;


Is this field strictly needed for our first version? We do set it, but we're not using it in read paths.

Good catch, it's not strictly needed here indeed. However, since I modified very related code, I thought this small drive-by change could be a good first step toward fixing #7010

I'll leave it up to you.

For surfacing which webhook we've received, I'd expose it as a first-class table with webhooks (and webhook payloads), such that we can show what we've received in the webhook as well (would definitely be useful for our debugging).

Good idea. I'll leave this as is, and we can decide later whether we want a full-fledged table of webhook events, or whether showing a simple timestamp of the last event is good enough. 🛹

easyCZ · 2022-04-12T10:44:08Z

components/gitpod-db/src/typeorm/migration/1649667202321-ProjectUsage.ts

+export class ProjectUsage1649667202321 implements MigrationInterface {
+    public async up(queryRunner: QueryRunner): Promise<void> {
+        await queryRunner.query(
+            "CREATE TABLE IF NOT EXISTS `d_b_project_usage` (`projectId` char(36) NOT NULL, `lastWebhookReceived` varchar(255) NOT NULL DEFAULT '', `lastWorkspaceStart` varchar(255) NOT NULL DEFAULT '', `deleted` tinyint(4) NOT NULL DEFAULT '0', `_lastModified` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6), PRIMARY KEY (`projectId`), KEY `ind_dbsync` (`_lastModified`)) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;",


Here we're setting a limit of 36 chars on projectId. Should this also be reflected in the precision: 36 field on the Column of the ORM model?

I'm not too familiar with the intricacies of our ORM model, so I wouldn't know the pros and cons of doing this.

Briefly looking through our code, it seems that we typically use precision for timestamps, but not for IDs. If you think this should be changed, could you please open an issue for this? 🙏

easyCZ · 2022-04-12T10:46:43Z

components/server/ee/src/prebuilds/github-enterprise-app.ts

@@ -116,6 +116,15 @@ export class GitHubEnterpriseApp {
    ): Promise<StartPrebuildResult | undefined> {
        const span = TraceContext.startSpan("GitHubEnterpriseApp.handlePushHook", ctx);
        try {
+            const cloneURL = payload.repository.clone_url;
+            const projectAndOwner = await this.findProjectAndOwner(cloneURL, user);
+            if (projectAndOwner.project) {


At the point of this call, how do we know that the project specified in the payload in fact belongs to the correct user? Is it possible to poison our data across projects here?

I'm mostly lacking context, so I'm trying to better understand how the payload from the webhook is authorized.

> At the point of this call, how do we know that the project specified in the payload in fact belongs to the correct user?

That's a good question. I don't know about webhook events from the GitHub App, but for GitLab, we look at the provided (user + project) token in order to make sure that the project specified in the payload actually belongs to the user who installed this webhook:

gitpod/components/server/ee/src/prebuilds/gitlab-app.ts

Lines 61 to 97 in e8fa6d2

    protected async findUser(ctx: TraceContext, context: GitLabPushHook, req: express.Request): Promise<User> {
        const span = TraceContext.startSpan("GitLapApp.findUser", ctx);
        try {
            const secretToken = req.header("X-Gitlab-Token");
            span.setTag("secret-token", secretToken);
            if (!secretToken) {
                throw new Error("No secretToken provided.");
            }
            const [userid, tokenValue] = secretToken.split("|");
            const user = await this.userDB.findUserById(userid);
            if (!user) {
                throw new Error("No user found for " + secretToken + " found.");
            } else if (!!user.blocked) {
                throw new Error(`Blocked user ${user.id} tried to start prebuild.`);
            }
            const identity = user.identities.find((i) => i.authProviderId === TokenService.GITPOD_AUTH_PROVIDER_ID);
            if (!identity) {
                throw new Error(`User ${user.id} has no identity for '${TokenService.GITPOD_AUTH_PROVIDER_ID}'.`);
            }
            const tokens = await this.userDB.findTokensForIdentity(identity);
            const token = tokens.find((t) => t.token.value === tokenValue);
            if (!token) {
                throw new Error(`User ${user.id} has no token with given value.`);
            }
            if (
                token.token.scopes.indexOf(GitlabService.PREBUILD_TOKEN_SCOPE) === -1 ||
                token.token.scopes.indexOf(context.repository.git_http_url) === -1
            ) {
                throw new Error(
                    `The provided token is not valid for the repository ${context.repository.git_http_url}.`,
                );
            }
            return user;
        } finally {
            span.finish();
        }
    }
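
The two checks in this excerpt (splitting the header into user ID and token value, then requiring both the prebuild scope and the repository URL among the token's scopes) can be isolated into pure functions. The sketch below is illustrative only: `parseSecretToken`, `tokenAllowsRepository`, `StoredToken`, and the scope name `"prebuilds"` are assumptions, and unlike the excerpt it splits at the first `|` only.

```typescript
// Illustrative only: a pure-function version of the two checks in findUser.
interface StoredToken {
    value: string;
    scopes: string[];
}

// Hypothetical scope name; the excerpt uses GitlabService.PREBUILD_TOKEN_SCOPE.
const PREBUILD_SCOPE = "prebuilds";

function parseSecretToken(header: string | undefined): { userId: string; tokenValue: string } | undefined {
    if (!header) {
        return undefined;
    }
    const separator = header.indexOf("|");
    if (separator <= 0 || separator === header.length - 1) {
        return undefined; // missing user id or token value
    }
    return { userId: header.slice(0, separator), tokenValue: header.slice(separator + 1) };
}

function tokenAllowsRepository(tokens: StoredToken[], tokenValue: string, repoUrl: string): boolean {
    const token = tokens.find((t) => t.value === tokenValue);
    // The token must carry both the prebuild scope and the repository URL as scopes.
    return !!token && token.scopes.includes(PREBUILD_SCOPE) && token.scopes.includes(repoUrl);
}
```

Because the repository URL is itself one of the token's scopes, a token installed for one repository cannot authorize a payload that names a different one.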

We do something similar for GitHub Enterprise webhooks, where the secret token is not sent as part of the webhook payload, but is used to sign the webhook payload (which we can verify against the user's tokens):

gitpod/components/server/ee/src/prebuilds/github-enterprise-app.ts

Lines 89 to 106 in e8fa6d2

    // Verify the webhook signature
    const signature = req.header("X-Hub-Signature-256");
    const body = (req as any).rawBody;
    const tokenEntries = (await this.userDB.findTokensForIdentity(gitpodIdentity)).filter((tokenEntry) => {
        return tokenEntry.token.scopes.includes(GitHubService.PREBUILD_TOKEN_SCOPE);
    });
    const signingToken = tokenEntries.find((tokenEntry) => {
        const sig =
            "sha256=" +
            createHmac("sha256", user.id + "|" + tokenEntry.token.value)
                .update(body)
                .digest("hex");
        return timingSafeEqual(Buffer.from(sig), Buffer.from(signature ?? ""));
    });
    if (!signingToken) {
        throw new Error(`User ${user.id} has no token matching the payload signature.`);
    }
    return user;
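
A standalone sketch of the same HMAC-SHA256 check, using Node's `crypto` module, is shown below. The function name `verifyWebhookSignature` and its parameters are made up for this example. Node's `timingSafeEqual` throws when the two buffers differ in length, so the sketch checks lengths first; whether the surrounding code handles that case is not visible in the excerpt.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sketch of HMAC-SHA256 webhook verification; names are illustrative.
function verifyWebhookSignature(rawBody: string, secret: string, signatureHeader: string | undefined): boolean {
    if (!signatureHeader) {
        return false;
    }
    // Recompute the signature over the raw request body with the shared secret.
    const expected = Buffer.from("sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex"));
    const received = Buffer.from(signatureHeader);
    // timingSafeEqual throws if the buffers differ in length, so check that first.
    if (expected.length !== received.length) {
        return false;
    }
    return timingSafeEqual(expected, received);
}
```

In the excerpt the secret is `user.id + "|" + tokenEntry.token.value`, so a valid signature ties the payload to one specific user's token.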

> Is it possible to poison our data across projects here?

I don't think it is possible to send forged webhook events to Gitpod.

However, your question does remind me of an issue where Pull Requests can come from forks, but trigger a Prebuild for the main repository, which can be problematic in some cases: https://github.com/gitpod-io/security/issues/26

jankeromnes · 2022-04-12T17:06:48Z

Many thanks for your very helpful review and suggestions @easyCZ! 🙏

> One alternative approach we could consider is to load the d_b_project_usage async from existing tables (Prebuild, WorkspaceInstance) to treat it like more of a cache. Doing this would achieve the following:
>
> - The handling code for a project doesn't need to be aware of the usage tracking - this eliminates opportunity for a bug where we forget to add the call to track
> - It derives the usage from a single place, rather than multiple
> - It makes the update process async which does not impact our latency on project related calls (the setting of the value).
>
> WDYT?

I like the idea of doing these things more async, because it could make our system simpler and much less subtle/magical.

(For example, I initially updated lastWorkspaceStart only when starting from a prebuild, so starting a workspace not from a prebuild wouldn't count -- I've since added the missing update code, but it's pretty easy to imagine a future refactor where someone that could be me accidentally forgets one of these updates again.)

However, it feels like we're lacking some good framework code for such asynchronous jobs that regularly process large amounts of data in the background. I agree that we have to start somewhere, but I'm a bit hesitant to volunteer this Pull Request to be the first "async background processing job", given that it can already provide value as is. 🛹

I've also replied to your questions in-line. Many thanks for diving deeper into all these systems! I find your questions and instincts for improvements quite relevant, and am keen to help improve things or provide context wherever I can. 🙂

Also, removing the hold, as I think this is good to go as is. Happy to improve anything in follow-ups. 🚀

/unhold

roboquat added do-not-merge/work-in-progress release-note size/L labels Apr 11, 2022

jankeromnes force-pushed the jx/inactive-projects branch from cd00c1b to 3de43d3 Compare April 11, 2022 10:10

roboquat added size/XL and removed size/L labels Apr 11, 2022

jankeromnes force-pushed the jx/inactive-projects branch from 3de43d3 to 192747b Compare April 11, 2022 10:12

roboquat added size/L and removed size/XL labels Apr 11, 2022

jankeromnes force-pushed the jx/inactive-projects branch 2 times, most recently from 3cd8f44 to 815b525 Compare April 11, 2022 15:51

gtsiolis mentioned this pull request Apr 11, 2022

Warn users that prebuilds have been paused for inactive projects #9232

Closed

jankeromnes force-pushed the jx/inactive-projects branch 2 times, most recently from 1a3b8f3 to 36a1a6f Compare April 11, 2022 18:28

jankeromnes force-pushed the jx/inactive-projects branch from 36a1a6f to 93a960c Compare April 11, 2022 19:04

[server] Move noisy GitLab timing log from log.info to log.debug

66d39e5

jankeromnes force-pushed the jx/inactive-projects branch from 93a960c to 904d37a Compare April 12, 2022 08:04

Stop running prebuilds for inactive projects (10+ weeks)

436ceea

Fixes #8911 Fixes prebuild rate limit

jankeromnes force-pushed the jx/inactive-projects branch from 904d37a to 436ceea Compare April 12, 2022 08:50

jankeromnes marked this pull request as ready for review April 12, 2022 09:33

jankeromnes requested a review from a team April 12, 2022 09:33

roboquat removed the do-not-merge/work-in-progress label Apr 12, 2022

github-actions bot added the "team: webapp" (Issue belongs to the WebApp team) label Apr 12, 2022

easyCZ reviewed Apr 12, 2022


jankeromnes requested review from easyCZ and a team April 12, 2022 09:42

jankeromnes removed the request for review from easyCZ April 12, 2022 09:43

jankeromnes assigned easyCZ Apr 12, 2022

easyCZ approved these changes Apr 12, 2022


roboquat added the do-not-merge/hold label Apr 12, 2022

roboquat removed the do-not-merge/hold label Apr 12, 2022

roboquat merged commit ed30d96 into main Apr 12, 2022

roboquat deleted the jx/inactive-projects branch April 12, 2022 17:07

roboquat added the "deployed: webapp" (Meta team change is running in production) and "deployed" (Change is completely running in production) labels Apr 19, 2022

