Skip to content

[bridge] Marks stuck stopping and pending instances as stopped #13350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 29, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
61 changes: 33 additions & 28 deletions components/ws-manager-bridge/src/bridge.ts
Original file line number Diff line number Diff line change
Expand Up @@ -452,7 +452,13 @@ export class WorkspaceManagerBridge implements Disposable {

// Control running workspace instances against ws-manager
try {
await this.controlNonStoppedWSManagerManagedInstances(ctx, nonStoppedInstances, clientProvider);
await this.controlNonStoppedWSManagerManagedInstances(
ctx,
nonStoppedInstances,
clientProvider,
this.config.timeouts.pendingPhaseSeconds,
this.config.timeouts.stoppingPhaseSeconds,
);

disconnectStarted = Number.MAX_SAFE_INTEGER; // Reset disconnect period
} catch (err) {
Expand Down Expand Up @@ -489,6 +495,8 @@ export class WorkspaceManagerBridge implements Disposable {
parentCtx: TraceContext,
runningInstances: RunningWorkspaceInfo[],
clientProvider: ClientProvider,
pendingPhaseSeconds: number,
stoppingPhaseSeconds: number,
) {
const installation = this.config.installation;

Expand All @@ -508,36 +516,33 @@ export class WorkspaceManagerBridge implements Disposable {
for (const [instanceId, ri] of runningInstancesIdx.entries()) {
const instance = ri.latestInstance;
const phase = instance.status.phase;
if (phase !== "running") {
// This below if block is to validate the planned fix
if (
phase === "pending" ||
phase === "creating" ||
phase === "initializing" ||
(phase === "stopping" &&
instance.stoppingTime &&
durationLongerThanSeconds(Date.parse(instance.stoppingTime), 10))
) {
log.info(
{ instanceId, workspaceId: instance.workspaceId },
"Logging to validate #12902. Should mark as stopped in database.",
{ installation },
{ phase },
);
}
log.debug({ instanceId }, "Skipping instance", {
phase: instance.status.phase,
creationTime: instance.creationTime,
region: instance.region,
});

// When ws-manager is not aware of the following instances outside of the timeout duration,
// they should be marked as stopped.
// pending states timeout is 1 hour after creationTime.
// stopping states timeout is 1 hour after stoppingTime.
if (
phase === "running" ||
(phase === "pending" &&
durationLongerThanSeconds(Date.parse(instance.creationTime), pendingPhaseSeconds)) ||
(phase === "stopping" &&
instance.stoppingTime &&
durationLongerThanSeconds(Date.parse(instance.stoppingTime), stoppingPhaseSeconds))
) {
log.info(
{ instanceId, workspaceId: instance.workspaceId },
"Database says the instance is present, but ws-man does not know about it. Marking as stopped in database.",
{ installation, phase },
);
await this.markWorkspaceInstanceAsStopped(ctx, ri, new Date());
continue;
}

log.info(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@laushinka Sorry for the long turnaround. I like the new layout of the loop! But we need it to include this case: If an instance in runningInstancesIdx is running, we still need to stop it unconditionally, as we do it at the moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense that we should handle the running state as we do it now. Thanks, will add it!

"Database says the instance is running, but wsman does not know about it. Marking as stopped in database.",
{ instanceId, workspaceId: instance.workspaceId, installation, phase },
);
await this.markWorkspaceInstanceAsStopped(ctx, ri, new Date());
log.debug({ instanceId }, "Skipping instance", {
phase: phase,
creationTime: instance.creationTime,
region: instance.region,
});
}

log.debug("Done controlling running instances.", { installation });
Expand Down
2 changes: 2 additions & 0 deletions components/ws-manager-bridge/src/config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@ export interface Configuration {
preparingPhaseSeconds: number;
buildingPhaseSeconds: number;
unknownPhaseSeconds: number;
pendingPhaseSeconds: number;
stoppingPhaseSeconds: number;
};

// emulatePreparingIntervalSeconds configures how often we check for Workspaces in phase "preparing" for clusters we do not govern
Expand Down
6 changes: 4 additions & 2 deletions install/installer/cmd/testdata/render/aws-setup/output.golden

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 4 additions & 2 deletions install/installer/cmd/testdata/render/gcp-setup/output.golden

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 4 additions & 2 deletions install/installer/cmd/testdata/render/minimal/output.golden

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@ func configmap(ctx *common.RenderContext) ([]runtime.Object, error) {
PreparingPhaseSeconds: 3600,
BuildingPhaseSeconds: 3600,
UnknownPhaseSeconds: 600,
PendingPhaseSeconds: 3600,
StoppingPhaseSeconds: 3600,
},
EmulatePreparingIntervalSeconds: 10,
StaticBridges: WSManagerList(ctx),
Expand Down
2 changes: 2 additions & 0 deletions install/installer/pkg/components/ws-manager-bridge/types.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,8 @@ type Timeouts struct {
PreparingPhaseSeconds int32 `json:"preparingPhaseSeconds"`
BuildingPhaseSeconds int32 `json:"buildingPhaseSeconds"`
UnknownPhaseSeconds int32 `json:"unknownPhaseSeconds"`
PendingPhaseSeconds int32 `json:"pendingPhaseSeconds"`
StoppingPhaseSeconds int32 `json:"stoppingPhaseSeconds"`
}

// WorkspaceCluster from components/gitpod-protocol/src/workspace-cluster.ts
Expand Down