fix(coderd): fix flake in TestAPI/ModifyAutostopWithRunningWorkspace
#18932
Fixes coder/internal#521
This happened due to a race condition in how `AwaitWorkspaceBuildJobCompleted` works. `AwaitWorkspaceBuildJobCompleted` waits until `/api/v2/workspacebuilds/{workspacebuild}/` returns a workspace build with `.Job.CompletedAt != nil`. The issue is that the returned `codersdk.WorkspaceBuild` can sometimes contain a build from before a provisioner job completed, but the provisioner job from after it completed.

Let me demonstrate:
Here we query the database for `database.WorkspaceBuild`.

coder/coderd/coderd.go
Lines 1409 to 1415 in a3f64f7
Inside the `workspaceBuild` route handler, we call `workspaceBuildsData`.

coder/coderd/workspacebuilds.go
Line 54 in a3f64f7
This then calls `GetProvisionerJobsByIDsWithQueuePosition`.

coder/coderd/workspacebuilds.go
Lines 852 to 856 in a3f64f7
As these two calls happen outside of a transaction, the state of the world can change underneath. This can result in an in-progress workspace build having a completed provisioner job attached to it.
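To make the interleaving concrete, here is a minimal, self-contained sketch of the same shape of bug. It is not coderd code: `store`, `buildFinalized`, `readSeparately`, and `readBoth` are hypothetical names, and the mutex only stands in for a database transaction. The `between` hook forces the job to complete at exactly the wrong moment so the result is deterministic.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical in-memory stand-in for the database.
type store struct {
	mu             sync.Mutex
	buildFinalized bool // build-row data written when the job completes (illustrative)
	jobCompleted   bool // whether the provisioner job row is marked complete
}

// snapshot is what a handler would assemble into its response.
type snapshot struct {
	BuildFinalized bool
	JobCompleted   bool
}

func (s *store) getBuild() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.buildFinalized
}

func (s *store) getJob() bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.jobCompleted
}

// completeJob updates both rows atomically, like a writer in a transaction.
func (s *store) completeJob() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.buildFinalized = true
	s.jobCompleted = true
}

// readSeparately mimics two queries issued outside a transaction.
// between runs after the first read and before the second.
func readSeparately(s *store, between func()) snapshot {
	build := s.getBuild()
	between() // the world changes underneath us here
	job := s.getJob()
	return snapshot{BuildFinalized: build, JobCompleted: job}
}

// readBoth reads both values under one lock, so no writer can interleave.
func (s *store) readBoth() snapshot {
	s.mu.Lock()
	defer s.mu.Unlock()
	return snapshot{BuildFinalized: s.buildFinalized, JobCompleted: s.jobCompleted}
}

func main() {
	s := &store{}

	torn := readSeparately(s, s.completeJob)
	fmt.Printf("separate reads: build finalized=%v, job completed=%v\n",
		torn.BuildFinalized, torn.JobCompleted)

	consistent := s.readBoth()
	fmt.Printf("single read:    build finalized=%v, job completed=%v\n",
		consistent.BuildFinalized, consistent.JobCompleted)
}
```

With separate reads the snapshot pairs a stale build with a completed job, which is the state a caller polling only on `.Job.CompletedAt != nil` would accept as finished. Reading both values inside one critical section (or, in a real database, one transaction) rules that pairing out.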
Note: The change in this PR only touches the flaky test; the underlying cause of the flake isn't being fixed. I'm happy to expand the scope of this PR to fix the cause, as that might also fix other flakes.