Description
Problem
As identified in PR #18932, there's a race condition in the workspace build API endpoint that can cause flaky tests and potentially inconsistent API responses.
The issue occurs in the /api/v2/workspacebuilds/{workspacebuild}/ endpoint, where two separate database queries happen outside of a transaction:
- Query for database.WorkspaceBuild in the route handler
- Call to GetProvisionerJobsByIDsWithQueuePosition in workspaceBuildsData
Because these calls happen outside of a transaction, the state can change between them, resulting in an in-progress workspace build having a completed provisioner job attached to it.
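The interleaving described above can be reproduced deterministically in a minimal sketch. Everything here is a hypothetical in-memory stand-in (memStore, build, job, finishBuild, fetchUnsynchronized, and the InProgress/Completed fields are illustrative names, not coder's actual types); the `between` hook simulates a concurrent writer that runs after the first read and before the second.

```go
package main

import "fmt"

// build and job are simplified, hypothetical stand-ins for the real rows.
type build struct {
	JobID      string
	InProgress bool
}

type job struct {
	ID        string
	Completed bool
}

// memStore is an illustrative in-memory database, not coder's database.Store.
type memStore struct {
	builds map[string]build
	jobs   map[string]job
}

func newStore() *memStore {
	return &memStore{
		builds: map[string]build{"build-1": {JobID: "job-1", InProgress: true}},
		jobs:   map[string]job{"job-1": {ID: "job-1"}},
	}
}

// finishBuild plays the role of the concurrent writer: it marks the build
// as finished and its provisioner job as completed.
func (s *memStore) finishBuild(buildID string) {
	b := s.builds[buildID]
	b.InProgress = false
	s.builds[buildID] = b

	j := s.jobs[b.JobID]
	j.Completed = true
	s.jobs[b.JobID] = j
}

// fetchUnsynchronized mimics the handler's shape: two independent reads
// with nothing preventing a write from landing between them.
func fetchUnsynchronized(s *memStore, buildID string, between func()) (build, job) {
	b := s.builds[buildID] // first read: the build row
	between()              // a concurrent writer may run here
	j := s.jobs[b.JobID]   // second read: the job row, possibly newer
	return b, j
}

func main() {
	s := newStore()
	b, j := fetchUnsynchronized(s, "build-1", func() { s.finishBuild("build-1") })
	// The stale build snapshot is paired with the fresh job snapshot.
	fmt.Printf("build in progress: %v, job completed: %v\n", b.InProgress, j.Completed)
}
```

The returned pair combines a stale build with a fresh job, which is the inconsistent state the flaky test observes.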
Code References
- Route handler (Lines 1409 to 1415 in a3f64f7):

      r.Route("/workspacebuilds/{workspacebuild}", func(r chi.Router) {
          r.Use(
              apiKeyMiddleware,
              httpmw.ExtractWorkspaceBuildParam(options.Database),
              httpmw.ExtractWorkspaceParam(options.Database),
          )
          r.Get("/", api.workspaceBuild)

- workspaceBuildsData call (coder/coderd/workspacebuilds.go, Line 54 in a3f64f7):

      data, err := api.workspaceBuildsData(ctx, []database.WorkspaceBuild{workspaceBuild})

- GetProvisionerJobsByIDsWithQueuePosition call (coder/coderd/workspacebuilds.go, Lines 852 to 856 in a3f64f7):

      jobs, err := api.Database.GetProvisionerJobsByIDsWithQueuePosition(ctx, database.GetProvisionerJobsByIDsWithQueuePositionParams{
          IDs:             jobIDs,
          StaleIntervalMS: provisionerdserver.StaleInterval.Milliseconds(),
      })
      if err != nil && !errors.Is(err, sql.ErrNoRows) {
Solution
The database queries should be wrapped in a transaction to ensure consistency, or the logic should be restructured to avoid the race condition.
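A rough sketch of the transactional variant follows, under the assumption that both reads can be moved inside one transaction helper. The memStore type and the inTx, finishBuild, and fetchConsistent names are hypothetical; a mutex stands in for the isolation a real database transaction would provide, and this is not the actual coder implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// build and job are simplified, hypothetical stand-ins for the real rows.
type build struct {
	JobID      string
	InProgress bool
}

type job struct {
	ID        string
	Completed bool
}

// memStore is an illustrative in-memory store; the mutex models
// transactional isolation between readers and writers.
type memStore struct {
	mu     sync.Mutex
	builds map[string]build
	jobs   map[string]job
}

func newStore() *memStore {
	return &memStore{
		builds: map[string]build{"build-1": {JobID: "job-1", InProgress: true}},
		jobs:   map[string]job{"job-1": {ID: "job-1"}},
	}
}

// inTx runs fn while holding the store lock, so no other inTx call can
// interleave with the reads or writes performed inside fn.
func (s *memStore) inTx(fn func(tx *memStore) error) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return fn(s)
}

// finishBuild is the writer: it updates the build and its job atomically.
func (s *memStore) finishBuild(buildID string) error {
	return s.inTx(func(tx *memStore) error {
		b := tx.builds[buildID]
		b.InProgress = false
		tx.builds[buildID] = b

		j := tx.jobs[b.JobID]
		j.Completed = true
		tx.jobs[b.JobID] = j
		return nil
	})
}

// fetchConsistent reads the build and its job inside one transaction, so
// both values come from the same point in time.
func fetchConsistent(s *memStore, buildID string) (build, job, error) {
	var b build
	var j job
	err := s.inTx(func(tx *memStore) error {
		var ok bool
		if b, ok = tx.builds[buildID]; !ok {
			return fmt.Errorf("build %q not found", buildID)
		}
		if j, ok = tx.jobs[b.JobID]; !ok {
			return fmt.Errorf("job %q not found", b.JobID)
		}
		return nil
	})
	return b, j, err
}

func main() {
	s := newStore()

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		_ = s.finishBuild("build-1")
	}()

	b, j, err := fetchConsistent(s, "build-1")
	wg.Wait()
	if err != nil {
		panic(err)
	}
	// Whichever side wins the lock, the pair is either (in progress, not
	// completed) or (finished, completed), never a mix of the two.
	fmt.Printf("consistent=%v\n", b.InProgress != j.Completed)
}
```

Two caveats apply to a real fix. Since the build row is loaded before workspaceBuildsData runs, both reads would presumably have to happen inside the same transaction. Also, under PostgreSQL's default READ COMMITTED isolation each statement sees its own snapshot, so the transaction would likely need REPEATABLE READ (or the two reads would need to be combined into a single query) to actually close the window.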
Impact
- Fixes flaky test TestAPI/ModifyAutostopWithRunningWorkspace
- May fix other similar flakes in the test suite
- Improves API consistency and reliability
Follow-up work identified from PR #18932