fix: make recording completion idempotent#1836
Conversation
| let status = resp.status(); | ||
| let error_body = resp | ||
| .text() | ||
| .await | ||
| .unwrap_or_else(|_| "<no response body>".to_string()); | ||
| if status == reqwest::StatusCode::CONFLICT | ||
| && error_body.contains("Muxing already in progress") |
There was a problem hiding this comment.
Brittle backward-compat string match
The check error_body.contains("Muxing already in progress") hard-codes the exact server error message. If that message is ever edited or the server returns a localized/structured JSON body instead of plain text, the condition silently stops matching: the response is treated as an ordinary error, the client retries three times, and the upload fails. Since the updated server now returns HTTP 200 for this case, this path is only exercised against older deployments — but it would be worth at minimum adding a comment that documents what server version this bridges, so a future maintainer knows when it can be removed.
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/desktop/src-tauri/src/api.rs
Line: 342-348
Comment:
**Brittle backward-compat string match**
The check `error_body.contains("Muxing already in progress")` hard-codes the exact server error message. If that message is ever edited or the server returns a localized/structured JSON body instead of plain text, the condition silently stops matching: the response is treated as an ordinary error, the client retries three times, and the upload fails. Since the updated server now returns HTTP 200 for this case, this path is only exercised against older deployments — but it would be worth at minimum adding a comment that documents what server version this bridges, so a future maintainer knows when it can be removed.
How can I resolve this? If you propose a fix, please make it concise.| } else { | ||
| return c.json({ error: "Failed to claim recording complete" }, 500); | ||
| } |
There was a problem hiding this comment.
Unreachable-in-theory branch returns an opaque 500
This else arm fires only when the UPDATE returned 0 affected rows yet the follow-up SELECT found a row whose phase is neither idempotent (processing / generating_thumbnail / complete) nor absent. In normal operation that means the row transitioned from processing → error between the UPDATE and the SELECT (a tight race with the media-server webhook). Because error is claimable, the Rust client's next retry attempt would successfully claim and re-mux — but the 500 response here is indistinguishable from a genuine server failure. Adding a log with the actual existing.phase would make this edge case much easier to diagnose in production.
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/web/app/api/upload/[...route]/recording-complete.ts
Line: 131-133
Comment:
**Unreachable-in-theory branch returns an opaque 500**
This `else` arm fires only when the `UPDATE` returned 0 affected rows yet the follow-up `SELECT` found a row whose phase is neither idempotent (`processing` / `generating_thumbnail` / `complete`) nor absent. In normal operation that means the row transitioned from `processing` → `error` between the UPDATE and the SELECT (a tight race with the media-server webhook). Because `error` is claimable, the Rust client's next retry attempt would successfully claim and re-mux — but the 500 response here is indistinguishable from a genuine server failure. Adding a log with the actual `existing.phase` would make this edge case much easier to diagnose in production.
How can I resolve this? If you propose a fix, please make it concise.| .text() | ||
| .await | ||
| .unwrap_or_else(|_| "<no response body>".to_string()); | ||
| if status == reqwest::StatusCode::CONFLICT |
There was a problem hiding this comment.
String-matching the error body feels a bit brittle here (any wording change would make desktop start erroring again). If 409 is only used to signal the idempotent "already processing" case, I’d treat all 409s as OK.
| if status == reqwest::StatusCode::CONFLICT | |
| if status == reqwest::StatusCode::CONFLICT { | |
| return Ok(()); | |
| } |
| "processing", | ||
| "generating_thumbnail", | ||
| "complete", | ||
| ] satisfies RecordingCompletePhase[]; |
There was a problem hiding this comment.
Small TS typing nit: satisfies validates the elements, but the inferred type here is still string[]. Adding as const keeps it a readonly tuple of literal values.
| ] satisfies RecordingCompletePhase[]; | |
| export const RECORDING_COMPLETE_UNCLAIMABLE_PHASES = [ | |
| "processing", | |
| "generating_thumbnail", | |
| "complete", | |
| ] as const satisfies readonly RecordingCompletePhase[]; |
Greptile Summary
This PR makes the recording-completion flow idempotent end-to-end. The server now claims the mux slot before generating presigned URLs (reducing wasted work on concurrent requests), returns HTTP 200 with typed
alreadyProcessing/alreadyCompleteresponses instead of 409 for duplicate calls, and the Rust desktop client is updated to treat the legacy 409 "Muxing already in progress" response as a success for backward compatibility with older server deployments.recording-complete.ts): Claim step moved before the expensive presigned-URL/manifest work;getRecordingCompleteIdempotentResulthandles already-processing and already-complete states gracefully, with a nested insert path and race-condition re-check for the no-prior-record case.api.rs,upload.rs): 409 with the old error message is now silently treated as success; the last retry error is now appended to the terminal failure message.recording-complete-utils.ts,*.test.ts): Shared phase logic extracted into a typed utility module with unit test coverage.Confidence Score: 4/5
The change is safe to merge; the idempotency logic is well-structured and the existing test suite covers the new utility functions.
The core idempotency logic is sound: claim-before-work is the correct ordering, concurrent-request races are handled at multiple checkpoints, and the catch block correctly resets the phase to error on failure. Two minor rough edges exist — a hard-coded server error message string in the Rust backward-compat check, and an opaque 500 branch reachable via a narrow race condition — but neither affects correctness in normal operation.
The else-branch 500 path in recording-complete.ts (line 132) and the string-match guard in api.rs are worth a second look, but they do not block the change.
Important Files Changed
Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "fix: make recording completion idempoten..." | Re-trigger Greptile