Skip to content

spawnSync: do not forward SIGPWR on Linux#30956

Open
robobun wants to merge 5 commits into
mainfrom
farm/a5eb0a5e/spawnsync-sigpwr
Open

spawnSync: do not forward SIGPWR on Linux#30956
robobun wants to merge 5 commits into
mainfrom
farm/a5eb0a5e/spawnsync-sigpwr

Conversation

@robobun
Copy link
Copy Markdown
Collaborator

@robobun robobun commented May 18, 2026

Fuzzer fingerprint: 11551e051c08db2c

What

Remove SIGPWR from the set of signals that spawnSync forwards to the child on Linux.

Why

On Linux, JSC uses SIGPWR for GC thread suspend/resume (WTF/wtf/posix/ThreadingPOSIX.cpp). Bun__registerSignalsForForwarding() was replacing the SIGPWR handler (with SA_RESETHAND) and stashing the previous handler in a process-global, unsynchronized array.

Bun.openInEditor spawns a detached thread per call that runs spawnSync internally. With many concurrent calls, the register/unregister paths race on the global table and can leave SIGPWR at SIG_DFL. The next time the GC sends SIGPWR to suspend a thread, the process terminates with signal 30.

Minimal repro:

function f() {
  try { f(); } catch {}
  Bun.openInEditor("x");
  throw 1;
}
try { f(); } catch {}

SIGPWR is not a user-visible signal in this context (the npm list it was copied from targets Node/V8, where SIGPWR is unused), so there is no reason to forward it.

Test

Added a Linux-only regression test that points editor auto-detection at a fake codesleep binary, fires many concurrent openInEditor calls so their spawnSync signal-forwarding windows overlap, then allocates enough to trigger a concurrent GC. Before this change the child dies with SIGPWR; after, it exits cleanly.

JSC uses SIGPWR for GC thread suspend/resume on Linux. spawnSync's
signal-forwarding helper installed a SA_RESETHAND handler for SIGPWR
and saved the previous handler into a process-global array with no
synchronization. Bun.openInEditor spawns a detached thread per call
that runs spawnSync; concurrent calls raced on the previous-handler
table and left SIGPWR at SIG_DFL, so the next GC suspend terminated
the process with signal 30.
@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented May 18, 2026

Updated 2:30 AM PT - May 18th, 2026

@robobun, your commit 7e2c283ae6aa6718b2a33752daa0a4e36c768c21 passed in Build #55726! 🎉


🧪   To try this PR locally:

bunx bun-pr 30956

That installs a local version of the PR into your bun-30956 executable, so you can run:

bun-30956 --bun

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

Review Change Stack

Walkthrough

Removed SIGPWR from the Linux-only signal-forwarding macro used by spawnSync, added process-wide mutex and reference-counting around signal handler register/unregister, and added a Linux regression test that stresses overlapping spawns and GC to confirm the SIGPWR handler remains intact.

Changes

Signal Forwarding Fix

Layer / File(s) Summary
Mutex and depth counter for signal forwarding
src/jsc/bindings/c-bindings.cpp
Add a global std::mutex and signalForwardingDepth reference counter to coordinate process-wide signal-forwarding install/remove logic.
Remove SIGPWR from Linux-only signals
src/jsc/bindings/c-bindings.cpp
Updated FOR_EACH_LINUX_ONLY_SIGNAL to exclude SIGPWR, keeping SIGPOLL only.
Guard register/unregister with mutex and depth checks
src/jsc/bindings/c-bindings.cpp
Bun__registerSignalsForForwarding() and Bun__unregisterSignalsForForwarding() now lock the global mutex and use the depth counter to early-return on nested calls so only the outermost call updates signal dispositions.
Linux regression test for SIGPWR and GC stress
test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts
Added a Linux-gated test that symlinks sleep as the editor, spawns overlapping Bun editor processes while forcing GC, and asserts the child exits normally (no signal termination, exit code 0) and stdout/stderr meet expectations.
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: removing SIGPWR from signals forwarded by spawnSync on Linux.
Description check ✅ Passed The description provides comprehensive coverage: What (remove SIGPWR), Why (JSC GC conflict and race condition), and Test (Linux-only regression test). All required template sections are well-addressed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/jsc/bindings/c-bindings.cpp (1)

936-953: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Signal-forwarding globals are still unsafe under overlapping spawnSync calls.

Removing SIGPWR fixes this specific crash path, but the forwarding mechanism still relies on unsynchronized process-global state (previous_actions, Bun__currentSyncPID, Bun__pendingSignalToSend). With concurrent spawnSync windows, handlers/PID state can still be clobbered and restored incorrectly for other forwarded signals.

Also applies to: 915-916, 979-1015

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/jsc/bindings/c-bindings.cpp` around lines 936 - 953, The
signal-forwarding globals (previous_actions, Bun__currentSyncPID,
Bun__pendingSignalToSend) used by spawnSync are not safe for overlapping calls;
serialize access and make the handler logic robust: introduce a single shared
mutex (e.g., spawnSyncSignalMutex) and lock it around any code that
installs/restores signal handlers and mutates previous_actions,
Bun__currentSyncPID, and Bun__pendingSignalToSend inside the spawnSync
implementation and its helper functions, and change the signal handler to verify
the PID against an atomic Bun__currentSyncPID before acting (so stray signals
are ignored); this ensures handlers/state cannot be clobbered by concurrent
spawnSync calls and restores handlers only when the caller that set them is
still the owner.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/jsc/bindings/c-bindings.cpp`:
- Around line 936-953: The signal-forwarding globals (previous_actions,
Bun__currentSyncPID, Bun__pendingSignalToSend) used by spawnSync are not safe
for overlapping calls; serialize access and make the handler logic robust:
introduce a single shared mutex (e.g., spawnSyncSignalMutex) and lock it around
any code that installs/restores signal handlers and mutates previous_actions,
Bun__currentSyncPID, and Bun__pendingSignalToSend inside the spawnSync
implementation and its helper functions, and change the signal handler to verify
the PID against an atomic Bun__currentSyncPID before acting (so stray signals
are ignored); this ensures handlers/state cannot be clobbered by concurrent
spawnSync calls and restores handlers only when the caller that set them is
still the owner.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b3b75814-6478-48d0-97e4-fbcdfd357797

📥 Commits

Reviewing files that changed from the base of the PR and between ad68af8 and 4b9d469.

📒 Files selected for processing (2)
  • src/jsc/bindings/c-bindings.cpp
  • test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts

@github-actions
Copy link
Copy Markdown
Contributor

Found 1 issue this PR may fix:

  1. Flaky SIGILL crash in JSC garbage collector during bun test on Linux x64 (1.3.7 and 1.3.9) #27312 - Flaky SIGILL crash in JSC garbage collector during bun test on Linux x64; the intermittent GC crash in SlotVisitor::drain is consistent with a corrupted signal handler from the SIGPWR race condition this PR fixes

If this is helpful, copy the block below into the PR description to auto-close this issue on merge.

Fixes #27312

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional findings (outside current diff — PR may have been updated during review):

  • 🟡 test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts:54 — nit: expect(stderr).not.toContain("AddressSanitizer") is the negative crash-marker pattern that root CLAUDE.md:126 says to NEVER write ("These tests will never fail in CI"). The positive assertions on signalCode, stdout === "ok", and exitCode === 0 already fail if the process dies via SIGPWR or ASAN aborts, so this line adds no signal — just drop it.

    Extended reasoning...

    What the issue is

    test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts:54 adds:

    expect(stderr).not.toContain("AddressSanitizer");

    This is a negative assertion on a crash-marker string in subprocess output. The repo's root CLAUDE.md (line 126) explicitly forbids this pattern:

    NEVER write tests that check for no "panic" or "uncaught exception" or similar in the test output. These tests will never fail in CI.

    "AddressSanitizer" is exactly "or similar" — it's a crash-marker string just like "panic" or "uncaught exception".

    Why this assertion adds no signal — step by step

    Consider the two CI configurations:

    1. Non-ASAN build (the common case in CI). AddressSanitizer is not compiled in, so stderr can never contain "AddressSanitizer" regardless of whether the bug is present. The assertion is vacuously true and cannot fail — this is precisely what the CLAUDE.md rule warns about.

    2. ASAN build, bug regresses. If the SIGPWR handler gets clobbered and the process is terminated by signal 30, the child does not exit cleanly. In that case:

      • proc.signalCode is "SIGPWR", not null → line 53 expect(proc.signalCode).toBeNull() fails.
      • The child never reaches console.log("ok") → line 55 expect(stdout.trim()).toBe("ok") fails.
      • exitCode is non-zero / null → line 56 expect(exitCode).toBe(0) fails.

      And if ASAN itself reports a memory error and aborts, it does so with a non-zero exit code (and typically SIGABRT), so the same three positive assertions catch it before the stderr check matters.

    In every scenario where not.toContain("AddressSanitizer") would fail, at least one of the surrounding positive assertions has already failed. There is no execution path where this line is the sole failing assertion.

    Why the existing assertions are sufficient

    The test already follows the recommended structure: it asserts on signalCode === null, stdout.trim() === "ok", and exitCode === 0. These are positive assertions on observable process state and they fully cover the regression being tested (process terminated by SIGPWR during GC). The stderr check is pure belt-and-suspenders that violates a documented "NEVER" rule.

    Fix

    Just delete line 54. Replacing it with expect(stderr).toBe("") (the CLAUDE.md-preferred form) is also an option but may be too strict here — 64 concurrent openInEditor calls against a fake code binary could plausibly emit incidental noise on stderr without indicating failure, so dropping the line entirely is the safer choice.

    This is a test-hygiene/convention nit, not a functional bug in the SIGPWR fix itself.

Comment thread src/jsc/bindings/c-bindings.cpp
Comment thread src/jsc/bindings/c-bindings.cpp
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts`:
- Around line 41-49: Change the subprocess spawn options to pipe stderr (replace
stderr: "inherit" with stderr: "pipe"), capture proc.stderr.text() alongside
proc.stdout.text() and proc.exited (e.g. await Promise.all([... ,
proc.stderr.text(), proc.exited])), store that value in a stderr variable, and
insert the CI-friendly assertion if (exitCode !== 0) { expect(stderr).toBe("");
} immediately before expect(exitCode).toBe(0); keep the existing checks on
proc.signalCode and stdout.trim() as-is.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bea85ffa-50b0-4816-90cc-048b3b916ad6

📥 Commits

Reviewing files that changed from the base of the PR and between ac02046 and 2d5b42e.

📒 Files selected for processing (1)
  • test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts

Comment on lines +41 to +49
stdout: "pipe",
stderr: "inherit",
});

const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);

expect(proc.signalCode).toBeNull();
expect(stdout.trim()).toBe("ok");
expect(exitCode).toBe(0);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Pipe stderr for better CI failure diagnostics.

With stderr: "inherit", you cannot programmatically assert stderr contents or surface them in the test diff on failure. Per repo conventions for subprocess tests using bunEnv, prefer piping stderr and asserting it's empty (after filtering ASAN noise if needed).

♻️ Suggested change
     stdout: "pipe",
-    stderr: "inherit",
+    stderr: "pipe",
   });

-  const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);
+  const [stdout, stderr, exitCode] = await Promise.all([
+    proc.stdout.text(),
+    proc.stderr.text(),
+    proc.exited,
+  ]);

   expect(proc.signalCode).toBeNull();
   expect(stdout.trim()).toBe("ok");
+  if (exitCode !== 0) {
+    expect(stderr).toBe("");
+  }
   expect(exitCode).toBe(0);

Based on learnings: "In oven-sh/bun Jest/Bun test files under test/js/ that spawn subprocesses using bunEnv from the harness module, it's safe and intentional to assert expect(stderr).toBe("") unconditionally" and "write if (exitCode !== 0) { expect(stderr).toBe(""); } immediately before expect(exitCode).toBe(0)."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
stdout: "pipe",
stderr: "inherit",
});
const [stdout, exitCode] = await Promise.all([proc.stdout.text(), proc.exited]);
expect(proc.signalCode).toBeNull();
expect(stdout.trim()).toBe("ok");
expect(exitCode).toBe(0);
stdout: "pipe",
stderr: "pipe",
});
const [stdout, stderr, exitCode] = await Promise.all([
proc.stdout.text(),
proc.stderr.text(),
proc.exited,
]);
expect(proc.signalCode).toBeNull();
expect(stdout.trim()).toBe("ok");
if (exitCode !== 0) {
expect(stderr).toBe("");
}
expect(exitCode).toBe(0);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts` around lines 41 - 49, Change
the subprocess spawn options to pipe stderr (replace stderr: "inherit" with
stderr: "pipe"), capture proc.stderr.text() alongside proc.stdout.text() and
proc.exited (e.g. await Promise.all([... , proc.stderr.text(), proc.exited])),
store that value in a stderr variable, and insert the CI-friendly assertion if
(exitCode !== 0) { expect(stderr).toBe(""); } immediately before
expect(exitCode).toBe(0); keep the existing checks on proc.signalCode and
stdout.trim() as-is.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — one-line removal of SIGPWR from the Linux forwarding list, clearly correct given JSC's use of SIGPWR for GC thread suspension.

Extended reasoning...

Overview

This PR removes M(SIGPWR) from the FOR_EACH_LINUX_ONLY_SIGNAL macro in src/jsc/bindings/c-bindings.cpp (one deleted line plus a 3-line explanatory comment) and adds a Linux-only regression test in test/js/bun/spawn/spawnSync-sigpwr-gc.test.ts that stresses concurrent openInEditorspawnSync calls against GC to verify the SIGPWR handler is no longer clobbered.

Security risks

None. The change is purely subtractive — it stops spawnSync from replacing the process-wide SIGPWR handler on Linux. There is no new attack surface, no input handling, and no auth/crypto/permissions involvement. The test only manipulates a temp dir, a symlink to sleep, and PATH for a child Bun process.

Level of scrutiny

Low-to-moderate. Signal-handler registration is process-global state, so changes here deserve a careful read, but this particular change is a strict reduction in scope: one fewer signal is touched. The rationale (JSC uses SIGPWR for GC thread suspend/resume in WTF/wtf/posix/ThreadingPOSIX.cpp) is correct and well-documented, and the npm list this was copied from targets V8 where SIGPWR is unused — so dropping it has no user-visible downside while fixing a real fuzzer-found crash.

Other factors

My two earlier inline comments were explicitly flagged as pre-existing issues (SIGPOLL/SIGIOT aliasing, and the stale main-thread-only comment / broader previous_actions[] race). The author verified the aliasing observation and reasonably scoped both out to a follow-up to keep this PR focused on the fuzzer crash; neither blocks this change. CodeRabbit's stderr-piping suggestion is a test-diagnostics nit, not a correctness issue. No bugs were found by the bug-hunting system on this revision. CI had failures on an earlier commit and a follow-up commit (test: drop negative crash-marker assertion) was pushed; the source change itself is sound regardless.

@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented May 18, 2026

CI status: 68/69 checks pass. The one failure is darwin-aarch64-build-cpp with ninja: build stopped: interrupted by user at step 1093/1186 (compiling boringssl/.../bcm.cc.o) — build agent was interrupted mid-compile, not a code error. This PR's only source change is inside #if OS(LINUX) and isn't compiled on macOS; the same job passed on the previous build (55633).

The new spawnSync-sigpwr-gc.test.ts passed on all Linux targets including the previously-flaky ubuntu-25.04-aarch64.

Already re-rolled once for unrelated flakes on build 55633 (node-http-backpressure SIGKILL, bun-upgrade GitHubIsDown). Not re-rolling again; this just needs the macOS build job retried.

Bun.openInEditor spawns detached threads that call spawnSync, so
register/unregister can run concurrently. Guard previous_actions[]
with a mutex and depth counter so only the outermost pair installs
and restores handlers. Also drop SIGIOT/SIGPOLL from the list since
they alias SIGABRT/SIGIO and the second sigaction() was overwriting
the saved handler with our own.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/jsc/bindings/c-bindings.cpp (1)

1016-1022: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only clear the forwarded PID on the outermost unregister.

Line 1018 resets Bun__currentSyncPID before the depth check. If two spawnSync calls overlap, the first inner unregister leaves signalForwardingDepth > 0 but disables forwarding for the still-active outer call, so a later signal in that window never reaches the child.

Suggested fix
 extern "C" void Bun__unregisterSignalsForForwarding()
 {
-    Bun__currentSyncPID = 0;
-
     std::lock_guard<std::mutex> lock(signalForwardingLock);
     if (--signalForwardingDepth != 0)
         return;
+
+    Bun__currentSyncPID = 0;
 
 `#define` UNREGISTER_SIGNAL(SIG)                                \
     if (sigaction(SIG, &previous_actions[SIG], NULL) == -1) { \
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/jsc/bindings/c-bindings.cpp` around lines 1016 - 1022, The code resets
Bun__currentSyncPID before checking signalForwardingDepth, which clears the
forwarded PID prematurely for nested spawnSync calls; modify
Bun__unregisterSignalsForForwarding so it acquires signalForwardingLock,
decrements signalForwardingDepth, and only when signalForwardingDepth reaches 0
does it clear Bun__currentSyncPID and return—leave Bun__currentSyncPID unchanged
when depth stays > 0 to ensure outer calls still forward signals.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/jsc/bindings/c-bindings.cpp`:
- Around line 1016-1022: The code resets Bun__currentSyncPID before checking
signalForwardingDepth, which clears the forwarded PID prematurely for nested
spawnSync calls; modify Bun__unregisterSignalsForForwarding so it acquires
signalForwardingLock, decrements signalForwardingDepth, and only when
signalForwardingDepth reaches 0 does it clear Bun__currentSyncPID and
return—leave Bun__currentSyncPID unchanged when depth stays > 0 to ensure outer
calls still forward signals.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 465df2fa-68f7-42bb-8adf-a435cfd0251b

📥 Commits

Reviewing files that changed from the base of the PR and between 2d5b42e and 7e2c283.

📒 Files selected for processing (1)
  • src/jsc/bindings/c-bindings.cpp

@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented May 18, 2026

CI on 7e2c283 (the reentrancy fix): 73/74 checks pass. The one failure is darwin-14-aarch64-test-bun with status Expired — the job timed out waiting for an agent and never ran. All 15 other test-bun jobs passed, including darwin-26-aarch64 (same arch, newer macOS) and every Linux target where the new spawnSync-sigpwr-gc test runs.

No test failures on this build; only auto-retried flakes in the warning annotations (hot.test.ts, spawn-stdin-readable-stream.test.ts, spawn.test.ts timeout — all on Windows and all pre-existing).

Already used my one retrigger on build 55649; not pushing another. The expired darwin-14-aarch64 job just needs a retry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants