lexer: step past unknown byte on TSyntaxError in next_inside_jsx_element#30969

Merged
Jarred-Sumner merged 8 commits into
mainfrom
farm/0ae593db/fix-30959-scope-loc-assert
May 18, 2026
Conversation

@robobun
Collaborator

@robobun robobun commented May 18, 2026

Fixes #30959.

Repro

$ echo '<r L=((' > /tmp/bug.tsx
$ bun-debug build /tmp/bug.tsx

panic: Scope location must be greater than previous
6 must be greater than 6 (src/bun_core/output.rs:1053:9)

Release builds print the two parse errors and recover cleanly; only debug builds trip the assertion.

Root cause

In next_inside_jsx_element the TSyntaxError branch sets end = current but never advances current / code_point past the bad byte — every other branch in that switch calls step() which leaves current = end + 1, the invariant the rest of next() depends on. The TSyntaxError branch violates it.

For <r L=(( the lexer reports start=5, end=6, current=6, code_point=0x28 (the first `(`) after tokenising it as TSyntaxError. The subsequent p.lexer.expect(T::TOpenBrace) in parse_jsx_prop_value_identifier:

  1. logs Expected "{" but found "(" via add_range_error (returns Ok),
  2. calls self.next(), which re-dispatches on the un-stepped code_point=0x28, matches ( → TOpenParen, and step_with reads contents[current=6] (the second `(`), updates end = 6, current = 7, code_point = 0x28. Final state: start=6 end=6 current=7 tok=TOpenParen, a zero-length TOpenParen at offset 6.

pfx_t_open_paren then captures loc=6, calls next() one more time which advances end to 7 and re-tokenises the same ( as a real TOpenParen also at start=6. parse_paren_expr pushes two FunctionArgs scopes at offset 6 and the debug assertion in push_scope_for_parse_pass fires.
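
The strict ordering check that fires here can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `ScopeOrder`, `ScopeKind`, and `push` are hypothetical names invented for illustration, not Bun's actual types, and the real push_scope_for_parse_pass does considerably more than this.

```rust
// Hypothetical model of a parse-pass scope list that requires strictly
// increasing source offsets. None of these names come from Bun's source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ScopeKind {
    FunctionArgs,
    Block,
}

#[derive(Debug, Default)]
pub struct ScopeOrder {
    // (source offset, kind) for every scope pushed so far, in push order.
    scopes: Vec<(u32, ScopeKind)>,
}

impl ScopeOrder {
    /// Records a scope. Returns Err when the offset does not strictly
    /// increase, which is the condition the debug assertion reports.
    pub fn push(&mut self, loc: u32, kind: ScopeKind) -> Result<(), String> {
        if let Some(&(prev, _)) = self.scopes.last() {
            if loc <= prev {
                return Err(format!("{loc} must be greater than {prev}"));
            }
        }
        self.scopes.push((loc, kind));
        Ok(())
    }
}
```

In this model, pushing two FunctionArgs scopes at offset 6 makes the second push fail, which mirrors the "6 must be greater than 6" message in the repro.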

Fix

Call self.step() immediately after setting TSyntaxError in next_inside_jsx_element. end was already advanced on the line above, so the error range [start, end) is unchanged and the Expected "{" but found "(" diagnostic still points at the right source position. Subsequent next() dispatches on the next byte and emits one correct token, so the parser never sees a duplicate scope offset and the push_scope_for_parse_pass assertion keeps its original strict form.
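
The effect of the extra step() can be reproduced with a toy lexer. The sketch below is an assumption-laden model, not the code in src/js_parser/lexer.rs: `ToyLexer`, `Tok`, `next_inside_jsx`, and `recover` are made-up names, the JSX lexer is reduced to "every byte is an error", and the EOF handling is simplified. It only follows the start/end/current/code_point bookkeeping described above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tok {
    OpenParen,
    SyntaxError,
    Other,
    EndOfFile,
}

// Hypothetical miniature lexer; field names follow the PR description.
pub struct ToyLexer<'a> {
    contents: &'a [u8],
    pub start: usize,
    pub end: usize,
    pub current: usize,
    pub code_point: Option<u8>, // None models end of file
    pub token: Tok,
}

impl<'a> ToyLexer<'a> {
    /// Positions the lexer so the byte at `index` is the pending code point,
    /// with the usual invariant current == end + 1.
    pub fn at(contents: &'a [u8], index: usize) -> Self {
        ToyLexer {
            contents,
            start: index,
            end: index,
            current: index + 1,
            code_point: contents.get(index).copied(),
            token: Tok::Other,
        }
    }

    /// Sets end to the old current, then loads the next byte and advances.
    fn step(&mut self) {
        self.end = self.current;
        self.code_point = self.contents.get(self.current).copied();
        if self.code_point.is_some() {
            self.current += 1;
        }
    }

    /// Models the JSX-context error branch; `step_on_error` toggles the fix.
    pub fn next_inside_jsx(&mut self, step_on_error: bool) {
        self.start = self.end;
        match self.code_point {
            None => self.token = Tok::EndOfFile,
            Some(_) => {
                self.end = self.current;
                self.token = Tok::SyntaxError;
                if step_on_error {
                    self.step();
                }
            }
        }
    }

    /// Models the main lexer: dispatches on whatever code_point holds.
    pub fn next(&mut self) {
        self.start = self.end;
        match self.code_point {
            None => self.token = Tok::EndOfFile,
            Some(b'(') => {
                self.step();
                self.token = Tok::OpenParen;
            }
            Some(_) => {
                self.step();
                self.token = Tok::Other;
            }
        }
    }
}

/// Runs recovery on `<r L=((` starting at the first `(` and returns the
/// (start, end) range of the error token and of the next two tokens.
pub fn recover(step_on_error: bool) -> [(usize, usize); 3] {
    let mut lx = ToyLexer::at(b"<r L=((\n", 5);
    lx.next_inside_jsx(step_on_error);
    let err = (lx.start, lx.end);
    lx.next();
    let first = (lx.start, lx.end);
    lx.next();
    let second = (lx.start, lx.end);
    [err, first, second]
}
```

In this model `recover(false)` yields the error range (5, 6) followed by a zero-length token (6, 6) and a second token (6, 7) at the same start, while `recover(true)` yields (5, 6), (6, 7), (7, 8): the error range is the same and no two tokens share an offset.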

Verification

test/regression/issue/jsx-template-string-crash.test.ts — new #30959 case asserts that the release-mode error output (Expected "{" but found "(" + Unexpected }) is produced in debug mode too. Fails with a 5s timeout on main (the panic hangs the process); passes with the fix.

The existing JSX lexer should not crash with slice bounds issues regression snapshot in the same file shifts from Unexpected > at col 37 to Unterminated string literal at col 35 for <div a=``/>. That shift is behavioural: the recovery lexer no longer re-dispatches on the first backtick, so the second backtick is now what gets tokenised as the template opener (which then runs to EOF — "unterminated"). The new diagnostic points at the actual unterminated string rather than the later >; both are valid parse errors for the same malformed input.

Transpiler / bundler JSX suites (test/bundler/transpiler/transpiler.test.js, test/bundler/bundler_jsx.test.ts) still pass.

Parsing `<r L=((` in a debug build panicked with "Scope location must
be greater than previous" in push_scope_for_parse_pass.

In the JSX prop-value recovery path, next_inside_jsx_element returns
TSyntaxError on the first '(' without stepping past it. The subsequent
expect(TOpenBrace) logs the error and calls next(), which then
mis-dispatches on the still-in-code_point '(' and produces a
zero-length TOpenParen at the same source offset. parse_paren_expr
then pushes two FunctionArgs scopes at that offset, tripping the
strict-monotonicity debug assertion.

Release builds already reported both parse errors and recovered
cleanly; the debug check was stricter than release. Skip the
scope-location monotonicity assertion once an error has been logged,
so debug matches release on error-recovery paths.
@coderabbitai
Contributor

coderabbitai Bot commented May 18, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Walkthrough

Lexer now advances past an invalid byte after emitting a JSX syntax error to avoid reprocessing the same input. Tests updated: one snapshot extended and one new concurrent regression test added for issue #30959 asserting two parse errors, empty stdout, and exit code 1.

Changes

JSX Lexer Error Recovery Fix

| Layer / File(s) | Summary |
|---|---|
| Lexer error recovery advancement in JSX (src/js_parser/lexer.rs) | After emitting T::TSyntaxError for an unexpected byte in JSX content, the lexer calls self.step() to advance code_point/current past the bad byte, ensuring the next next() call processes the following byte instead of retrying the same one; the original error range is preserved. |
| Regression tests for JSX parsing errors (test/regression/issue/jsx-template-string-crash.test.ts) | Existing "JSX lexer should not crash with slice bounds issues" test snapshot updated to include an additional Unterminated string literal parse error. New concurrent test for issue #30959 spawns Bun with malformed JSX inline snippet, normalizes stderr to remove version lines, asserts two expected parse errors via snapshot, and verifies empty stdout and exit code 1. |
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'lexer: step past unknown byte on TSyntaxError in next_inside_jsx_element' directly describes the code change and fix location, accurately summarizing the main technical modification. |
| Description check | ✅ Passed | The description comprehensively covers the root cause, fix details, and verification steps; it directly addresses both required template sections (what the PR does and verification approach). |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #30959 by fixing the debug assertion panic in push_scope_for_parse_pass when parsing invalid JSX with nested parentheses, implementing the required lexer step() call and adding regression tests. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the reported issue: lexer fix in next_inside_jsx_element, new regression test, and expected snapshot update for existing JSX test case are all directly related to resolving the scope location assertion. |

@robobun
Collaborator Author

robobun commented May 18, 2026

Updated 1:42 AM PT - May 18th, 2026

@robobun, your commit e49af9ce3f0d07044d959d929fec5ba80fd67fee passed in Build #55717! 🎉


🧪   To try this PR locally:

bunx bun-pr 30969

That installs a local version of the PR into your bun-30969 executable, so you can run:

bun-30969 --bun

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/regression/issue/jsx-template-string-crash.test.ts`:
- Around line 32-67: This regression test named 'test("`#30959` JSX attribute with
invalid '(' value parses cleanly in debug builds"...' belongs in the
issue-numbered regression tests; move the test block out of
jsx-template-string-crash.test.ts and place it into a new issue-numbered test
file named 30959.test.ts (or split accordingly) following the
test/regression/issue/* convention, keeping the test body, assertions
(normalizeBunSnapshot, expect stdout/exitCode) and Bun.spawn invocation
unchanged so CI recognizes it as a targeted `#30959` regression test.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 96219220-44d8-43d6-b3fb-525157d5a4f4

📥 Commits

Reviewing files that changed from the base of the PR and between 9ecb985 and 3c9a387.

📒 Files selected for processing (2)
  • src/js_parser/p.rs
  • test/regression/issue/jsx-template-string-crash.test.ts

Comment thread test/regression/issue/jsx-template-string-crash.test.ts Outdated
Comment thread src/js_parser/p.rs Outdated
Comment thread test/regression/issue/jsx-template-string-crash.test.ts Outdated
Review feedback (#30969):

- `has_errors()` checks the absolute error count on the shared `Log`,
  which may already carry errors from earlier files in the same parse
  session. Capture the baseline at `P::init` (`orig_error_count`) and
  compare `log.errors > self.orig_error_count` instead — mirrors the
  `orig_error_count` pattern at `parse_entry.rs:762`/`:876` that
  already gates the visit-pass bailout. Without this, an unrelated error
  from an earlier file would silently disable the debug-only
  scope-location invariant for every subsequent file in the session.

- Mark the new #30959 regression test as `test.concurrent` per
  `test/CLAUDE.md` — it spawns an async subprocess and shares no
  state with the pre-existing test in the file.
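
The difference between the absolute check and the per-file baseline can be sketched as follows. This is an illustrative model only: `SharedLog`, `FileParser`, and `logged_error_in_this_file` are hypothetical stand-ins for the shared `Log` and `P`, and only the `orig_error_count` comparison is taken from the feedback above.

```rust
/// Hypothetical stand-in for a diagnostics log shared by every file in a
/// parse session.
#[derive(Debug, Default)]
pub struct SharedLog {
    pub errors: u32,
}

impl SharedLog {
    /// Absolute check: true as soon as any file in the session logged an error.
    pub fn has_errors(&self) -> bool {
        self.errors > 0
    }
}

/// Hypothetical stand-in for the per-file parser state.
pub struct FileParser {
    // Error count observed when this file's parse started.
    orig_error_count: u32,
}

impl FileParser {
    /// Captures the baseline error count at construction time.
    pub fn init(log: &SharedLog) -> Self {
        FileParser {
            orig_error_count: log.errors,
        }
    }

    /// Delta check: true only if an error was logged after `init`.
    pub fn logged_error_in_this_file(&self, log: &SharedLog) -> bool {
        log.errors > self.orig_error_count
    }
}
```

With one error already in the log from an earlier file, `has_errors()` is true immediately, whereas the delta check stays false until the current file logs an error of its own.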
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/regression/issue/jsx-template-string-crash.test.ts`:
- Line 54: The expectation calls .toString() on stderr which is already a string
(from proc.stderr.text()); remove the redundant .toString() call in the expect
line that invokes normalizeBunSnapshot so it operates directly on stderr (e.g.,
change the argument to normalizeBunSnapshot to use stderr.replace(...)) while
keeping the replace regex and toMatchInlineSnapshot intact; update the test line
around the call to normalizeBunSnapshot/expect to use stderr directly.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: aa55a4ee-1152-45d1-95ef-6d8925e5912d

📥 Commits

Reviewing files that changed from the base of the PR and between 3c9a387 and bc00e5d.

📒 Files selected for processing (2)
  • src/js_parser/p.rs
  • test/regression/issue/jsx-template-string-crash.test.ts

Comment thread test/regression/issue/jsx-template-string-crash.test.ts Outdated
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

LGTM — both nits from the previous pass are addressed (delta comparison against orig_error_count, test.concurrent), and the change is debug-only.

Extended reasoning...

Overview

Relaxes the strictly-increasing scope-location debug assertion in push_scope_for_parse_pass (src/js_parser/p.rs) so it no longer fires once the current file has logged a parse error. Adds a u32 orig_error_count field on P, initialized at P::init from the shared Log, and gates the assertion on log.errors > orig_error_count. A regression test in test/regression/issue/jsx-template-string-crash.test.ts asserts the release-mode error output for the <r L=(( repro.

Security risks

None. The entire modified block is inside if cfg!(debug_assertions); release builds are byte-for-byte unaffected. No input handling, auth, or data-exposure surface is touched.

Level of scrutiny

Low. The change only relaxes a debug assertion, and only on the error-recovery path — the worst possible regression is marginally reduced first-pass assertion coverage after an error is already logged, and any real scope mismatch would still be caught by push_scope_for_visit_pass. The new orig_error_count field is a plain u32 read once at construction; the unsafe { log.as_ref() } mirrors the existing P::log() accessor pattern in the same file and the identical read at parse_entry.rs:762.

Other factors

My two prior nits (compare against a per-parse baseline rather than has_errors(); mark the new test test.concurrent) were both applied in bc00e5d and are visible in the final diff. CodeRabbit's suggestion to move the test to an issue-numbered file was declined with a reasonable rationale (fuzzer-found, never-worked bug ≠ regression per test/CLAUDE.md) and resolved. The fix mirrors the established orig_error_count delta pattern used at the visit-pass bailout (parse_entry.rs:876, parse_entry.zig:439), and the regression test pins the expected error output.

Copy link
Copy Markdown
Collaborator

@Jarred-Sumner Jarred-Sumner left a comment

I don't think this fix is correct. Diff the control flow compared to the Zig version and figure out why it's broken here. My guess is it's related to restoring lexer state when parsing typescript fails.

Jarred pointed out (#30969) that the previous fix — skipping the
debug assertion once the file logs an error — was hiding a real bug
rather than fixing it.

The root cause is in next_inside_jsx_element's TSyntaxError branch.
On an unknown character it sets `end = current` but leaves
`code_point` holding that same byte, so `current` never advances
past it. When the caller recovers via `expect().next()`, the main
lexer's next() re-dispatches on that still-in-code_point byte, using
the updated `end` as the new `start`. step_with() then reads the
*following* byte (the second '(' in `<r L=((`) and assigns it to
end/code_point, producing a zero-length TOpenParen at `start=end=6`.
pfx_t_open_paren captures `loc=6`, calls next() once more which
advances end to 7 and re-tokenises the same '(' as a real TOpenParen
at `start=6` again, and parse_paren_expr ends up pushing two
FunctionArgs scopes at offset 6 — tripping the strict-monotonicity
debug assertion in push_scope_for_parse_pass.

Fix: call step() immediately after setting TSyntaxError so
code_point/current advance past the bad byte. The error range
`[start, end)` is unchanged (end was already set a line above), so
the "Expected { but found (" diagnostic still points at the right
source position. Subsequent `next()` now starts from the *next*
byte and emits one correct token instead of two at the same offset.

Release behaviour is identical on valid input (the TSyntaxError
branch is only taken when the JSX lexer sees a character it can't
tokenise). On the fuzzer-found input `<r L=((` debug now matches
release and emits `Expected "{" but found "("` followed by
`Unexpected end of file`.

The pre-existing `<div a=\`\`/>` regression snapshot in
jsx-template-string-crash.test.ts shifts from `Unexpected >` at
col 37 to `Unterminated string literal` at col 35 — a more accurate
diagnostic (the second backtick is what the recovery lexer treats as
a template opener, which never terminates) rather than a behavioural
regression.

Reverts the `orig_error_count` workaround on P.
@robobun
Collaborator Author

robobun commented May 18, 2026

Thanks for the nudge — you were right, the assertion workaround was masking the real bug.

Diffed the control flow step-by-step against the Zig version and the divergence isn't in snapshot/restore around is_ts_arrow_fn_jsx (that path runs and restores cleanly in the trace — next() advances through r/L, returns is_ts_arrow_fn=false, restore puts the lexer back to exactly start=0 end=1 current=2 cp=0x72 tok=TLessThan). The bug is one level down, in the TSyntaxError branch of next_inside_jsx_element:

self.end = self.current;
self.token = T::TSyntaxError;

After this the lexer reports start=5, end=6, current=6, code_point=0x28 for the first ( in <r L=((. end was set to current, but current was not advanced, so code_point still holds the bad byte. Every other branch in that switch calls step() which leaves current = end + 1 (the normal invariant after tokenising one byte). TSyntaxError is the only one that violates it.

expect(TOpenBrace) then logs the error and calls next(), here the main lexer's next(), which dispatches on code_point=0x28 again (matches ( → T::TOpenParen), sets start = end = 6, and step_with reads contents[current=6] (the second `(`), updates end = 6, current = 7, code_point = 0x28. Final state: start=6 end=6 current=7 tok=TOpenParen, a zero-length TOpenParen at offset 6. pfx_t_open_paren captures loc=6, calls next() one more time which advances end to 7 and re-tokenises the same byte as a real TOpenParen also at start=6. parse_paren_expr pushes two FunctionArgs scopes at offset 6 and the debug assertion fires.

d1a40fc fixes it at the right layer: call step() after setting TSyntaxError, so code_point/current advance past the bad byte and the error range [start, end) is unchanged (end was already advanced a line above). The recovery next() now dispatches on the following byte, emits one correct token, and the parser never sees a duplicate scope offset.

Reverted the orig_error_count workaround on P — the debug assertion is back to its strict form.

The pre-existing <div a=``/> regression snapshot in jsx-template-string-crash.test.ts shifts from Unexpected > at col 37 to Unterminated string literal at col 35. The second backtick is what the recovery lexer now treats as a template opener (it previously processed the same bad-byte position twice), and the template runs to EOF. That is arguably a more accurate diagnostic than pointing at the unrelated >, but if you'd prefer to preserve the old error I can scope the step to only fire when the caller is inside JSX-attribute recovery.

Still passes the local transpiler / jsx-bundler / jsx-template suites. Test still fails with a 5s timeout on bc00e5d~1 and passes at d1a40fcb.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
test/regression/issue/jsx-template-string-crash.test.ts (1)

59-59: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Remove redundant .toString() call.

stderr is already a string from proc.stderr.text() (line 57), so calling .toString() is unnecessary.

♻️ Proposed fix
-  expect(normalizeBunSnapshot(stderr.toString().replace(/(Bun v.*)$/gm, ""))).toMatchInlineSnapshot(`
+  expect(normalizeBunSnapshot(stderr.replace(/(Bun v.*)$/gm, ""))).toMatchInlineSnapshot(`
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/regression/issue/jsx-template-string-crash.test.ts` at line 59, The test
calls .toString() on stderr unnecessarily—stderr is already a string (from
proc.stderr.text()); update the assertion in the expect call that uses
normalizeBunSnapshot so it calls replace directly on stderr (e.g., change
normalizeBunSnapshot(stderr.toString().replace(...)) to
normalizeBunSnapshot(stderr.replace(...))) while keeping the same regex and the
normalizeBunSnapshot invocation.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3f8e0149-2119-45df-8baa-810e7e6fdfff

📥 Commits

Reviewing files that changed from the base of the PR and between d1a40fc and 72597fc.

📒 Files selected for processing (1)
  • test/regression/issue/jsx-template-string-crash.test.ts

Comment thread test/regression/issue/jsx-template-string-crash.test.ts Outdated
@robobun robobun changed the title from "js_parser: tolerate duplicate scope locations after error recovery" to "lexer: step past unknown byte on TSyntaxError in next_inside_jsx_element" May 18, 2026
stderr is already a string from proc.stderr.text(). Per coderabbit.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/regression/issue/jsx-template-string-crash.test.ts (1)

17-30: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Move exitCode assertion to the end of the first test.

expect(exitCode).toBe(1) is currently asserted before output checks, which weakens failure diagnostics compared to the house style used elsewhere in this file.

♻️ Proposed fix
-  expect(exitCode).toBe(1);
   expect(normalizeBunSnapshot(stderr.toString().replace(/(Bun v.*)$/gm, ""))).toMatchInlineSnapshot(`
@@
   `);
   expect(normalizeBunSnapshot(stdout.toString())).toMatchInlineSnapshot(`""`);
+  expect(exitCode).toBe(1);

As per coding guidelines: “When spawning processes in tests, expect stdout before expecting exit code for more useful error messages” and “Assert the exit code last in tests - this gives a more useful error message on test failure”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/regression/issue/jsx-template-string-crash.test.ts` around lines 17 -
30, The exitCode assertion is placed before the output assertions; move the line
"expect(exitCode).toBe(1)" to the end of the first test so stdout/stderr checks
run first — specifically, after the two
expect(normalizeBunSnapshot(...stderr...)).toMatchInlineSnapshot(...) and
expect(normalizeBunSnapshot(stdout.toString())).toMatchInlineSnapshot(`""`)
assertions; update the test containing those calls so exitCode is asserted last
to follow the file's house style.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 53e6e9fd-fbb7-4f18-82bd-815243fe30fa

📥 Commits

Reviewing files that changed from the base of the PR and between 72597fc and b3e0a97.

📒 Files selected for processing (1)
  • test/regression/issue/jsx-template-string-crash.test.ts

Comment thread src/js_parser/lexer.rs
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

The one-line step() fix is well-reasoned and all prior feedback is addressed, but since this changes lexer error-recovery behavior (visible in the shifted <div a=``/> snapshot you flagged as a judgment call) and Build #55696 reported failures before the retrigger, I'd like a human to sign off.

Extended reasoning...

Overview

This PR adds a single self.step() call (plus an explanatory comment) to the TSyntaxError fallthrough arm of next_inside_jsx_element in src/js_parser/lexer.rs, fixing a debug-only panic where the lexer failed to advance past an invalid byte and the parser subsequently pushed two scopes at the same offset. The test file gains a new #30959 regression case and updates the existing <div a=``/> snapshot, whose diagnostic shifted from "Unexpected >" at col 37 to "Unterminated string literal" at col 35.

Security risks

None. This is an error-recovery path in the JSX lexer that only fires on already-malformed input; the change advances a cursor by one codepoint. No auth, crypto, permissions, or untrusted-data handling is involved.

Level of scrutiny

Moderate-to-high. The Rust JS lexer is a performance-critical, correctness-critical state machine that every parsed file flows through. Although the change is one line on an error path and release builds were already unaffected, it produces a user-visible behavioral shift in error diagnostics for a different malformed input (the pre-existing backtick test). The author explicitly flagged this as a judgment call ("if you'd prefer to preserve the old error I can scope the step to only fire when the caller is inside JSX-attribute recovery") — that's exactly the kind of design tradeoff a human maintainer should weigh in on rather than a bot.

Other factors

  • All three of my prior inline nits were addressed (orig_error_count → reverted in favor of the proper lexer fix; test.concurrent; stale comment wording). My remaining 05:47 comment about the parallel pattern at lexer.rs:2234 is explicitly pre-existing/non-blocking.
  • The PR went through a significant approach change mid-review (assertion-relaxation → root-cause lexer fix), which is the right outcome but means the final shape hasn't had human eyes on it.
  • robobun reported failures in Build #55696 against b3e0a97; the most recent commit (0aad3ff) is a CI retrigger whose result I can't see. I'd want green CI confirmed before this lands.
  • No CODEOWNERS entry covers src/js_parser/.
  • I verified step() → next_codepoint_with() sets end = current before advancing, so the new call leaves the error range [start, end) unchanged as the comment claims, including at EOF and for multibyte codepoints.
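
The last point, that stepping after the error leaves [start, end) untouched even for multibyte code points and at EOF, can be checked on a small model. This is a sketch under assumptions: `Cursor` and `error_range` are hypothetical names, and the only behaviour taken from the review is that the step sets end to the old current before advancing.

```rust
// Hypothetical cursor over UTF-8 text; offsets are byte offsets.
pub struct Cursor<'a> {
    src: &'a str,
    pub end: usize,
    pub current: usize,
    pub code_point: Option<char>, // None models end of file
}

impl<'a> Cursor<'a> {
    /// Positions the cursor with the code point at `index` pending, so that
    /// `current` already points past it. `index` must be a char boundary.
    pub fn at(src: &'a str, index: usize) -> Self {
        let cp = src[index..].chars().next();
        Cursor {
            src,
            end: index,
            current: index + cp.map_or(0, char::len_utf8),
            code_point: cp,
        }
    }

    /// Sets end to the old current first, then loads and skips one code point.
    pub fn step(&mut self) {
        self.end = self.current;
        self.code_point = self.src[self.current..].chars().next();
        self.current += self.code_point.map_or(0, char::len_utf8);
    }
}

/// Models the error branch at `index`: end is moved to current, and the
/// optional extra step is applied. Returns the resulting error range.
pub fn error_range(src: &str, index: usize, step_after: bool) -> (usize, usize) {
    let mut c = Cursor::at(src, index);
    let start = c.end;
    c.end = c.current; // error range now covers the whole bad code point
    if step_after {
        c.step();
    }
    (start, c.end)
}
```

Because the step writes the same value into end that the error branch just stored, the returned range is identical with and without the step in this model, whether the bad code point is one byte or several and whether or not anything follows it.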

@robobun
Collaborator Author

robobun commented May 18, 2026

CI status: diff is green on all lanes that exercise the parser (debian-13 x64/aarch64/asan/baseline, ubuntu-25-aarch64, all musl test lanes, all windows/darwin build lanes, the verify-baseline x64-musl + windows variants). The two still-red lanes are unrelated to the JSX lexer:

  • linux-x64-baseline-verify-baseline (build #55696): could not download file from https://static.rust-lang.org/dist/.../channel-rust-nightly.toml — Rust toolchain CDN hiccup during cargo-build of the verifier binary. The musl and windows variants of the same step passed on the same build.
  • darwin-14-x64-test-bun (build #55699, post-retrigger): test/js/node/http/node-http-backpressure-max.test.ts and test/js/bun/test/expect-assertions.test.ts — HTTP backpressure test timed out 4x at 60s, bun:test expect-assertions counting. The log also shows ERROR: Cannot connect to the Docker daemon at unix:///var/run/docker.sock on the same host. Neither test goes through the JSX lexer path I changed.

I used my one re-roll on the Rust-CDN failure; per the farm policy I should not push another ci: retrigger. Needs a maintainer to kick the darwin-14 lane or merge once it settles.

Mirrors d1a40fc at the parallel site in the main `next()`
fallthrough (lexer.rs:2234-2235). Per Jarred on PR #30969 — the same
pattern lived at both sites and the Rust port diverged from the Zig
sibling at one of them, leaving the main lexer with `current == end`
after the error instead of the `current > end` invariant every other
branch maintains.

The concrete duplicate-scope panic the JSX site triggered isn't
currently reachable from the main lexer (a byte that falls through to
this arm is invalid in main-lexer context too, so recovery stays in
`TSyntaxError` rather than manufacturing a real token), but keeping
the invariant consistent across both dispatch tables means future
recovery paths don't have to reason about one arm that leaves the
lexer half-advanced.
@robobun
Collaborator Author

robobun commented May 18, 2026

CI status update: all parser-exercising lanes green (72/73 pass). The one red lane on build #55717 is darwin-14-aarch64-test-bun with state broken / Expired — the Buildkite agent never picked up the job (no log at the job URL, exit_status=None). Infrastructure, not a test failure; the aarch64 test matrix is also covered by debian-13-aarch64-test-bun and ubuntu-25-04-aarch64-test-bun, both of which passed.

I already spent my re-roll on the previous Rust-CDN hiccup + darwin-14-x64 timeout; per the farm policy I'm not pushing another ci: retrigger. Needs a maintainer to kick the darwin-14-aarch64 lane or merge once it settles.

@Jarred-Sumner Jarred-Sumner merged commit 4c8a21b into main May 18, 2026
77 checks passed
@Jarred-Sumner Jarred-Sumner deleted the farm/0ae593db/fix-30959-scope-loc-assert branch May 18, 2026 08:48
Jarred-Sumner pushed a commit that referenced this pull request May 18, 2026
…ent (#30969)

Fixes #30959.

## Repro

```console
$ echo '<r L=((' > /tmp/bug.tsx
$ bun-debug build /tmp/bug.tsx

panic: Scope location must be greater than previous
6 must be greater than 6 (src/bun_core/output.rs:1053:9)
```

Release builds print the two parse errors and recover cleanly; only
debug builds trip the assertion.

## Root cause

In `next_inside_jsx_element` the `TSyntaxError` branch sets `end =
current` but never advances `current` / `code_point` past the bad byte —
every *other* branch in that switch calls `step()` which leaves `current
= end + 1`, the invariant the rest of `next()` depends on. The
`TSyntaxError` branch violates it.

For `<r L=((` the lexer reports `start=5, end=6, current=6,
code_point=0x28` (= the first `(`) after tokenising it as
`TSyntaxError`. The subsequent `p.lexer.expect(T::TOpenBrace)` in
`parse_jsx_prop_value_identifier`:

1. logs `Expected "{" but found "("` via `add_range_error` (returns
`Ok`),
2. calls `self.next()`, which re-dispatches on the un-stepped
`code_point=0x28`, matches `(` → `TOpenParen`, and `step_with` reads
`contents[current=6]` (= the **second** `(`), updates `end = 6`,
`current = 7`, `code_point = 0x28`. Final state: `start=6 end=6
current=7 tok=TOpenParen` — a **zero-length** `TOpenParen` at offset 6.

`pfx_t_open_paren` then captures `loc=6`, calls `next()` one more time
which advances `end` to 7 and re-tokenises the same `(` as a real
`TOpenParen` also at `start=6`. `parse_paren_expr` pushes two
`FunctionArgs` scopes at offset 6 and the debug assertion in
`push_scope_for_parse_pass` fires.

## Fix

Call `self.step()` immediately after setting `TSyntaxError` in
`next_inside_jsx_element`. `end` was already advanced on the line above,
so the error range `[start, end)` is unchanged and the `Expected "{" but
found "("` diagnostic still points at the right source position.
Subsequent `next()` dispatches on the *next* byte and emits one correct
token, so the parser never sees a duplicate scope offset and the
`push_scope_for_parse_pass` assertion keeps its original strict form.

## Verification

`test/regression/issue/jsx-template-string-crash.test.ts` — new `#30959`
case asserts that the release-mode error output (`Expected "{" but found
"("` + `Unexpected }`) is produced in debug mode too. Fails with a 5s
timeout on main (the panic hangs the process); passes with the fix.

The existing `JSX lexer should not crash with slice bounds issues`
regression snapshot in the same file shifts from `Unexpected >` at col
37 to `Unterminated string literal` at col 35 for `<div a=``/>`. That
shift is behavioural: the recovery lexer no longer re-dispatches on the
first backtick, so the second backtick is now what gets tokenised as the
template opener (which then runs to EOF — "unterminated"). The new
diagnostic points at the actual unterminated string rather than the
later `>`; both are valid parse errors for the same malformed input.

Transpiler / bundler JSX suites
(`test/bundler/transpiler/transpiler.test.js`,
`test/bundler/bundler_jsx.test.ts`) still pass.

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Invalid parentheses expression leads to debug assertion on scope location

2 participants