perf: avoid throw-as-control-flow in SessionManager hot path (#7756) by JohnMcLear · Pull Request #7775 · ether/etherpad

JohnMcLear · 2026-05-16T07:48:57Z

Summary

CPU profile of the SUT at the 100→400 author dive sweep (load-test workflow run 25956384097) attributed ~6% of total process CPU to the throw + catch around getSessionInfo:

~1.82% to new CustomError('sessionID does not exist', 'apierror') (stack-trace capture)
~4.12% downstream, via the catch block's console.debug(...) routed through log4js → sendToListeners → sendLogEventToAppender

Both internal callers (findAuthorID on every CLIENT_READY, listSessionsWithDBKey on session listing) caught apierror and discarded it. The public exports.getSessionInfo still has to throw for the HTTP API (returns code: 1 for missing sessionID), so this PR introduces a private getSessionInfoOrNull helper that returns null and switches the two hot-path callers to it. exports.getSessionInfo becomes a thin wrapper that preserves the throw semantics.

Profile evidence

Inverted callers of CustomError constructor in the profile:

 1.82% exports.getSessionInfo [src/node/db/SessionManager.ts]

Inverted (non-log4js) callers of log4js Logger.<computed>:

 4.12% exports.checkAccess [src/node/db/SecurityManager.ts]

(checkAccess is the entry point that drives findAuthorID → getSessionInfo → console.debug.)

Behavior

No behavior change for the HTTP API. getSessionInfo still throws `apierror` when the session doesn't exist; RestAPI.ts/APIHandler.ts translate that to code: 1.
The two internal callers preserve their existing semantics: findAuthorID returns undefined for unknown sessions, listSessionsWithDBKey continues to log a warning and set sessions[sessionID] = null.

Test plan

`pnpm exec mocha tests/backend/specs/api/sessionsAndGroups.ts` — all 32 cases pass, including "getSessionInfo of deleted session" (still expects code:1).
Re-run the dive sweep with this branch as core_ref and confirm steady-state CPU% drops at the cliff (300-400 authors).

Part of #7756. Profile capture pipeline is in etherpad-load-test#109/110/111.

CPU profile of the SUT at the 100-400 author dive sweep (load-test workflow run 25956384097) attributed about 6% of total process CPU to the throw + catch around getSessionInfo: - ~1.82% to `new CustomError('sessionID does not exist', 'apierror')` construction (stack trace capture) - ~4.12% downstream, via the catch block's `console.debug(...)` routed through log4js -> sendToListeners -> sendLogEventToAppender Both call sites (`findAuthorID` on every CLIENT_READY, and `listSessionsWithDBKey` on session listing) immediately caught `apierror` and discarded it. The public `exports.getSessionInfo` contract still has to throw for the HTTP API (returning code:1 for missing sessionID), so introduce a private `getSessionInfoOrNull` helper that returns null and have the hot-path callers use it directly. `exports.getSessionInfo` is kept as a thin wrapper that preserves the existing throw semantics. No behaviour change for the HTTP API — sessionsAndGroups.ts test file (32 cases, including "getSessionInfo of deleted session" expecting code:1) passes unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

qodo-code-review · 2026-05-16T07:49:01Z

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

qodo-free-for-open-source-projects · 2026-05-16T07:49:10Z

Review Summary by Qodo

Avoid throw-as-control-flow in SessionManager hot path

✨ Enhancement

Walkthroughs

Description

• Eliminate throw-as-control-flow in hot-path callers of getSessionInfo
• Introduce private getSessionInfoOrNull helper returning null
• Refactor findAuthorID and listSessionsWithDBKey to use null checks
• Reduce CPU overhead by ~6% (1.8% CustomError construction + 4% logging)

Diagram

flowchart LR
  A["getSessionInfoOrNull<br/>returns null"] --> B["findAuthorID<br/>null check"]
  A --> C["listSessionsWithDBKey<br/>null check"]
  D["exports.getSessionInfo<br/>wrapper"] --> E["throws apierror<br/>for HTTP API"]
  A --> D

File Changes

1. src/node/db/SessionManager.ts Performance optimization +20/-22

Replace throw-catch with null-check pattern

• Introduce private getSessionInfoOrNull helper that returns null instead of throwing
• Refactor findAuthorID to use null-returning helper with simple || check
• Refactor listSessionsWithDBKey to use null-returning helper with explicit null check
• Keep exports.getSessionInfo as thin wrapper preserving throw semantics for HTTP API

src/node/db/SessionManager.ts

qodo-free-for-open-source-projects · 2026-05-16T07:49:11Z

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0)

Great, no issues found!

Qodo reviewed your code and found no material issues that require review

CPU profile of develop at the 100-400 author dive sweep (load-test run 25956384097) identified a ~6% process-CPU win in SessionManager: throw-as-control-flow on every CLIENT_READY session lookup. Add lever 9 section with the profile evidence, link the open PR (#7775), and add a "Other CPU hotspots surfaced" subsection documenting findings not yet acted on (Changeset internals, appendRevision, ueberdb/dirty backing as test-harness artifact, esbuild __name overhead). Update Recommendation to include #7775 as the highest-priority merge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the "score pending" placeholder under lever 9 with the actual numbers from runs 25957107195/25957108328/25957109418 (perf branch) vs 25954537767/25954538807/25954540108 (develop), both at authors=100..500:step=50:dwell=8s:warmup=2s. Result: consistent -1.4% to -5.3% CPU reduction across all 9 steps, matching profile direction at 2-5% (vs 6% profile-attributed upper bound). Latency delta sits inside the noise envelope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear · 2026-05-16T08:34:47Z

N=3 measured impact

Three perf-branch runs (25957107195, 25957108328, 25957109418) compared against three develop baselines (25954537767, 25954538807, 25954540108) — same authors=100..500:step=50:dwell=8s:warmup=2s sweep on each.

Medians:

step	dev CPU%	perf CPU%	ΔCPU%
100	4.76	4.67	-1.7%
150	9.09	8.95	-1.5%
200	15.21	14.60	-4.0%
250	21.51	21.32	-0.9%
300	30.46	29.68	-2.6%
350	41.58	39.36	-5.3%
400	56.26	54.23	-3.6%
450	72.33	70.49	-2.5%
500	88.38	87.14	-1.4%

ΔCPU% is consistently negative (-1.4% to -5.3%) across all 9 steps. The realised magnitude is below the profile-attributed 6% upper bound because some of the log4js cost the profile attributed to the throw path was unrelated info logging — but the direction is exactly what the profile predicted.

Latency delta sits inside the noise envelope across all steps (raw p95 triples overlap heavily run-to-run).

Three combined-branch runs (perf/dive-combined = #7776 cherry-picked onto #7775 base; runs 25960003164/25960004223/25960005248) vs the same three develop baselines: -12% to -20% CPU% across all 9 sweep steps, with the p95 cliff effectively moving from ~400 to ~500 authors (at step 400, two of three combined runs land below the cliff at 45ms and 112ms p95 vs develop [1758, 2275, 2463]). Adds: - Lever 10 section for #7776 with its own N=3 numbers (-3.6 to -8.9% alone). - "Stacking" section showing super-additive interaction. - Local vCPU experiment showing the cliff is single-event-loop-bound, not total-CPU-bound: 4-core and 8-core pinned SUTs hit the same cliff at the same step. - Updated TL;DR, Recommendation order (merge both #7775+#7776 first), and "Where to take this next" with worker-thread offload as the smallest next architectural step. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear · 2026-05-16T11:04:39Z

#7775 + #7776 stacked — N=3 combined measurement

I cherry-picked #7775 onto #7776 (branch perf/dive-combined) and ran 3 cliff sweeps (25960003164, 25960004223, 25960005248) against the same 3 develop baselines used for the individual PR scoring. The two fixes stack super-additively:

step	dev CPU%	#7775 alone	#7776 alone	both	Δboth
100	4.76	4.67	4.51	3.99	-16.1%
200	15.21	14.60	14.33	12.48	-17.9%
300	30.46	29.68	28.50	24.39	-19.9%
350	41.58	39.36	37.87	33.04	-20.5%
400	56.26	54.23	53.67	44.78	-20.4%
450	72.33	70.49	68.80	61.18	-15.4%
500	88.38	87.14	85.17	77.70	-12.1%

The stacked impact (-12% to -20% CPU%) is well above the sum of the individual gains. Both fixes remove call sites feeding the same log4js cluster-mode dispatch chain (sendToListeners → sendLogEventToAppender); halving the LogEvent allocation rate appears to relieve queue/GC pressure beyond what either fix accounts for in isolation.

Latency: cliff has effectively moved past step 400. Raw p95 triples:

step	develop p95 [3 runs]	combined p95 [3 runs]
400	[1758, 2275, 2463]	[45, 112, 634]
450	[5415, 6167, 6611]	[3297, 3719, 3897] (-40%)
500	[10655, 11759, 12183]	[8091, 8711, 9127] (-26%)

At step 400, two of three combined runs land below the cliff entirely. Recommend landing these two together to capture the full effect.

Post-#7775/#7776 profile shows applyToAText splits cleanly: - applyToText (Changeset.ts:404) is pure (cs, text) -> text; trivially offloadable to a worker via worker_threads structured-clone postMessage. - applyToAttribution (Changeset.ts:684) mutates AttributePool; not trivially offloadable. Document the obvious first-pass design (run them in parallel via Promise.all inside applyToAText) and the realistic estimate (~6-8% CPU moved off the main event loop). putAttrib is only 0.26% in the post-fix profile, confirming the bulk of applyToAText's cost is in the string-manipulation half. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add a "Roadmap for future effort" section ahead of Reproducing, ranking the next concrete options by impact-per-time-spent. Tier 1 (mechanical / <1 day each): - merge ready perf PRs (#7775+#7776+#7774) - implement #7780 room-broadcast fan-out - additional post-fix profile pass Tier 2 (medium, real cliff moves): - selective fan-out / viewport-based broadcast (~2 weeks; cliff ~500 → 1000-1500) - per-pad worker isolation PoC (~1-2 weeks PoC, 1-2 months prod) Tier 3 (large bets): - sticky-session cluster mode (~2-4 weeks PoC) - CRDT migration (months; anti-recommended) Tier 4 (operational): - production telemetry hookup (~3-5 days) - nightly dive in CI (~1 day) Records the recommended sequence (Tier 1.2 → Tier 2.4) so the next person picking this up doesn't need to re-derive it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

JohnMcLear mentioned this pull request May 16, 2026

perf: don't log settings.loadTest warning per-message (#7756) #7776

Draft

3 tasks

JohnMcLear mentioned this pull request May 16, 2026

Performance: [#7756 follow-up] Room-broadcast NEW_CHANGES fan-out to drop ~5-7% per-recipient packet construction #7780

Open

JohnMcLear marked this pull request as draft May 18, 2026 12:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: avoid throw-as-control-flow in SessionManager hot path (#7756)#7775

perf: avoid throw-as-control-flow in SessionManager hot path (#7756)#7775
JohnMcLear wants to merge 1 commit into
developfrom
perf/session-getinfo-no-throw

JohnMcLear commented May 16, 2026

Uh oh!

qodo-code-review Bot commented May 16, 2026

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026 •

edited

Loading

Uh oh!

JohnMcLear commented May 16, 2026

Uh oh!

JohnMcLear commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

JohnMcLear commented May 16, 2026

Summary

Profile evidence

Behavior

Test plan

Uh oh!

qodo-code-review Bot commented May 16, 2026

Qodo reviews are paused for this user.

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026

Review Summary by Qodo

Walkthroughs

File Changes

Uh oh!

qodo-free-for-open-source-projects Bot commented May 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Great, no issues found!

Uh oh!

JohnMcLear commented May 16, 2026

N=3 measured impact

Uh oh!

JohnMcLear commented May 16, 2026

#7775 + #7776 stacked — N=3 combined measurement

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

qodo-free-for-open-source-projects Bot commented May 16, 2026 •

edited

Loading