fix: temporary workaround for gnark-crypto AVX-512 innerProdVec SIGSEGV by gusiri · Pull Request #3130 · Consensys/linea-monorepo

gusiri · 2026-05-18T03:43:42Z

This PR
Temporary prover-side workaround for an upstream bug in gnark-crypto's AVX-512 assembly that causes SIGSEGV during aggregation proof generation.
On the caller side, we ensure cap > len on vectors before they reach InnerProduct, so the 4-byte overread lands in valid slack memory. This stays in place only until the upstream fix is available.
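A minimal sketch of the caller-side idea, not the code in this PR: `withSlack` is a hypothetical helper name, and `Element` is a local stand-in for gnark-crypto's `fr.Element`. It only illustrates what "ensure cap > len" can look like.

```go
package main

import "fmt"

// Element is a stand-in for fr.Element: 4 x 64-bit limbs, 32 bytes.
type Element [4]uint64

// withSlack (hypothetical name) returns v unchanged when it already has spare
// capacity. Otherwise it copies v into a backing array with room for one more
// element, so a short read past the last element stays inside the allocation.
func withSlack(v []Element) []Element {
	if cap(v) > len(v) {
		return v
	}
	padded := make([]Element, len(v), len(v)+1)
	copy(padded, v)
	return padded
}

func main() {
	v := make([]Element, 128) // cap == len, no guaranteed slack
	p := withSlack(v)
	fmt.Println(len(p), cap(p) > len(p)) // prints "128 true"
}
```

The copy costs one extra allocation per call in the cap == len case, which is the trade-off a temporary workaround of this kind accepts.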

Bug: AVX-512 innerProdVec reads 4 bytes past allocation boundary → SIGSEGV

Summary
fr.Vector.InnerProduct on amd64 with AVX-512 reads 4 bytes past the last element of the input slices. This causes SIGSEGV when the allocation ends at a page boundary.

Environment
gnark-crypto v0.20.2-0.20260402204920-39238e584b99
Go 1.24.x, Linux amd64 with AVX-512

Affected code
field/asm/element_4w/element_4w_amd64.s, lines ~735–758 (innerProdVec loop body):

#define MAC(in0, in1, in2) \
    VPMULUDQ.BCST in0, Z4, Z2  \
    ...

    MAC(0(R13), Z16, Z24)
    MAC(4(R13), Z17, Z25)
    ...
    MAC(24(R13), Z22, Z30)
    MAC(28(R13), Z23, Z31)   // ← 8-byte load at offset 28 of a 32-byte element
    ADDQ $32, R13

MAC(28(R13), ...) expands to VPMULUDQ.BCST 28(R13), Z4, Z2, which loads 8 bytes from R13+28, i.e. bytes [28..35]. Each element is 32 bytes ([0..31]), so bytes [32..35] are past the element boundary.

For elements in the middle of the array, this harmlessly reads the first 4 bytes of the next element. For the last element, those 4 bytes are past the end of the allocation.
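The arithmetic above can be checked with a small sketch. The constants mirror the numbers quoted in this description (32-byte elements, an 8-byte load at offset 28); `overreadBytes` is an illustrative name, not a gnark-crypto function.

```go
package main

import "fmt"

const (
	elemSize   = 32 // bytes per 4-limb element
	lastOffset = 28 // offset used by the final MAC in the loop body
	loadWidth  = 8  // bytes read by that load, per the analysis above
)

// overreadBytes reports how many bytes the final load for element i reads
// beyond the end of an n-element array.
func overreadBytes(i, n uint64) uint64 {
	loadEnd := i*elemSize + lastOffset + loadWidth // exclusive end of the load
	arrayEnd := n * elemSize                       // exclusive end of the array
	if loadEnd > arrayEnd {
		return loadEnd - arrayEnd
	}
	return 0
}

func main() {
	const n = 4
	for i := uint64(0); i < n; i++ {
		fmt.Printf("element %d: %d bytes past the array\n", i, overreadBytes(i, n))
	}
}
```

Only the last index reports a non-zero value (4); every other element's overread falls inside its successor.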

Calling code (no capacity guarantee)
vector_amd64.go:

func (vector *Vector) InnerProduct(other Vector) (res Element) {
    ...
    innerProdVec(&res[0], &(*vector)[0], &other[0], uint64(len(*vector)))
    return
}

Raw slice element pointers are passed directly. make([]Element, n) returns a slice with cap == len, and the underlying allocation may have no slack bytes after the last element.

Crash conditions
The crash requires the 4-byte overread to cross a virtual memory page boundary. This happens when:

- n × 32 is a multiple of the page size (4096 bytes), i.e. n is a multiple of 128
- The allocation base is page-aligned (common for large mmap'd allocations)

In our case: n = 262144, allocation = 8 MiB = exactly 2048 pages.

base            = 0xeb73c00000
last element    = base + (262144-1) × 32 = 0xeb743FFFE0
MAC(28) loads   = 0xeb743FFFFC .. 0xeb74400003  ← crosses page at 0xeb74400000
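As a sanity check on the addresses above, the following sketch recomputes them. `overreadCrossesPage` is an illustrative name, and the 4096-byte page size is the one assumed in this description. Crossing a page boundary is necessary but not sufficient for a fault: the next page must also be unmapped or inaccessible.

```go
package main

import "fmt"

const (
	elemSize = 32   // bytes per element
	overread = 4    // bytes read past the last element
	pageSize = 4096 // page size assumed in the description
)

// overreadCrossesPage reports whether the 4-byte overread after an n-element
// vector at base touches a different page than the vector's last valid byte.
// When it does, next is the first address on that following page.
func overreadCrossesPage(base, n uint64) (next uint64, crosses bool) {
	end := base + n*elemSize        // first byte past the allocation
	lastValidPage := (end - 1) / pageSize
	lastReadPage := (end + overread - 1) / pageSize
	if lastReadPage == lastValidPage {
		return 0, false
	}
	return lastReadPage * pageSize, true
}

func main() {
	next, crosses := overreadCrossesPage(0xeb73c00000, 262144)
	fmt.Printf("crosses=%v next page=%#x\n", crosses, next)
}
```

For the values in this report the computed page start matches the fault address shown in the crash signature.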

Crash signature

unexpected fault address 0xeb74400000
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0xeb74400000 pc=0x...]

goroutine ... [running]:
runtime: ...
github.com/consensys/gnark-crypto/ecc/bls12-377/fr.innerProdVec(...)
github.com/consensys/gnark-crypto/ecc/bls12-377/fr.(*Vector).InnerProduct(...)

Why tests don't catch it
gnark-crypto's test vectors are small. Go's allocator provides slack capacity for small allocations, so the overread lands in valid memory. The bug only manifests with large, page-aligned allocations.
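One way a test could expose this class of bug deterministically is to place a vector directly in front of an inaccessible guard page. The sketch below is an assumption about how such a harness might look, not something taken from gnark-crypto or this PR. It is Unix-only (it relies on `syscall.Mmap` and `syscall.Mprotect`), `guardedVector` and `endsAtPageBoundary` are hypothetical names, and `Element` is a local stand-in for `fr.Element`.

```go
package main

import (
	"fmt"
	"syscall"
	"unsafe"
)

// Element is a stand-in for fr.Element: 4 x 64-bit limbs, 32 bytes.
type Element [4]uint64

// guardedVector maps enough pages for n elements (n > 0) plus one trailing
// guard page, makes the guard page inaccessible, and returns a slice whose
// last element ends exactly where the guard page begins. Any read past the
// slice then faults immediately. Call release to unmap the memory.
func guardedVector(n int) (v []Element, release func() error, err error) {
	pageSize := syscall.Getpagesize()
	size := n * int(unsafe.Sizeof(Element{}))
	dataLen := (size + pageSize - 1) / pageSize * pageSize // round up to pages
	mem, err := syscall.Mmap(-1, 0, dataLen+pageSize,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, nil, err
	}
	if err := syscall.Mprotect(mem[dataLen:], syscall.PROT_NONE); err != nil {
		_ = syscall.Munmap(mem)
		return nil, nil, err
	}
	// Right-align the vector so it ends at the guard page.
	start := unsafe.Pointer(&mem[dataLen-size])
	v = unsafe.Slice((*Element)(start), n)
	return v, func() error { return syscall.Munmap(mem) }, nil
}

// endsAtPageBoundary reports whether the byte after v's last element starts a page.
func endsAtPageBoundary(v []Element) bool {
	end := uintptr(unsafe.Pointer(&v[len(v)-1])) + unsafe.Sizeof(v[0])
	return end%uintptr(syscall.Getpagesize()) == 0
}

func main() {
	v, release, err := guardedVector(100)
	if err != nil {
		panic(err)
	}
	defer release()
	v[len(v)-1][3] = 1 // the vector itself is readable and writable
	fmt.Println(len(v), endsAtPageBoundary(v))
}
```

Passing such a slice to the routine under test would turn a silent overread into a reproducible fault even for small n, independent of how the Go allocator happens to lay out memory.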

Checklist

I wrote new tests for my new core changes.
I have successfully run tests, style checker and build against my new changes locally.
If this change is deployed to any environment (including Devnet), E2E test coverage exists or is included in this PR.
I have informed the team of any breaking changes if there are any.

Note

High Risk
High risk because it changes core proving/compilation concurrency (new on-the-fly limitless prover path, parallel module build/compile, and ProverRuntime locking semantics) and adds low-level memory workarounds around field-vector operations.

Overview
Adds a new in-memory “on-the-fly” limitless prover (ProveOnTheFly) that skips serialized disk assets, overlaps compilation with bootstrapper proving, pipelines GL/LPP proving into hierarchical conglomeration, and releases compiled circuits early via a usage tracker.

Removes the bespoke JSONL perf_log instrumentation and makes proving concurrency configurable via env vars (e.g. LIMITLESS_SUBPROVER_JOBS), while also refactoring conglomeration/GL/LPP entrypoints to drop the perf logger plumbing.

Implements a temporary crash workaround for a gnark-crypto AVX-512 InnerProduct overread by ensuring vectors have cap > len (padding in ScalarProd and ParBatchInvert).

Speeds up and parallelizes several hot paths: parallel module build/segment compilation with optional debug-module creation, background precompilation of conglomeration, cached Plonk-in-wizard constraint counts, parallelized Vortex column extraction/Merkle proofs, reduced allocations in quotient evaluation, and cached VK columns in the gnark verifier. Also switches wizard.ProverRuntime from Mutex to RWMutex and reduces time spent holding the lock during AssignColumn.

Updates dependencies (gnark-crypto, go-corset, x/sync, x/sys) and tweaks the mainnet limitless config paths.

Reviewed by Cursor Bugbot for commit 201b28b. Bugbot is set up for automated code reviews on this repo.

The innerProdVec assembly in gnark-crypto uses VPMULUDQ.BCST at byte offset 28 of each 32-byte element, performing an 8-byte load that reads 4 bytes past the element boundary. On the last element this overreads the allocation, causing SIGSEGV when page-aligned (e.g. n=262144).

- ScalarProd: copy receiver with extra capacity when cap==len
- ParBatchInvert: allocate result with cap=len+1

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor Bot reviewed May 18, 2026

Comment thread prover/config/config-mainnet-limitless.toml

gusiri changed the base branch from main to prod-base May 18, 2026 04:15

This was referenced May 18, 2026

fix(amd64): avoid 4-byte overread in innerProdVec AVX-512 path Consensys/gnark-crypto#841

Merged

chore: update to latest gnark and gnark-crypto #3142

Merged

gbotrel closed this in #3142 May 18, 2026

gusiri wants to merge 1 commit into prod-base from bugfix/gnark-crypto-avx512-innerproduct-overread
