Skip to content
Draft
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 14 additions & 1 deletion execution_chain/beacon/api_handler/api_forkchoice.nim
Original file line number Diff line number Diff line change
Expand Up @@ -199,13 +199,26 @@ proc forkchoiceUpdated*(ben: BeaconEngineRef,
raise invalidForkChoiceState("safe block not in canonical tree")
# similar to headHash, safeBlockHash is saved by FC module

(await chain.queueForkChoice(headHash, finalizedBlockHash, safeBlockHash)).isOkOr:
# Pre-validate that the forkchoice update will succeed in the queue worker.
# We do this synchronously so we can surface a per-call error here; once the
# worker has been kicked off (below), errors can no longer be reported back
# to this fCU response.
chain.preflightForkChoice(headHash, finalizedBlockHash, safeBlockHash).isOkOr:
return invalidFCU(error, chain, header)

# Enqueue the actual forkchoice apply. For ack-only fCUs we don't await
# the result. This avoids the response getting blocked behind the shared queue worker
Comment thread
advaita-saha marked this conversation as resolved.
let fcuFut = chain.queueForkChoice(headHash, finalizedBlockHash, safeBlockHash)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for the queueForkChoice background job to fail in the background? Or after running preflightForkChoice and returning a success to the caller , should the job always succeed?

Copy link
Copy Markdown
Contributor Author

@advaita-saha advaita-saha May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically it is not possible to fail.
The only possibility is if the db is corrupt, or somehow the db connection is broken ( or other db level failures )

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it might be possible for it to fail when processing multiple heads. The removeBlockFromCache function can remove blocks from the hashToBlock table which can cause forkChoice to fail. removeBlockFromCache gets called from two places updateBase and updateFinalized. If those functions get called asynchronously before the forkChoice task runs then you get a silent failure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the queueForkChoice just adds the fCU call to the asyncQueue. So nothing else calls it async. And the queue is processed one by one

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removeBlockFromCache gets called from two places updateBase and updateFinalized. If those functions get called asynchronously before the forkChoice task runs then you get a silent failure.

I think not possible in the current design with asyncQueue implementation

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure. Maybe its ok because the single queue enforces the ordering as long as there are no await calls which yield to other tasks before the forkChoice task is added to the queue.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jangko any thoughts on this ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not sure. Maybe its ok because the single queue enforces the ordering as long as there are no await calls which yield to other tasks before the forkChoice task is added to the queue.

Since the queueForkChoice is a template and contains an await call (which is good for rate limiting purposes), then the await will yield the event loop meaning something else can run concurrently perhaps before the forkChoice task is added to the queue. Maybe this is fine as long as no other code calls updateBase or updateFinalized outside of the task queue flow.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is true.
For now by design updateBase & updateFinalized is not called outside the queue

Will add some comments on this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is nothing logging the result of the forkChoice call. Even if we don't expect any failures we should probably log something. Perhaps the queueForkChoice asyncHandler could log the result of forkChoice.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely yeah, will add some logs. Will help identify issues


# If payload generation was requested, create a new block to be potentially
# sealed by the beacon client. The payload will be requested later, and we
# might replace it arbitrarilly many times in between.
if attrsOpt.isSome:
# Payload generation reads `chain.latestHeader`, so we must wait for the
# apply to complete before assembling the block.
(await fcuFut).isOkOr:
return invalidFCU(error, chain, header)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we end up awaiting here for the block building it might be better to not use the async queue flow at all if attrsOpt.isSome. Block building would be faster in that case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that PR is also ready on top of this. Just testing and ironing out few issues on that
( little difficult to test as I don't have much validators, so very less proposals to test )


let attrs = attrsOpt.value
validateVersion(attrs, com, apiVersion)

Expand Down
45 changes: 44 additions & 1 deletion execution_chain/core/chain/forked_chain.nim
Original file line number Diff line number Diff line change
Expand Up @@ -743,7 +743,10 @@ proc forkChoice*(c: ForkedChainRef,
if safe.isOk:
c.fcuSafe.number = safe.number
c.fcuSafe.hash = safeHash
?safe.txFrame.fcuSafe(c.fcuSafe)
# Use `.expect(...)` like the sibling fcuHead/fcuFinalized writes so a
# DB-level failure here is a hard fault rather than a recoverable err -
# the RPC handler may have already acked the fCU before we run.
safe.txFrame.fcuSafe(c.fcuSafe).expect("fcuSafe OK")

if headHash == c.latest.hash:
if finalizedHash == zeroHash32:
Expand Down Expand Up @@ -813,6 +816,46 @@ template queueForkChoice*(c: ForkedChainRef,
await c.queue.addLast(item)
item.responseFut

func preflightForkChoice*(c: ForkedChainRef,
headHash: Hash32,
finalizedHash: Hash32,
safeHash: Hash32 = zeroHash32):
Result[void, string] =
## Pre-validate a forkchoice update against the current in-memory chain,
## mirroring the failure paths inside `forkChoice()`. Run this from the RPC
## handler before enqueuing so we don't ack a request that the queue worker
## would later reject (since the ack has already been sent by then).
##
## Invariant: between this check and the worker actually running our
## enqueued item, the only mutators of `c.heads`/`hashToBlock` are
## previously-queued `validateBlock` (adds heads, never removes the
## validated one) and `updateFinalized` (only prunes branches that don't
## contain the finalized block we already verified is on head's chain).
## So a successful preflight remains valid by the time the worker runs.
Comment thread
advaita-saha marked this conversation as resolved.
let head = c.hashToBlock.getOrDefault(headHash)
if head.isNil:
return err("Cannot find head block: " & headHash.short)

if safeHash != zeroHash32 and c.hashToBlock.getOrDefault(safeHash).isNil:
return err("Cannot find safe block: " & safeHash.short)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is more strict now than the implementation in forkChoice which would silently ignore if the safeHash doesn't exist in the hashToBlock map. Not sure if this change is desired or not.


if finalizedHash != zeroHash32:
let fin = c.hashToBlock.getOrDefault(finalizedHash)
if fin.isNil:
return err("Cannot find finalized block: " & finalizedHash.short)
if fin.number > head.number:
return err("Invalid finalizedHash: block is newer than head block")
if c.heads.len > 1:
var found = false
for it in ancestors(head):
if it == fin:
found = true
break
if not found:
return err("Invalid finalizedHash: block not in argument head ancestor lineage")

ok()

func resolvedFinHash*(c: ForkedChainRef): Hash32 =
c.latestFinalized.hash

Expand Down
Loading