feat(mix): cover traffic with constant rate and spam protection toggle #3807
chaitanyaprem wants to merge 8 commits into
…ption_shims crash
- Add cover traffic support with constant rate as per spec
- Add mix-user-message-limit and mix-disable-spam-protection CLI flags
- Fix option_shims.nim double-evaluation bug causing UnpackDefect crash in ping (template expanded await expression twice, racing two calls)
- Reduce default rate limit to 2 msgs/epoch for simulation testing
- Add check_cover_traffic.sh metrics monitoring script

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
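The double-evaluation bug named in the commit message can be illustrated outside Nim. The Python sketch below is only an analogy, not the code from option_shims.nim: a zero-argument callable stands in for the expression that a template pastes in twice, and every name in it (`make_flaky_ping`, `unwrap_twice`, `unwrap_once`) is hypothetical.

```python
from typing import Callable, Optional


def make_flaky_ping() -> Callable[[], Optional[int]]:
    """Return a stand-in for a network call that succeeds only on its first use."""
    state = {"calls": 0}

    def ping() -> Optional[int]:
        state["calls"] += 1
        # The first call yields a round-trip time; later calls yield nothing.
        return 42 if state["calls"] == 1 else None

    return ping


def unwrap_twice(expr: Callable[[], Optional[int]]) -> int:
    """Buggy shape: the expression is evaluated once to test and again to unpack."""
    if expr() is not None:
        value = expr()  # second evaluation, i.e. a second, separate call
        if value is None:
            raise ValueError("unpacked an empty result")
        return value
    raise ValueError("no result")


def unwrap_once(expr: Callable[[], Optional[int]]) -> int:
    """Fixed shape: evaluate once, bind the result, then test and unpack it."""
    value = expr()
    if value is None:
        raise ValueError("no result")
    return value
```

With the same flaky call, `unwrap_once` returns the value while `unwrap_twice` raises, because its test and its unpack see two different results.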
You can find the image built from this PR at Built from 34de568
6f269f2 to f78aa61
Cover Traffic + RLN Spam Protection Simulation Report
Date: 2026-04-10 | Duration: ~10 minutes
System Configuration
Simulation Configuration
System Resource Usage
CPU usage is bursty: each node generates 1-2 RLN proofs per 10s epoch (~200ms of work), then idles. Spikes are brief and non-disruptive on this hardware.
RLN Proof Generation Times
206 proofs collected across all 5 nodes:
Distribution: 60% of proofs complete under 100ms. Tail latency (300ms+) occurs when multiple nodes generate proofs simultaneously at epoch boundaries (single-threaded chronos async per process).
Cover Traffic Metrics (Final Snapshot)
Errors Observed
The
Conclusion
Cover traffic with RLN spam protection works correctly on consumer hardware (M2 Pro). Resource usage is manageable with R=2 (~5% steady CPU for 5 nodes). The main limitation is the low rate limit causing slot exhaustion during forwarding; this is a configuration constraint, not a bug.
Addendum: Rate Limit Analysis & Recommended Configuration
The Root Cause: With R=2 (user message limit) and L=3 (path length):
Because 0.5 is fractional, epoch boundary jitter causes some epochs to see 2 cover emissions and others 0. When 2 land in the same epoch, all slots are consumed by cover traffic, leaving none for forwarding, which triggers proof generation failures at intermediate hops.
Fix: R must be a multiple of
Recommended: R=4, which gives exactly 1 cover packet per epoch and 3 remaining slots for forwarding, with no fractional math or boundary jitter issues. This is also less CPU load than R=2 in practice (fewer spurious double-emissions).
The config files (
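The arithmetic in this addendum can be checked with a small Python sketch. It assumes the cover rate is R / (1 + L) packets per epoch; that formula is inferred from the figures quoted in the comment (0.5 for R=2 and 1 for R=4, both with L=3) because the original formula list is not preserved on this page. The function names are illustrative, not taken from the PR.

```python
from fractions import Fraction


def cover_per_epoch(rate_limit: int, path_length: int) -> Fraction:
    """Assumed relationship: one cover packet per (1 + L) slots of the limit R."""
    return Fraction(rate_limit, 1 + path_length)


def is_whole(rate: Fraction) -> bool:
    """A whole-number rate leaves no room for epoch-boundary jitter to matter."""
    return rate.denominator == 1


def forwarding_slots(rate_limit: int, cover_in_epoch: int) -> int:
    """Slots left for forwarding when each cover packet consumes one slot."""
    return rate_limit - cover_in_epoch


# R=2, L=3: a fractional rate, and a double emission leaves no forwarding slots.
r2 = cover_per_epoch(2, 3)
# R=4, L=3: exactly one cover packet per epoch and three forwarding slots.
r4 = cover_per_epoch(4, 3)
```

Under these assumptions, `forwarding_slots(2, 2)` is 0 (the slot-exhaustion case) and `forwarding_slots(4, 1)` is 3.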
f78aa61 to 3a243f7
Defensive nil check for mixRlnSpamProtection when spam protection is disabled, preventing potential crash if the guard in start() is ever bypassed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3a243f7 to acab100
Cover Traffic Precomputation + RLN Spam Protection Simulation Report
Date: 2026-04-10 | Duration: ~4 minutes | Config change:
System Configuration
Simulation Configuration
Key Finding: Precomputation Produces 0 Packets with R=2
With All cover traffic falls through to on-demand building. The For precomputation to be effective, R must be ≥ 4 (so
System Resource Usage
No measurable difference from the non-precomputation run, as expected since precomputation built 0 packets.
RLN Proof Generation Times (124 proofs)
Distribution: Consistent with the previous run (avg 112.9ms vs 116.0ms).
Cover Traffic Metrics (Final, 240s)
Conclusion
Precomputation with R=2 is a no-op due to integer division yielding 0 target packets. The precompute loop activates but produces nothing, falling back entirely to on-demand packet building. To test precomputation meaningfully, R should be set to ≥4 (yielding ≥1 precomputed packet per epoch). All other behavior (emission, forwarding, slot exhaustion, proof generation) is consistent with the non-precomputation run.
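The integer-division effect is easy to reproduce. The sketch below assumes the precompute target is R floor-divided by (1 + L) with L=3; the divisor is a guess consistent with "R=2 yields 0" and "R ≥ 4 yields ≥ 1", since the exact expression is not preserved in this report. `precompute_target` is a hypothetical name.

```python
def precompute_target(rate_limit: int, path_length: int = 3) -> int:
    """Assumed target: whole cover packets per epoch, using floor division."""
    return rate_limit // (1 + path_length)


# Sweep a few rate limits to see where precomputation starts producing packets.
targets = {r: precompute_target(r) for r in (2, 3, 4, 8)}
# targets == {2: 0, 3: 0, 4: 1, 8: 2}
```

Any R below 4 truncates to a target of 0, which matches the observation that the precompute loop runs but builds nothing.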
Proof Token Reuse Validation: R=4 with Precomputation + RLN
Date: 2026-04-10 | Duration: ~3 minutes | Feature: Opaque proof token reuse for discarded precomputed cover packets
What Changed
When a precomputed cover packet is discarded (because a forwarded/originated packet claims its slot), the RLN messageId used for that proof is now reclaimed and reused instead of being wasted. This prevents premature rate limit exhaustion. Implementation uses opaque proof tokens:
Results: Zero Proof Failures ✅
43 total proof tokens reclaimed and reused across all nodes. Zero
Comparison: Before vs After Proof Token Reuse
MessageId Reuse Trace (Node 1)
System Resources
Configuration
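The reclaim-and-reuse idea from this report can be sketched as a small allocator. This is a simplified Python model, not the nim-libp2p or plugin implementation: `MessageIdPool`, `acquire`, and `reclaim` are hypothetical names, and the sketch ignores proof generation and epoch rollover entirely.

```python
from typing import List, Optional


class MessageIdPool:
    """Hands out message ids 0..limit-1 for one epoch, reusing reclaimed ids first."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.next_fresh = 0
        self.reclaimed: List[int] = []

    def acquire(self) -> Optional[int]:
        """Return an id, or None when the epoch's slots are exhausted."""
        if self.reclaimed:
            return self.reclaimed.pop()
        if self.next_fresh < self.limit:
            message_id = self.next_fresh
            self.next_fresh += 1
            return message_id
        return None

    def reclaim(self, message_id: int) -> None:
        """Give back the id of a discarded precomputed cover packet."""
        self.reclaimed.append(message_id)
```

In this model a discarded cover packet no longer burns a slot: reclaiming its id lets a later packet in the same epoch use it, so exhaustion only happens after `limit` packets are actually sent.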
4a44f54 to eefe805
- nim-libp2p: adds ProofResult, reclaimProofToken, same-epoch precomp
- mix-rln-spam-protection-plugin: implements messageId reuse from discarded cover packets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
eefe805 to 8add44d
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cover Rate Fraction: Simulation Comparison (f=0.7 vs f=1.0)
Date: 2026-04-23 | Feature:
Tracks upstream PR vacp2p/nim-libp2p#2322. Vendor bump in this branch:
Setup (identical for both runs)
Per-node totals over the run
Per-node per-epoch averages
Observations
Vs. prior proof-token-reuse report on this PR
Prior run (R=4, f=1.0 implicit, precomputation enabled, 3 min, comment): ~1 emission/epoch/node, zero proof failures (thanks to token reuse). Current f=0.7 (precomputation disabled, 4.3 min): 0.5 emissions/epoch/node, zero proof failures. Qualitative behavior (loop-back healthy, no failures) preserved; quantitative drop of ~50% per-node traffic as expected.
bf2f8a9 to ee2e30a
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ee2e30a to 320c13f
- rename ExtendedKademliaDiscoveryParams -> ExtendedServiceDiscoveryParams
- switch sink from textlines[file] to textlines to work around a chronicles compile-time macro-eval bug under Nim 2.2.4
jm-clius left a comment
Good progress. Feel free to raise a review once this is ready for merging. One concern is complexity in configuration - if we have restrictions on certain configuration items (e.g. user message rate limit should be a multiple of 4), we should refactor our configuration to prevent misconfiguration.
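One way to act on this concern is to reject bad values when the configuration is loaded. The Python sketch below is only an illustration of that idea: `validate_user_message_limit` is a hypothetical helper, and tying the "multiple of 4" rule to 1 + L with L=3 is an assumption drawn from the rate-limit analysis earlier in this thread.

```python
def validate_user_message_limit(limit: int, path_length: int = 3) -> int:
    """Return the limit unchanged, or raise if it is not a positive multiple of 1 + L."""
    step = 1 + path_length
    if limit <= 0 or limit % step != 0:
        raise ValueError(
            f"user message limit {limit} must be a positive multiple of {step}"
        )
    return limit
```

Failing fast at startup turns a silent misconfiguration (fractional cover rate, slot exhaustion) into an explicit error message.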
This is interesting. Any reason why these functions were removed? Shouldn't we adapt to whatever is considered the most idiomatic, or use a different library?
Forgot to mention that this was a temporary hack, done to avoid having to make changes in the rest of the logos delivery code.
I think the original issue came from libp2p migrating to a newer version, which was causing issues in code not related to the changes in this PR.
This file should be removed before merging, as I assume this will get fixed by some other PR in master.
- DefaultUserMessageLimit = 100'u64 # Network-wide default rate limit
- SpammerUserMessageLimit = 3'u64 # Lower limit for spammer testing
+ DefaultUserMessageLimit = 4'u64 # R=4 slots per 10s epoch
+ SpammerUserMessageLimit = 3'u64 # Higher limit for spammer testing
Truly a nitpick, but I'm trying to parse the configuration here and elsewhere, and I don't understand the change of "Lower" to "Higher" here. 😅
Ah, this was something I added initially to simulate a spammer, but I later realized there is no way to simulate it: zerokit itself seems to check whether a user has crossed their own message limit and, if so, doesn't generate proofs. So right now this is useless code; I will try to remove it.
if userMessageLimit.isSome():
  spamProtectionConfig.userMessageLimit = userMessageLimit.get()
# rlnResourcesPath left empty to use bundled resources (via "tree_height_/" placeholder)
let totalSlots = userMessageLimit.get(2)
Why use the magic number 2 here and not the default configuration (besides, IIUC your comments previously - this will result in fractional cover-messages-per-epoch)
Oh, this is not right.
You are right and we should use the default config. Will fix it.
Ah, right... but that is just for the simulation env. Once we migrate this to an onchain registry, this config will not be required, or rather will be fetched from there itself.
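The fix agreed on in this thread, falling back to the named default rather than a literal 2, looks roughly like this in a Python sketch. `resolve_total_slots` is a hypothetical name, and the constant simply mirrors the `DefaultUserMessageLimit = 4'u64` value shown in the diff above; the actual Nim change may differ.

```python
from typing import Optional

# Mirrors DefaultUserMessageLimit from the diff; kept in one place so the
# fallback and the documented default cannot drift apart.
DEFAULT_USER_MESSAGE_LIMIT = 4


def resolve_total_slots(user_message_limit: Optional[int]) -> int:
    """Use the configured limit when present, otherwise the shared default."""
    if user_message_limit is None:
        return DEFAULT_USER_MESSAGE_LIMIT
    return user_message_limit
```

Checking for `None` explicitly, rather than using `or`, keeps an explicitly configured value from being silently replaced by the default.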
Summary
- Add cover traffic support with constant rate as per spec (ConstantRateCoverTraffic)
- Add mix-user-message-limit and mix-disable-spam-protection CLI flags for flexible testing
- Fix option_shims.nim double-evaluation bug that caused UnpackDefect crash in ping: the template expanded the await expression twice, racing two separate network calls
- Add check_cover_traffic.sh metrics monitoring script for simulation validation
How to Test
See simulations/mixnet/README.md for full setup and testing instructions.
Note: Running with DoS protection enabled requires better hardware. RLN proof generation is computationally expensive — running 5 nodes locally with proofs enabled can overwhelm a laptop and cause nodes to crash or become unresponsive. For DoS-protected testing, use a machine with sufficient CPU/RAM or reduce the number of nodes.
Validation Report
Cover Traffic Metrics (after ~60s, all 5 nodes stable)
- mix_cover_emitted_total (on_demand)
- mix_cover_received_total
- mix_messages_forwarded (Cover)
- mix_messages_forwarded (Intermediate)
- mix_messages_recvd (Exit)
- mix_messages_recvd (Intermediate)
- SLOT_EXHAUSTED errors
Key Observations
- mix_cover_emitted_total increases every epoch
- mix_cover_received_total shows cover packets returning to origin nodes via 3-hop mix path
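A check like "mix_cover_emitted_total increases every epoch" can be automated by comparing two metrics snapshots. The Python sketch below is not the check_cover_traffic.sh script from this PR; it is an independent illustration that assumes the nodes expose Prometheus text-format metrics without trailing timestamps. `sum_by_metric` and `counter_increased` are hypothetical helper names.

```python
from typing import Dict


def sum_by_metric(exposition: str) -> Dict[str, float]:
    """Sum all samples per metric name, ignoring labels, comments and blank lines."""
    totals: Dict[str, float] = {}
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Assumes "name{labels} value" with the value as the last field.
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        totals[name] = totals.get(name, 0.0) + float(value)
    return totals


def counter_increased(before: str, after: str, metric: str) -> bool:
    """True when the summed counter is strictly larger in the later snapshot."""
    earlier = sum_by_metric(before).get(metric, 0.0)
    later = sum_by_metric(after).get(metric, 0.0)
    return later > earlier
```

Taking one snapshot per epoch and calling `counter_increased` on consecutive pairs gives a simple pass/fail signal for cover emission and for cover packets returning to their origin.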