Skip to content

feat: Improve config#3866

Draft
fcecin wants to merge 1 commit into
masterfrom
feat/api-consistency
Draft

feat: Improve config#3866
fcecin wants to merge 1 commit into
masterfrom
feat/api-consistency

Conversation

@fcecin
Copy link
Copy Markdown
Contributor

@fcecin fcecin commented May 8, 2026

Description

NOTE: This PR addresses most points raised in #3845, with the exception of removing --mode and createNode(conf: WakuNodeConf).

This PR changes how network presets work by making the various scalar field values in a selected preset "lose" to any scalar values provided explicitly by user configs instead. And in the case of list parameters (seq), user-provided values can either override the preset or concatenate to the preset. This is dealt with by a new "additions?" optional field in the configJson parameter provided to the C FFI API createNode.

A new type WakuNodeConfOverlay is auto-generated from the WakuNodeConf authoritative type using a macro that makes all fields relevant to overrides and additions Option[T] where the T type of the field in the original WakuNodeConf type wasn't already an Option.

Some int/bool fields in WakuNodeConf itself had to be promoted to Option[int/bool] so that the new applyNetworkConf proc could apply the new logic of user values overriding preset values without relying on chosen sentinel int or bool values for each field.

The JSON parser inside the FFI entry point for createNode was moved to a dedicated parser that was also rewritten.

Literal value defaults across the various config builder files were replaced with DefaultXXX const values instead. These const defaults owned by each conf builder are referenced in the CLI help text whenever the field in question has been upgraded to Option, which results in the CLI option having a default of none (unset), which is not very helpful for the generated help text.

And finally, when using the new createNode(preset, mode, overrides, additions) Nim API, which signals a library app developer use case, this PR defaults the p2pTcp, discv5Udp and websocket ports to zero (0), which now mean a request to automatically choose and bind a port. That closes the issues related to port collisions for app developers running multiple instances of logos-delivery on the same machine. For reference, the complete discussion on this topic can be found in PR #3828's conversation tab.

Changes

  • Add createNode(preset, mode, overrides, additions) nim api
  • Set p2pTcp/discv5Udp/websocket ports to 0 (auto-bind) in new createNode()
  • Soft-deprecate --cluster-id=N triggering the associated preset selection
  • Rewrite applyNetworkConf to apply user-set fields over preset fields
  • Generate WakuNodeConfOverlay (all Option fields) from WakuNodeConf
  • New parser for configJson handles new messaging shape and full conf shape
  • Change all confbuilder defaults from literal values to DefaultXXX consts
  • Change int/bool WakuNodeConf fields to Option to get user intent w/o sentinels
  • Make Option CLI default-value help mention defaults now owned by confbuilder
  • Misc refactors, fixes
  • Add tests

Issue

iterates #3845

closes logos-co/logos-delivery-module#18

closes #3819

@fcecin fcecin changed the title Improve config feat: Improve config May 8, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:3866

Built from b2f8385

* Add createNode(preset, mode, overrides, additions) nim api
* Set p2pTcp/discv5Udp/websocket ports to 0 (auto-bind) in new createNode()
* Soft-deprecate --cluster-id=N triggering the associated preset selection
* Rewrite applyNetworkConf to apply user-set fields over preset fields
* Generate WakuNodeConfOverlay (all Option fields) from WakuNodeConf
* New parser for configJson handles new messaging shape and full conf shape
* Change all confbuilder defaults from literal values to DefaultXXX consts
* Change int/bool WakuNodeConf fields to Option to get user intent w/o sentinels
* Make Option CLI default-value help mention defaults now owned by confbuilder
* Misc refactors, fixes
* Add tests
@fcecin fcecin force-pushed the feat/api-consistency branch from 098c6f2 to 4cb4d29 Compare May 11, 2026 22:50
@fcecin fcecin marked this pull request as ready for review May 11, 2026 23:46
Comment on lines +240 to +241
## Parse a JSON config, route to messaging or full-config shape based on
## whether `overrides` or `additions` fields are in the config object top-level.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overrides and additions gives too much flexibility, and introduces too much unwated complexity.

I was envisioning the overrides as a concrete list of possible advanced configuration of MessagingClient. Otherwise it allows to have e.g. mode: Core with overrides {relay: false}, which is an invalid config I believe.

At the moment I would just not add overrides/additions at all. And later, introduce certain overrides on demand.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR, "overrides" and "additions" is just the way the user accesses the semantics and capability of achieving any configuration goal, at any level (kernel, messaging, and channels -- upcoming), just like mode and preset are (in principle) free to affect configuration at any layer.

As for "invalid configs," this brings up the more fundamental discussion (which I guess we will have a meeting about very soon) about what is it that developers have accessible to them. If we choose (my preferred route) of giving the user full control, that means by "user" we also mean e.g. QA/DST, so the node is fully testable -- so "relay: false" is not something for us to judge whether that's "valid" or not; if the user did it, then it is valid.

An alternative is to do what Ivan has already anticipated his comments below: rewrite the config system and split it into layers also (somehow). So maybe the CLI retains the "I can configure any layer" semantics, but via library you are accessing a different configuration pipeline for each (kernel?, messaging, channels) layer.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fcecin I like the suggestion of layered configuration.

As for the flexibility point, it can be achieved this way (which I mentioned in #3865):

Kernel API

# full flexibility, build what you need
proc createNode(WakuNodeConf)

Messaging API

# no flexibility, only preset+mode (and maybe some messaging client specific config)
proc createMessagingClient(preset, mode)
    # Translate preset+mode -> WakuNodeConf
    # Create the node with createNode(wakuNodeConf)

# adds flexibility, you can create the node directly with Kernel API and then attach a Messaging Client. Without layering violations.
proc attachMessagingClient(node)
    # attaches a messaging client to a running node

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'll figure out how to translate this to PR.

Copy link
Copy Markdown
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice refactor, thanks!
Just curious how to properly handle a ternary Option[bool] attribute.
In the upcoming future I think we should only have three or four config object types, e.g., ChannelApiConf, WakuAppConf, MessagingApiConf, KernelApiConf.

defaultValue: none(uint16),
name: "cluster-id"
.}: uint16
.}: Option[uint16]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What could be an scenario where a node doesn't need to know about a cluster-id?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node would know, via a --preset. You just don't need to inform one via the CLI.

defaultValue: none(bool),
name: "reliability"
.}: bool
.}: Option[bool]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How we should interpret a none value? As a true or false :) ?
Questions applies elsewhere.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interpreted as literally none -- the CLI didn't specify one via this config knob. Then we have a dense pipeline of decision-making downstream that will either populate the configured concern somehow or blow up with "I reached this point and it's still set to none," same as the cluster-id case.

## hence parsed at the messaging shape's top level and rejected inside
## `overrides`/`additions` to avoid ambiguity.

proc collectJsonFields*(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fyi, this week or after, nim-ffi will handle an arbitrary Config type, and the wire conversion (through CBOR) will happen within nim-ffi internally.
In the upcoming future we should not have to manually deal with that as nim-ffi will be more type safe.

# Generates the WakuNodeConfOverlay type from the WakuNodeConf type.
# The generated type converts fields from type T to Option[T] if T != Option.
# Skips fields that are in WakuNodeConfOverlayExcludes.
optionalizeType(WakuNodeConfOverlay, WakuNodeConf, WakuNodeConfOverlayExcludes)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if is better to be more explicit and just add the regular type definition.
Maybe not needed to use a macro for this particular case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either a macro or a hand-maintained mirror type, in any case I was just following the design from @igor-sirotin literally (or trying to). If we could instead make the original WakuNodeConf be forcefully made entirely of Option[T], then we wouldn't need a second type.

I think I've seen Igor arguing about CLI breakage being OK for the window of time we have now, in which case yes, we should just have one config object that is Option[T], which allows it to function as a base config and as a config overlay.

@@ -0,0 +1,74 @@
{.push raises: [].}

import std/[macros, options]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not needed this file if agree with previous comment.

Comment thread waku/api/api.nim
devPorts.websocketPort = some(Port(0))
applyAsOverride(conf, devPorts)

proc createNode*(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for the future. We might need to have this proc within the upcoming refactoring, inside the messaging/api section.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, the refactoring PR is going to move a lot of stuff around. My current draft is +45,000 lines.

@fcecin
Copy link
Copy Markdown
Contributor Author

fcecin commented May 13, 2026

Nice refactor, thanks! Just curious how to properly handle a ternary Option[bool] attribute. In the upcoming future I think we should only have three or four config object types, e.g., ChannelApiConf, WakuAppConf, MessagingApiConf, KernelApiConf.

I think the case for nuking the CLI / config stuff and rewriting this is about right now. So maybe we should close this PR and submit a new one with a full conf pipeline redesign and rewrite. If we can afford the breakage budget.

@fcecin
Copy link
Copy Markdown
Contributor Author

fcecin commented May 14, 2026

As for right now, I'm going to slightly retarget my attetion to the main layer-refactoring task. This will move in parallel while I figure out how to precisely synthesize the result of our decisions in the last API design meeting, and the new constraints that we have articulated there, w.r.t. to the config types and config surface in general.

We can continue discussing the config surface design here -- the more information, the better.

EDIT: Just so that this comment isn't entirely useless:

# no flexibility, only preset+mode (and maybe some messaging client specific config)
proc createMessagingClient(preset, mode)
    # Translate preset+mode -> WakuNodeConf
    # Create the node with createNode(wakuNodeConf)

... we realized in the meeting that:

  1. "(preset, mode)" is likely not sufficient to fully express an "easy-start-up intent" with removed power and thus increased safety and UX for app developers (ie. removing footguns and "baby-proofing" the config / node creation entry point).
  2. But at the same time, we realized that we do want a different, specialized signature for higher-level creation of e.g. a MessagingClient at the Messaging layer of API access that is not just "here's the full conf object" (be it proxied via Json <--> single cstring FFI arg, or be it some native struct/type FFI bridge machinery).

And also at the same time, in the review of this PR in its current form, we realized here that (preset, mode, overrides, additions) as implemented here is not what we want either, because it gives too much config freedom with invalid combinations, etc.

Since that decision and/or its implementation are clearly affected by what the relatively large re-layering refactor will do, it is probably a good idea to delay config improvements for after the re-layering effort. But we can also decide something here and push this first; that also works if we want to make the calls now and get this done with.

(Also I understand the whole CLI vs Library issue is 100% settled; the CLI adjustments can be resolved after we decide the config-related APIs and config struct type(s) we want).

@fcecin fcecin marked this pull request as draft May 15, 2026 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow port selection automatic with config to 0 Support external override of all listening ports via environment variables

3 participants