feat: Improve config#3866
Conversation
|
You can find the image built from this PR at Built from b2f8385 |
* Add createNode(preset, mode, overrides, additions) nim api * Set p2pTcp/discv5Udp/websocket ports to 0 (auto-bind) in new createNode() * Soft-deprecate --cluster-id=N triggering the associated preset selection * Rewrite applyNetworkConf to apply user-set fields over preset fields * Generate WakuNodeConfOverlay (all Option fields) from WakuNodeConf * New parser for configJson handles new messaging shape and full conf shape * Change all confbuilder defaults from literal values to DefaultXXX consts * Change int/bool WakuNodeConf fields to Option to get user intent w/o sentinels * Make Option CLI default-value help mention defaults now owned by confbuilder * Misc refactors, fixes * Add tests
098c6f2 to
4cb4d29
Compare
| ## Parse a JSON config, route to messaging or full-config shape based on | ||
| ## whether `overrides` or `additions` fields are in the config object top-level. |
There was a problem hiding this comment.
I think overrides and additions gives too much flexibility, and introduces too much unwated complexity.
I was envisioning the overrides as a concrete list of possible advanced configuration of MessagingClient. Otherwise it allows to have e.g. mode: Core with overrides {relay: false}, which is an invalid config I believe.
At the moment I would just not add overrides/additions at all. And later, introduce certain overrides on demand.
There was a problem hiding this comment.
In this PR, "overrides" and "additions" is just the way the user accesses the semantics and capability of achieving any configuration goal, at any level (kernel, messaging, and channels -- upcoming), just like mode and preset are (in principle) free to affect configuration at any layer.
As for "invalid configs," this brings up the more fundamental discussion (which I guess we will have a meeting about very soon) about what is it that developers have accessible to them. If we choose (my preferred route) of giving the user full control, that means by "user" we also mean e.g. QA/DST, so the node is fully testable -- so "relay: false" is not something for us to judge whether that's "valid" or not; if the user did it, then it is valid.
An alternative is to do what Ivan has already anticipated his comments below: rewrite the config system and split it into layers also (somehow). So maybe the CLI retains the "I can configure any layer" semantics, but via library you are accessing a different configuration pipeline for each (kernel?, messaging, channels) layer.
There was a problem hiding this comment.
@fcecin I like the suggestion of layered configuration.
As for the flexibility point, it can be achieved this way (which I mentioned in #3865):
Kernel API
# full flexibility, build what you need
proc createNode(WakuNodeConf)Messaging API
# no flexibility, only preset+mode (and maybe some messaging client specific config)
proc createMessagingClient(preset, mode)
# Translate preset+mode -> WakuNodeConf
# Create the node with createNode(wakuNodeConf)
# adds flexibility, you can create the node directly with Kernel API and then attach a Messaging Client. Without layering violations.
proc attachMessagingClient(node)
# attaches a messaging client to a running nodeThere was a problem hiding this comment.
Got it. I'll figure out how to translate this to PR.
Ivansete-status
left a comment
There was a problem hiding this comment.
Nice refactor, thanks!
Just curious how to properly handle a ternary Option[bool] attribute.
In the upcoming future I think we should only have three or four config object types, e.g., ChannelApiConf, WakuAppConf, MessagingApiConf, KernelApiConf.
| defaultValue: none(uint16), | ||
| name: "cluster-id" | ||
| .}: uint16 | ||
| .}: Option[uint16] |
There was a problem hiding this comment.
What could be an scenario where a node doesn't need to know about a cluster-id?
There was a problem hiding this comment.
The node would know, via a --preset. You just don't need to inform one via the CLI.
| defaultValue: none(bool), | ||
| name: "reliability" | ||
| .}: bool | ||
| .}: Option[bool] |
There was a problem hiding this comment.
How we should interpret a none value? As a true or false :) ?
Questions applies elsewhere.
There was a problem hiding this comment.
It is interpreted as literally none -- the CLI didn't specify one via this config knob. Then we have a dense pipeline of decision-making downstream that will either populate the configured concern somehow or blow up with "I reached this point and it's still set to none," same as the cluster-id case.
| ## hence parsed at the messaging shape's top level and rejected inside | ||
| ## `overrides`/`additions` to avoid ambiguity. | ||
|
|
||
| proc collectJsonFields*( |
There was a problem hiding this comment.
Just fyi, this week or after, nim-ffi will handle an arbitrary Config type, and the wire conversion (through CBOR) will happen within nim-ffi internally.
In the upcoming future we should not have to manually deal with that as nim-ffi will be more type safe.
| # Generates the WakuNodeConfOverlay type from the WakuNodeConf type. | ||
| # The generated type converts fields from type T to Option[T] if T != Option. | ||
| # Skips fields that are in WakuNodeConfOverlayExcludes. | ||
| optionalizeType(WakuNodeConfOverlay, WakuNodeConf, WakuNodeConfOverlayExcludes) |
There was a problem hiding this comment.
I wonder if is better to be more explicit and just add the regular type definition.
Maybe not needed to use a macro for this particular case.
There was a problem hiding this comment.
Either a macro or a hand-maintained mirror type, in any case I was just following the design from @igor-sirotin literally (or trying to). If we could instead make the original WakuNodeConf be forcefully made entirely of Option[T], then we wouldn't need a second type.
I think I've seen Igor arguing about CLI breakage being OK for the window of time we have now, in which case yes, we should just have one config object that is Option[T], which allows it to function as a base config and as a config overlay.
| @@ -0,0 +1,74 @@ | |||
| {.push raises: [].} | |||
|
|
|||
| import std/[macros, options] | |||
There was a problem hiding this comment.
Maybe not needed this file if agree with previous comment.
| devPorts.websocketPort = some(Port(0)) | ||
| applyAsOverride(conf, devPorts) | ||
|
|
||
| proc createNode*( |
There was a problem hiding this comment.
Just a note for the future. We might need to have this proc within the upcoming refactoring, inside the messaging/api section.
There was a problem hiding this comment.
Ah yes, the refactoring PR is going to move a lot of stuff around. My current draft is +45,000 lines.
I think the case for nuking the CLI / config stuff and rewriting this is about right now. So maybe we should close this PR and submit a new one with a full conf pipeline redesign and rewrite. If we can afford the breakage budget. |
|
As for right now, I'm going to slightly retarget my attetion to the main layer-refactoring task. This will move in parallel while I figure out how to precisely synthesize the result of our decisions in the last API design meeting, and the new constraints that we have articulated there, w.r.t. to the config types and config surface in general. We can continue discussing the config surface design here -- the more information, the better. EDIT: Just so that this comment isn't entirely useless: ... we realized in the meeting that:
And also at the same time, in the review of this PR in its current form, we realized here that (preset, mode, overrides, additions) as implemented here is not what we want either, because it gives too much config freedom with invalid combinations, etc. Since that decision and/or its implementation are clearly affected by what the relatively large re-layering refactor will do, it is probably a good idea to delay config improvements for after the re-layering effort. But we can also decide something here and push this first; that also works if we want to make the calls now and get this done with. (Also I understand the whole CLI vs Library issue is 100% settled; the CLI adjustments can be resolved after we decide the config-related APIs and config struct type(s) we want). |
Description
NOTE: This PR addresses most points raised in #3845, with the exception of removing
--modeandcreateNode(conf: WakuNodeConf).This PR changes how network presets work by making the various scalar field values in a selected preset "lose" to any scalar values provided explicitly by user configs instead. And in the case of list parameters (seq), user-provided values can either override the preset or concatenate to the preset. This is dealt with by a new "additions?" optional field in the
configJsonparameter provided to the C FFI APIcreateNode.A new type
WakuNodeConfOverlayis auto-generated from theWakuNodeConfauthoritative type using a macro that makes all fields relevant to overrides and additionsOption[T]where theTtype of the field in the originalWakuNodeConftype wasn't already anOption.Some int/bool fields in
WakuNodeConfitself had to be promoted toOption[int/bool]so that the newapplyNetworkConfproc could apply the new logic of user values overriding preset values without relying on chosen sentinel int or bool values for each field.The JSON parser inside the FFI entry point for createNode was moved to a dedicated parser that was also rewritten.
Literal value defaults across the various config builder files were replaced with
DefaultXXXconst values instead. These const defaults owned by each conf builder are referenced in the CLI help text whenever the field in question has been upgraded toOption, which results in the CLI option having a default ofnone(unset), which is not very helpful for the generated help text.And finally, when using the new
createNode(preset, mode, overrides, additions)Nim API, which signals a library app developer use case, this PR defaults thep2pTcp,discv5Udpandwebsocketports to zero (0), which now mean a request to automatically choose and bind a port. That closes the issues related to port collisions for app developers running multiple instances oflogos-deliveryon the same machine. For reference, the complete discussion on this topic can be found in PR #3828's conversation tab.Changes
Issue
iterates #3845
closes logos-co/logos-delivery-module#18
closes #3819