Skip to content

Flush mine during upgrade#4934

Open
eg-ayoub wants to merge 6 commits into
development/133.0from
bugfix/salt-mine-breaks-on-update
Open

Flush mine during upgrade#4934
eg-ayoub wants to merge 6 commits into
development/133.0from
bugfix/salt-mine-breaks-on-update

Conversation

@eg-ayoub
Copy link
Copy Markdown
Contributor

@eg-ayoub eg-ayoub commented May 15, 2026

MK8S-217 The mine can be flushed and updated during a metalk8s upgrade to avoid errors where the mine cache gets corrupted.
this is done by a function in upgrade.sh that runs as soon as possible after the bootstrap upgrade to flush the mine

@eg-ayoub eg-ayoub requested a review from a team as a code owner May 15, 2026 07:45
@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented May 15, 2026

Hello eg-ayoub,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented May 15, 2026

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

LGTM

Review by Claude Code

Comment thread scripts/upgrade.sh.in Outdated
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

  • mine.flush / mine.update may be called before the salt-master daemon is ready to accept connections (container running ≠ service ready), causing the upgrade to abort under set -e
    - Wrap both calls with the existing retry helper, matching the pattern used in salt_minion_list

    Review by Claude Code

Comment thread CHANGELOG.md
Comment thread scripts/upgrade.sh.in Outdated
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

  • CHANGELOG.md:8 — Grammar error: "fails prints" should be "prints"
  • scripts/upgrade.sh.in:226 — Step label says "on bootstrap" but function targets all minions via salt '*'

Review by Claude Code

Comment thread scripts/upgrade.sh.in
# Run on all minions so the master has a clean, up-to-date mine cache for
# the entire cluster before any upgrade state queries it.
SALT_MASTER_CALL=("${EXEC_CONTAINER_COMMAND[@]}" "$(get_salt_container)")
"${SALT_MASTER_CALL[@]}" salt '*' mine.flush
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_quiet disables set -e, so if mine.flush fails, execution continues to mine.update. Since the whole point is to delete the corrupt cache first, a failed flush means mine.update hits the same corrupt file — exactly the bug this function exists to fix. Check the return code of mine.flush before proceeding.

```suggestion
"${SALT_MASTER_CALL[@]}" salt '*' mine.flush || return 1

Comment thread pillar/metalk8s/roles/ca.sls
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

  • flush_and_refresh_mine does not check the return code of mine.flush before running mine.update. Under run_quiet (set +e), a failed flush lets mine.update hit the same corrupt cache — defeating the fix.
    • Add || return 1 after mine.flush so the step fails fast.
  • dex_server_policy in ca.sls (line 75) still references dex key/cert paths unconditionally, while the new guard only covers dex_ca_b64.
    • Consider wrapping dex_server_policy in the same pillar.addons.dex.enabled conditional.

Review by Claude Code

@eg-ayoub eg-ayoub force-pushed the bugfix/salt-mine-breaks-on-update branch from ce63c07 to 27ceac6 Compare May 15, 2026 13:28
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 15, 2026

LGTM

Review by Claude Code

eg-ayoub added 6 commits May 18, 2026 09:37
MK8S-217 The mine can be flushed and updated during a metalk8s upgrade
to avoid errors where the mine gets corrupted.
MK8S-218 in deployments where dex is disabled, the mine.update functions
are flooded with warnings from the dex mine function failures
we can fix this by gating this mine function behind the actual state of
dex.
the function won't be declared if dex is disabled.
@eg-ayoub eg-ayoub force-pushed the bugfix/salt-mine-breaks-on-update branch from 27ceac6 to dc852db Compare May 18, 2026 07:38
@eg-ayoub
Copy link
Copy Markdown
Contributor Author

/approve

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented May 18, 2026

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/123.0
  • development/124.0
  • development/124.1
  • development/125.0
  • development/126.0
  • development/127.0
  • development/128.0
  • development/129.0
  • development/130.0
  • development/131.0
  • development/132.0
  • development/2.0
  • development/2.1
  • development/2.10
  • development/2.11
  • development/2.2
  • development/2.3
  • development/2.4
  • development/2.5
  • development/2.6
  • development/2.7
  • development/2.8
  • development/2.9

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: approve

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented May 18, 2026

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

Peer approvals must include at least 1 approval from the following list:

The following options are set: approve

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 18, 2026

LGTM

Review by Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants