
refactor: updating CAPX version to 1.9.x #10770

Open
abhay-nutanix wants to merge 2 commits into aws:main from abhay-nutanix:issue/CAPXBump


@abhay-nutanix
Contributor

@abhay-nutanix abhay-nutanix commented May 14, 2026

Summary

This PR upgrades the Cluster API Provider Nutanix (CAPX) dependency from v1.3.2 to v1.9.2 and migrates the Nutanix provider integration from the legacy Prism v3 API to the v4 converged API. This is a significant refactor that modernizes EKS Anywhere's Nutanix provider layer to align with the latest upstream CAPX types and Nutanix API clients.

Motivation

  • CAPX 1.9.x introduces new types (failure domains, cluster templates), updated CRDs, and a shift to the Nutanix v4 API surface.
  • The legacy prism-go-client/v3 API has been superseded by the converged v4 client and native Nutanix Go SDK modules (ntnx-api-golang-clients).
  • Keeping EKS Anywhere aligned with upstream CAPX is necessary for long-term supportability and access to new Nutanix platform features.

Changes

Dependency Updates

| Dependency | Old Version | New Version |
| --- | --- | --- |
| cluster-api-provider-nutanix | v1.3.2 | v1.9.2 |
| prism-go-client | v0.3.4 | v0.7.1 |
| sigs.k8s.io/cluster-api | v1.6.2 | v1.12.3 |
| sigs.k8s.io/controller-runtime | v0.22.4 | v0.22.5 |
| k8s.io/* modules | v0.34.2 | v0.34.3 |
| aws-sdk-go-v2 | v1.30.1 | v1.41.1 |
| smithy-go | v1.20.3 | v1.24.0 |

New direct dependencies added:

  • github.com/nutanix/ntnx-api-golang-clients/clustermgmt-go-client/v4
  • github.com/nutanix/ntnx-api-golang-clients/iam-go-client/v4
  • github.com/nutanix/ntnx-api-golang-clients/networking-go-client/v4
  • github.com/nutanix/ntnx-api-golang-clients/prism-go-client/v4
  • github.com/nutanix/ntnx-api-golang-clients/vmm-go-client/v4

API Migration (v3 → v4 Converged)

  • Client interface (pkg/providers/nutanix/client.go): Replaced v3 method signatures with v4 converged API equivalents. Methods now use native Nutanix v4 SDK models (subnetModels.Subnet, imageModels.Image, clusterModels.Cluster, prismModels.Category, etc.) and accept converged.ODataOption for server-side filtering. ListAllProject remains on v3 as no v4 equivalent exists yet.
  • ClientCache (pkg/providers/nutanix/clientcache.go): Updated to construct v4converged.Client instead of v3.Client. The cache now wraps the converged client and delegates to its typed sub-clients (Subnets, Images, Clusters, Hosts, Categories, GPUs).
  • PrismClient (internal/pkg/nutanix/prismclient.go): Migrated from v3.NewV3Client to v4converged.NewClient. All lookup methods (GetImageUUIDFromName, GetClusterUUIDFromName, GetSubnetUUIDFromName) now use server-side OData filtering instead of client-side list-and-filter.
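
As a rough illustration of the server-side lookup described above (a sketch, not code from this PR): instead of listing every resource and comparing names locally, the lookup sends an OData `$filter` expression and lets Prism Central return only the matches. The helper names `odataQuote` and `nameFilter` are hypothetical, and the `name eq '...'` expression is an assumption about how the filter is shaped; the real code passes its filter through `converged.ODataOption`.

```go
package main

import (
	"fmt"
	"strings"
)

// odataQuote wraps s in single quotes and doubles any embedded single quote,
// which is how OData string literals escape quotes.
func odataQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// nameFilter builds an OData filter expression that matches a resource by
// exact name. Hypothetical helper: the filter field name is an assumption.
func nameFilter(name string) string {
	return "name eq " + odataQuote(name)
}

func main() {
	// The resulting string would be handed to the API as the filter option,
	// so only matching images, clusters, or subnets come back.
	fmt.Println(nameFilter("ubuntu-22.04"))
	fmt.Println(nameFilter("bob's image"))
}
```

Quoting the value in one place keeps names containing a single quote from producing a malformed filter.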

Validator Refactoring

  • Cluster filtering: Replaced the "exclude Prism Central" negative filter with a positive hasPEClusterServiceEnabled() check that matches on the CLUSTERFUNCTIONREF_AOS cluster function — matching the approach used by upstream CAPX.
  • Credential validation: Replaced GetCurrentLoggedInUser (removed in v4) with a ListClusters(ctx, WithLimit(1)) call as a lightweight connectivity/auth check.
  • Category validation: Replaced GetCategoryKey + GetCategoryValue two-step lookups with a single ListCategories call using a combined OData filter (key eq '...' and value eq '...').
  • GPU validation: Introduced an availableGPU abstraction to bridge physical and virtual GPU profiles from the v4 API. GPU availability is now queried per-cluster via ListClusterPhysicalGPUs and ListClusterVirtualGPUs instead of iterating host responses.
  • Subnet validation: Added overlay subnet awareness — overlay subnets are no longer required to be tied to a specific PE cluster.
  • Project validation: Inlined project lookup logic (previously delegated to findProjectUUIDByName) using ListAllProject (v3) since there is no v4 equivalent yet.
  • Image name matching: Now uses strings.EqualFold for case-insensitive comparison.
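
A minimal sketch of three of the validator changes listed above, under stated assumptions: the `cluster` and `clusterFunction` types are hypothetical, trimmed-down stand-ins for the v4 SDK models (the real check matches on `CLUSTERFUNCTIONREF_AOS`), and `hasPEClusterService`, `categoryFilter`, and `imageNameMatches` are illustrative names, not the functions in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// clusterFunction is a hypothetical stand-in for the v4 SDK's cluster
// function enum.
type clusterFunction string

const (
	functionAOS          clusterFunction = "AOS"
	functionPrismCentral clusterFunction = "PRISM_CENTRAL"
)

// cluster is a hypothetical, trimmed-down stand-in for clusterModels.Cluster.
type cluster struct {
	Name      string
	Functions []clusterFunction
}

// hasPEClusterService is a positive check: the cluster must advertise the AOS
// function. A negative "not Prism Central" filter would also accept clusters
// that report no function or an unrelated one.
func hasPEClusterService(c cluster) bool {
	for _, f := range c.Functions {
		if f == functionAOS {
			return true
		}
	}
	return false
}

// categoryFilter builds the combined key/value OData filter so a single list
// call replaces the two-step key lookup followed by value lookup.
func categoryFilter(key, value string) string {
	quote := func(s string) string {
		return "'" + strings.ReplaceAll(s, "'", "''") + "'"
	}
	return fmt.Sprintf("key eq %s and value eq %s", quote(key), quote(value))
}

// imageNameMatches compares image names case-insensitively.
func imageNameMatches(got, want string) bool {
	return strings.EqualFold(got, want)
}

func main() {
	pe := cluster{Name: "pe-1", Functions: []clusterFunction{functionAOS}}
	pc := cluster{Name: "pc", Functions: []clusterFunction{functionPrismCentral}}
	unknown := cluster{Name: "unlabelled"}

	fmt.Println(hasPEClusterService(pe), hasPEClusterService(pc), hasPEClusterService(unknown))
	fmt.Println(categoryFilter("env", "prod"))
	fmt.Println(imageNameMatches("Ubuntu-22.04", "ubuntu-22.04"))
}
```

The `unknown` cluster shows the difference between the two approaches: it would pass an "exclude Prism Central" filter but fails the positive AOS check.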

CAPX Vendored Types (internal/thirdparty/capx/)

  • Updated existing types: NutanixCluster, NutanixMachine, NutanixMachineTemplate, conditions
  • New types added:
    • NutanixClusterTemplate — cluster infrastructure template support
    • NutanixFailureDomain — failure domain CRD for topology-aware deployments
    • nutanix_types.go — shared Nutanix-specific type definitions
  • Updated CRD manifests for all Nutanix infrastructure resources
  • Regenerated zz_generated.deepcopy.go

Test Updates

  • Updated mocks (pkg/providers/nutanix/mocks/client.go) to reflect the new Client interface
  • Refactored test cases in provider_test.go and validator_test.go to use v4 model types and match the updated validation logic
  • Reduced test boilerplate by consolidating mock setup patterns

Risk Assessment

  • High: This is a large dependency upgrade touching the core Nutanix provider path. The v3→v4 API migration changes how resources are queried and filtered.
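  • Mitigation: The Client interface is preserved as the seam between the provider and the Nutanix API, so consumers outside the provider are insulated from the API version change; internal consumers and mocks pick up the new method signatures. Tests have been updated to validate the new behavior.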

Breaking Changes

  • None for end users. The Client interface has changed, which affects internal consumers and mock generation only.

@eks-distro-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign rahulbabu95 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@eks-distro-bot
Collaborator

Hi @abhay-nutanix. Thanks for your PR.

I'm waiting for an aws member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@eks-distro-bot eks-distro-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 14, 2026
@sp1999
Member

sp1999 commented May 14, 2026

/ok-to-test

@codecov

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 75.48077% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.25%. Comparing base (c8c965a) to head (eb0a2e2).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/providers/nutanix/clientcache.go | 30.76% | 26 Missing and 1 partial ⚠️ |
| pkg/providers/nutanix/validator.go | 85.54% | 16 Missing and 8 partials ⚠️ |

❌ Your patch check has failed because the patch coverage (75.48%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10770      +/-   ##
==========================================
- Coverage   72.30%   72.25%   -0.06%     
==========================================
  Files         608      608              
  Lines       39388    39443      +55     
==========================================
+ Hits        28481    28499      +18     
- Misses       9173     9203      +30     
- Partials     1734     1741       +7     


@peirulu
Member

peirulu commented May 20, 2026

/hold

We will need to upgrade the Prism version before updating CAPX to 1.9.x.


Labels

do-not-merge/hold, ok-to-test, size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)


4 participants