Upgrading Kubernetes on Bare Metal

This guide explains how to complete Phase 2 of the upgrade workflow for clusters on bare metal. Before you upgrade Kubernetes, complete the Distribution Version upgrade described in Upgrading Clusters.

INFO

Where this page fits in the full ACP upgrade flow

This page covers only the Kubernetes step of the upgrade. The full ACP upgrade flow (upgrade artifact synchronization, ACP Core upgrade through CVO, Aligned plugin upgrades, and Agnostic plugin upgrades from Marketplace) is documented in the ACP product documentation. Complete those steps before you start the Kubernetes step on this page.

Use this page when the same cluster runs on a physical-host immutable operating system, because the Kubernetes step on bare-metal replaces every node from a new elemental upgrade image rather than upgrading binaries in place.

Key Considerations

Bare-metal upgrades replace every node — they do not run kubeadm upgrade on the existing OS. The mechanism is:

  1. CAPI deletes one Machine according to the rollout strategy.
  2. The provider writes a clean plan to the inventory's plan secret.
  3. The host stops kubelet + CRI workload + containerd, then returns to Available in the same pool.
  4. CAPI creates a replacement Machine.
  5. The provider picks an Available inventory from the same pool, resolves the new Kubernetes version against elemental-image-catalog, and writes a reprovision plan.
  6. The host runs cloud-init clean → elemental upgrade --system <new-image> → reboot. initramfs clears Kubernetes persistent state; cloud-init re-executes and runs kubeadm init / kubeadm join against the new control plane.

Two structural consequences operators must internalize before starting:

  • Bare-metal does not preserve Kubernetes-managed disk state. /var/lib/kubelet, /var/lib/containerd, /var/lib/etcd, and /etc/kubernetes are cleared by the initramfs cleanup step of every reprovision. There is currently no equivalent of the DCS provider's pool-managed persistent disks.
  • The same MachineInventory might not be re-picked. The provider does not guarantee that the inventory released by a clean plan is the one allocated to the replacement Machine. With maxSurge=0, replacements are performed as delete-then-add, so old and new nodes never coexist during the rollout and pool capacity only needs to cover the desired replica count.

Upgrade Sequence

Upgrade bare-metal clusters in the following order:

  1. (Prerequisite) Upgrade the ACP platform on the global cluster. This brings the bare-metal provider, elemental-operator, and the related CAPI components to versions that understand the new schema. Trigger workload-cluster upgrades only after the management-side controllers have rolled out and become Ready.
  2. Upgrade the Distribution Version (Aligned Extensions) on the workload cluster. See Upgrading Distribution Version.
  3. Ensure the target Kubernetes version is present in elemental-image-catalog (and the matching -iso repository is available for any future host re-registration).
  4. Upgrade the control plane Kubernetes version (replaces all control-plane nodes one at a time).
  5. Upgrade worker nodes to the target Kubernetes version (replaces all worker nodes within the maxUnavailable budget).

Cluster API orchestrates the rolling replacement.

WARNING

Skipping step 1 risks two failure modes: the old provider silently ignores new schema fields; or a controller image swap mid-rollout interrupts the plan secret state machine. Always settle the management-side upgrade before touching workload rollout.

Prerequisites

Before you start, ensure all of the following prerequisites are met:

  • The Distribution Version upgrade is complete.
  • The control plane is reachable through the existing <control-plane-vip>:<control-plane-port>.
  • All current nodes are healthy and Ready.
  • The target Kubernetes version is a key in elemental-image-catalog. If it is not, add it before starting (see Update the Image Catalog).
  • The platform registry is reachable from every host in the cluster.
  • Both KubeadmControlPlane.spec.rolloutStrategy.rollingUpdate.maxSurge = 0 and MachineDeployment.spec.strategy.rollingUpdate.maxSurge = 0 are set — bare-metal does not over-provision physical hosts.
  • The relevant pools have enough capacity to replace one node at a time without falling below the desired replica count.
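The maxSurge prerequisite can be checked from the command line before any version field is touched. The following is a minimal sketch, not part of the product tooling: the function name check_max_surge and its three arguments (namespace, KubeadmControlPlane name, MachineDeployment name) are illustrative, and the field paths are the ones named in the list above.

```shell
# Pre-flight sketch: report any object whose maxSurge is not 0.
# check_max_surge and its arguments are hypothetical names used for illustration.
check_max_surge() {
  ns="$1"; kcp="$2"; md="$3"
  # Read maxSurge from the control plane and from one MachineDeployment.
  kcp_surge=$(kubectl -n "$ns" get kubeadmcontrolplane "$kcp" \
    -o jsonpath='{.spec.rolloutStrategy.rollingUpdate.maxSurge}') || return 2
  md_surge=$(kubectl -n "$ns" get machinedeployment "$md" \
    -o jsonpath='{.spec.strategy.rollingUpdate.maxSurge}') || return 2
  rc=0
  [ "$kcp_surge" = "0" ] || { echo "KubeadmControlPlane/$kcp maxSurge=${kcp_surge:-unset}, expected 0"; rc=1; }
  [ "$md_surge" = "0" ] || { echo "MachineDeployment/$md maxSurge=${md_surge:-unset}, expected 0"; rc=1; }
  return "$rc"
}
```

A call such as check_max_surge cpaas-system <cluster-name>-control-plane <cluster-name>-workers returns 0 only when both values are 0; repeat it for each MachineDeployment in the cluster.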

Update the Image Catalog

elemental-image-catalog is the resource that introduces a Kubernetes version to the bare-metal provider.

INFO

In most upgrades you do not edit this ConfigMap by hand. The bare-metal provider plugin re-renders elemental-image-catalog with the Kubernetes versions shipped by the new distribution when it is reapplied during the Distribution Version upgrade (Phase 1). Your task is normally just to verify that the target version is present (see the verification step below). The two options that follow are only needed when you must add an out-of-band version — for example a digest-pinned or test build that the plugin does not ship.

Option A — Add a chart override. Append the new version under provider.imageCatalog.images (uses global.registry.address as the registry):

provider:
  imageCatalog:
    images:
      v1.33.7-2:
        repository: tkestack/baremetal-base-image
        tag: v0.0.0-beta-1.33.7-2
      <new-kubernetes-version>:                # for example, v1.34.5
        repository: tkestack/baremetal-base-image
        tag: <new-tag>                         # for example, v0.0.0-dev.0-1.34.5

Reapply the bare-metal provider plugin. The chart re-renders the ConfigMap; the provider's in-process watch hot-reloads the cache without restarting.

Option B — Patch the ConfigMap directly. Useful for digest-pinned images or out-of-band test versions:

kubectl -n cpaas-system patch configmap elemental-image-catalog \
  --type='json' \
  -p='[{"op":"add","path":"/data/<new-kubernetes-version>","value":"<registry-address>/tkestack/baremetal-base-image:<new-tag>"}]'

In either case, verify before continuing:

kubectl -n cpaas-system get configmap elemental-image-catalog -o yaml

The key must include the leading v (for example v1.34.5). If the key is missing when CAPI creates the replacement Machine, the resulting BaremetalMachine enters Failed with ImageResolved=False / Reason=ImageCatalogMiss and no reprovision plan is written. Adding the key later restarts the reconciliation automatically.
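Reading the whole ConfigMap by eye works, but a scripted lookup makes the check repeatable. The sketch below assumes the catalog stores one key per version under .data with the image reference as the value, as the patch example in Option B suggests; the function name catalog_image is illustrative.

```shell
# Sketch: print the image mapped to one catalog key, or fail if the key is absent.
# catalog_image is a hypothetical helper name.
catalog_image() {
  version="$1"
  # The catalog key must carry the leading v.
  case "$version" in
    v[0-9]*) ;;
    *) echo "catalog keys carry a leading v, got: $version" >&2; return 2 ;;
  esac
  # Emit every entry as key=value, one per line, then match the key exactly.
  kubectl -n cpaas-system get configmap elemental-image-catalog \
    -o go-template='{{range $k, $v := .data}}{{$k}}={{$v}}{{"\n"}}{{end}}' |
    awk -v key="$version" '
      index($0, key "=") == 1 { print substr($0, length(key) + 2); found = 1 }
      END { exit found ? 0 : 1 }'
}
```

Running catalog_image <new-kubernetes-version> prints the resolved image and exits 0 when the key exists; a non-zero exit means the key still has to be added.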

When the new version corresponds to a new MicroOS / base-image release, the ISO variant for that release should also be available before the upgrade. The bare-metal provider does not rebuild SeedImages automatically, but you may want to refresh MachineRegistration / SeedImage for future host onboarding so the new hosts come up on the new release. The ISO repository is the same as the base-image repository with -iso appended.

Upgrade the Control Plane

Set KubeadmControlPlane.spec.version to the new Kubernetes version. Where component image tags are pinned (DNS, etcd), update them in the same edit:

kubectl -n cpaas-system edit kubeadmcontrolplane <cluster-name>-control-plane
  • spec.version ← target Kubernetes version (must match an elemental-image-catalog key).
  • spec.kubeadmConfigSpec.clusterConfiguration.dns.imageTag ← matching CoreDNS image tag for the new release.
  • spec.kubeadmConfigSpec.clusterConfiguration.etcd.local.imageTag ← matching etcd image tag for the new release.
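Where an interactive edit is inconvenient (for example in a pipeline), the same three fields can be set with a single merge patch. This is a sketch under the assumption that a JSON merge patch is acceptable in your environment; the helper name kcp_version_patch is illustrative, and the tag values you pass must come from the release notes of the target version.

```shell
# Sketch: build a JSON merge patch for the three fields listed above.
# kcp_version_patch is a hypothetical helper; it only prints the payload.
kcp_version_patch() {
  version="$1"; dns_tag="$2"; etcd_tag="$3"
  printf '{"spec":{"version":"%s","kubeadmConfigSpec":{"clusterConfiguration":{"dns":{"imageTag":"%s"},"etcd":{"local":{"imageTag":"%s"}}}}}}\n' \
    "$version" "$dns_tag" "$etcd_tag"
}

# Example use (not executed here):
#   kubectl -n cpaas-system patch kubeadmcontrolplane <cluster-name>-control-plane \
#     --type=merge -p "$(kcp_version_patch <new-kubernetes-version> <coredns-tag> <etcd-tag>)"
```

Note that a merge patch adds the imageTag fields if they were not pinned before, so use this form only when the tags are already pinned; otherwise patch spec.version alone.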

The bare-metal provider does not require a new BaremetalMachineTemplate for a Kubernetes-only upgrade: the template only carries the pool reference, and the upgrade image is resolved from Machine.spec.version through the catalog. A new template is only required when you want to move the control plane to a different pool.

Because maxSurge=0, Cluster API replaces control-plane nodes one at a time. Watch the rollout with:

kubectl -n cpaas-system get kubeadmcontrolplane <cluster-name>-control-plane -w
kubectl -n cpaas-system get baremetalmachines.infrastructure.cluster.x-k8s.io
kubectl -n cpaas-system get machineinventorypools.infrastructure.cluster.x-k8s.io <cluster-name>-control-plane-pool
kubectl get nodes -o wide                                                    # workload cluster

The expected sequence on each node:

  1. CAPI marks one old Machine for deletion; the corresponding BaremetalMachine moves to Preparing.
  2. The provider writes a clean plan; the inventory ends up Available with cleared owner annotations.
  3. CAPI creates a new Machine (with the target version); a new BaremetalMachine is created.
  4. The provider allocates an Available inventory from the control-plane pool, resolves the new image, and writes a reprovision plan.
  5. The host runs elemental upgrade --system <new-image>, reboots, and runs kubeadm join against the surviving control plane. BaremetalMachine.status.phase becomes Running.

Repeat steps 1–5 until every control-plane node has been replaced.

Throughout the rollout, alive continues to manage the VIP. As control-plane membership changes, the bare-metal provider re-renders the alive chart values (the peer list and ipvs.ips) and rolls out the static-pod manifests.

Upgrade Workers

Patch each MachineDeployment.spec.template.spec.version to the new Kubernetes version. CAPI replaces workers within the maxUnavailable budget — the bare-metal provider re-uses the same worker pool and the same image-catalog resolution logic as the control plane.

kubectl -n cpaas-system patch machinedeployment <cluster-name>-workers \
  --type='merge' \
  -p='{"spec":{"template":{"spec":{"version":"<new-kubernetes-version>"}}}}'
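When a cluster has several MachineDeployments, the patch can be applied to each one in a loop. The sketch below assumes the MachineDeployments carry the standard Cluster API label cluster.x-k8s.io/cluster-name; the function name upgrade_workers is illustrative.

```shell
# Sketch: apply the same version patch to every MachineDeployment of one cluster.
# upgrade_workers is a hypothetical helper name.
upgrade_workers() {
  cluster="$1"; version="$2"
  patch=$(printf '{"spec":{"template":{"spec":{"version":"%s"}}}}' "$version")
  # -o name yields one TYPE/NAME reference per line, which kubectl patch accepts.
  kubectl -n cpaas-system get machinedeployments.cluster.x-k8s.io \
    -l "cluster.x-k8s.io/cluster-name=$cluster" -o name |
  while read -r md; do
    # Stop at the first failed patch so a partial rollout is noticed.
    kubectl -n cpaas-system patch "$md" --type=merge -p "$patch" </dev/null || exit 1
  done
}
```

Patching all MachineDeployments at once starts their rollouts in parallel, each within its own maxUnavailable budget. Patch them one by one instead if the worker pool has little spare capacity.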

Watch:

kubectl -n cpaas-system get machinedeployments.cluster.x-k8s.io -w
kubectl -n cpaas-system get baremetalmachines.infrastructure.cluster.x-k8s.io
kubectl get nodes -o wide

Worker upgrades carry fewer knobs than the control plane: there is no clusterConfiguration block on MachineDeployment, so no DNS / etcd tags to update. Kubernetes component versions on worker nodes follow the new base image.

If the new release also requires a bootstrap-template change (different cloud-init, different kubeletExtraArgs), follow Updating Bootstrap Templates — that is a separate template swap, independent of the version bump.

Cross-Version Upgrades

Kubernetes only supports single-minor upgrades for the control plane (skew policy). To move from v1.32 → v1.34:

  1. Add the v1.33 entry to elemental-image-catalog and upgrade the control plane to v1.33; wait for the rollout to complete.
  2. Add the v1.34 entry; upgrade the control plane to v1.34.
  3. Upgrade the worker MachineDeployment to v1.34.

The bare-metal rollout strategy already serializes node replacement, so cross-version upgrades do not require additional pool capacity beyond what was already needed for a same-minor upgrade.
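The intermediate control-plane hops for a multi-minor upgrade can be listed mechanically. The following sketch only computes minor versions; the helper name minor_path is illustrative, and you still have to choose a concrete patch release and catalog entry for each hop.

```shell
# Sketch: print each minor version the control plane must pass through.
# minor_path is a hypothetical helper; it accepts vMAJOR.MINOR or vMAJOR.MINOR.PATCH.
minor_path() {
  from="${1#v}"; to="${2#v}"
  major="${from%%.*}"
  [ "$major" = "${to%%.*}" ] || { echo "major versions differ" >&2; return 2; }
  # Extract the minor component from each version.
  from_minor="${from#*.}"; from_minor="${from_minor%%.*}"
  to_minor="${to#*.}"; to_minor="${to_minor%%.*}"
  next=$((from_minor + 1))
  while [ "$next" -le "$to_minor" ]; do
    echo "v$major.$next"
    next=$((next + 1))
  done
}
```

For the example above, minor_path v1.32 v1.34 prints v1.33 and then v1.34, one per line; a same-minor upgrade prints nothing because no intermediate hop is needed.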

Verification

After the rollout completes:

kubectl -n cpaas-system get baremetalmachines.infrastructure.cluster.x-k8s.io
kubectl -n cpaas-system get machines.cluster.x-k8s.io
kubectl get nodes -o wide

All BaremetalMachine objects should be Running. Every CAPI Machine should report the new spec.version. Every workload Node should be Ready with the new kubelet version, and every MachineInventory plan secret should carry baremetal.alauda.io/plan.type=reprovision (the most recent plan applied).
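The kubelet check can be scripted so that stragglers stand out. This is a minimal sketch: the helper name stale_nodes is illustrative, and the expected value must be the kubelet version string exactly as the Node reports it, which may differ from the catalog key if the key carries a build suffix.

```shell
# Sketch: read "name<TAB>kubeletVersion" lines on stdin and list nodes that
# do not report the expected kubelet version. stale_nodes is a hypothetical helper.
stale_nodes() {
  awk -F '\t' -v want="$1" '
    $2 != want { print $1 " " $2; stale = 1 }
    END { exit stale ? 1 : 0 }'
}

# Example use against the workload cluster (not executed here):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | stale_nodes <expected-kubelet-version>
```

The function exits 0 and prints nothing when every node matches; otherwise it prints each outdated node with its current version and exits 1.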

MachineInventoryPool.status should satisfy available + allocated + preparing + reprovisioning + unavailable = total with preparing = reprovisioning = 0.
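The pool invariant above is simple arithmetic and can be checked with a small helper. The sketch below takes the six counters as arguments; the function name pool_settled is illustrative, and how you extract the counters (for example with a jsonpath query over .status) depends on the exact field layout of MachineInventoryPool, which is assumed here rather than confirmed.

```shell
# Sketch: verify the pool counters add up and that no replacement is in flight.
# pool_settled is a hypothetical helper. Argument order:
#   available allocated preparing reprovisioning unavailable total
pool_settled() {
  available="$1"; allocated="$2"; preparing="$3"
  reprovisioning="$4"; unavailable="$5"; total="$6"
  sum=$((available + allocated + preparing + reprovisioning + unavailable))
  [ "$sum" -eq "$total" ] || { echo "counter mismatch: sum=$sum total=$total"; return 1; }
  [ "$preparing" -eq 0 ] && [ "$reprovisioning" -eq 0 ] || {
    echo "rollout still in flight: preparing=$preparing reprovisioning=$reprovisioning"
    return 1
  }
  echo "pool settled"
}
```

For a six-host pool with five hosts allocated and one spare, pool_settled 1 5 0 0 0 6 prints "pool settled". If the API omits counters that are zero, substitute 0 for them before calling the helper.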

Troubleshooting

Issue | What to check
Rollout stuck on the first new node | BaremetalMachine.status.conditions[ImageResolved]: confirm the new key is in elemental-image-catalog and that the Cluster carries cpaas.io/registry-address.
Host reboots into a state that never joins the cluster | MachineInventory.status.plan.state (Failed is terminal); inspect the host's serial console for elemental upgrade errors. Verify the platform registry is reachable from the host.
New control-plane node fails kubeadm join | Confirm the VIP is still reachable; alive may have lost the lock. Check kubectl -n kube-system get lease, ipvsadm -Ln, and the keepalived logs from the static pod.
Worker rollout stalls with MachineDeployment.spec.strategy.rollingUpdate.maxSurge > 0 set | Bare-metal pools cannot honour over-provisioning. Set maxSurge=0 and bump maxUnavailable to budget more parallel replacements within the pool.
Pool capacity exhausted mid-rollout | MachineInventoryPool.status.available=0 and allocated still includes the not-yet-replaced nodes. Add another MachineInventory to the pool to free the queue, or wait for in-flight Preparing nodes to drain.

For the full operator-side state machine reference (every condition reason and recovery action), see Provider Overview.


Additional Resources