Bare Metal Provider
Overview
The Bare Metal Infrastructure Provider enables Immutable Infrastructure on physical servers, with no virtualization layer in between. It composes two long-running components on the global cluster:
- elemental-operator: registers physical hosts, builds installation ISOs (SeedImage), and maintains the long-lived MachineInventory object for every host. elemental-system-agent runs on each host and executes MachineInventory plan secrets.
- cluster-api-provider-baremetal: the Cluster API infrastructure provider. It groups available MachineInventory objects into pools, binds each Machine to a MachineInventory, and writes clean/reprovision plans that drive the host through the Kubernetes node lifecycle.
Unlike VM providers (DCS, vSphere), the bare-metal provider does not create or destroy machines. A host is installed once (live ISO → on-disk OS via elemental install), registers itself as a MachineInventory, and stays in the inventory across the cluster lifecycle. Node "creation" and "deletion" are realized through elemental upgrade driven by plans, followed by reboot and cloud-init re-execution.
Status
The provider currently follows a YAML-only workflow. There is no Fleet Essentials UI for bare-metal clusters yet — every step on this page is driven by kubectl apply.
Key Features
- MachineInventoryPool allocation model: operators pre-declare which MachineInventory objects may back a KubeadmControlPlane or MachineDeployment. The provider picks an Available inventory from that pool when a Machine is created; nothing is provisioned outside the declared set.
- Plan-driven node lifecycle: node attach uses a reprovision plan (write cloud-init, cloud-init clean, elemental upgrade, reboot, cloud-init re-execute, kubeadm init/join). Node detach uses a clean plan (stop kubelet, clear CRI workload, stop containerd). MachineInventory is never deleted by the provider during scaling, upgrade, or cluster deletion.
- Cluster API native object tree: uses upstream Cluster, KubeadmControlPlane, MachineDeployment, Machine. The provider only owns the infrastructure tree (BaremetalCluster, BaremetalMachine, BaremetalMachineTemplate, MachineInventoryPool). There is no custom control-plane CRD.
- Image-catalog driven Kubernetes versions: the provider keeps a cluster-scoped elemental-image-catalog ConfigMap that maps Machine.spec.version to an elemental upgrade image. Upgrades become "patch the version, controller resolves the matching image, reprovisions the node."
- Control-plane HA via alive: BaremetalCluster.spec.controlPlaneLoadBalancer declares a control-plane VIP and VRID; the provider deploys the alive chart (keepalived + IPVS + kube-lock Lease arbitration) onto the control-plane nodes once the workload cluster is reachable.
- Network identity preservation: elemental-register reports the live-ISO observed network as MachineInventory.spec.observedNetwork. The first elemental install and every later reprovision plan replay that snapshot as cloud-init network-config v2, so a host keeps its address, default route, and DNS across the entire lifecycle.
Differences from VM-Based Providers
Concepts and Terminology
Object hierarchy
elemental-operator owns the left column (host registration and the long-lived MachineInventory). cluster-api-provider-baremetal owns the right column (the Cluster API infrastructure tree) and only references MachineInventory by name.
Bare-metal concepts
MachineRegistration
Declares the registration endpoint and first-install cloud-config that elemental-register consumes on the live ISO. Operators set machineName, machineInventoryLabels, and machineInventoryAnnotations (with ${SMBIOS/...} templating) plus the elemental.install and elemental.registration blocks. MachineRegistration is queried but not modified by the bare-metal provider — it belongs to the elemental layer.
SeedImage
Triggers elemental-operator to build a bootable ISO that contains the registration URL and the operator's TLS material baked into it. spec.baseImage must reference the ISO variant of the OS image that matches the target Kubernetes version (the repository name carries the -iso suffix; the tag/digest matches the elemental-image-catalog entry for the version you intend to install). The ISO is booted once per physical host; on boot, the host runs elemental-register followed by elemental install and creates a MachineInventory.
MachineInventory
The long-lived host identity object. The provider only relies on the following parts of its contract:
- status.plan.secretRef: the single default plan secret owned by elemental-operator.
- status.plan.state: Applied/Failed transitions used to drive the BaremetalMachine state machine.
- status.conditions: host-side readiness signals.
- spec.observedNetwork: fork-only field populated by elemental-register from the live-ISO NICs; replayed during install and during every reprovision plan.
The provider never deletes a MachineInventory, never uses MachineInventorySelector, and does not run a separate MachineInventoryLifecycleController — that lifecycle remains with elemental-operator.
MachineInventoryPool
Operator-authored set of MachineInventory names that a given BaremetalMachineTemplate is allowed to draw from. Pools are scoped to a single clusterName, and each MachineInventory belongs to at most one active pool at a time. The pool reconciler aggregates the pool-wide capacity counters used everywhere in the docs:
- available: free for allocation
- allocated: bound to an active BaremetalMachine
- preparing: running a clean plan
- reprovisioning: running a reprovision plan
- unavailable: Ready=False, plan failed, missing plan secret, or not present in the cluster
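The counters above can be pictured as a classification over the pool's members. The following is a minimal sketch, not the provider's reconciler: the Inventory dataclass, its field names, and the precedence between states (unavailable first, then a running plan, then binding) are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

STATES = ("available", "allocated", "preparing", "reprovisioning", "unavailable")


@dataclass
class Inventory:
    """Hypothetical flattened view of one MachineInventory as a pool sees it."""
    name: str
    present: bool = True           # the object exists in the cluster
    ready: bool = True             # Ready condition is True
    has_plan_secret: bool = True   # the default plan secret exists
    plan_failed: bool = False      # the last plan reported Failed
    bound: bool = False            # bound to an active BaremetalMachine
    plan_type: Optional[str] = None  # "clean", "reprovision", or None


def classify(inv: Inventory) -> str:
    # Assumed precedence: any health problem wins over everything else.
    if not inv.present or not inv.ready or inv.plan_failed or not inv.has_plan_secret:
        return "unavailable"
    if inv.plan_type == "clean":
        return "preparing"
    if inv.plan_type == "reprovision":
        return "reprovisioning"
    if inv.bound:
        return "allocated"
    return "available"


def pool_counters(inventories) -> dict:
    counts = Counter(classify(inv) for inv in inventories)
    # Always report every counter, including the ones that are zero.
    return {state: counts.get(state, 0) for state in STATES}
```

Reporting all five keys even when a counter is zero keeps the output shape stable for whatever consumes it.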
BaremetalCluster
The Cluster API infrastructure cluster resource. Owns controlPlaneLoadBalancer (the VIP and vrid consumed by the alive chart) and controlPlaneEndpoint (backfilled from controlPlaneLoadBalancer when only the VIP is set). The reconciler defers cluster-addon deployment (alive, kube-ovn) until the workload control plane is reachable.
BaremetalMachineTemplate
The Cluster API infrastructure template referenced by KubeadmControlPlane.spec.machineTemplate.infrastructureRef and by MachineDeployment.spec.template.spec.infrastructureRef. Templates only carry machineInventoryPoolRef (which pool this machine group draws from) and allocationPolicy (Ordered is currently the only supported value — picks the first Available inventory in declaration order).
There is deliberately no version, role, or upgradeImage on the template. Role comes from the owning Cluster API resource; the version comes from Machine.spec.version; the upgrade image is resolved at reprovision time from the global image catalog.
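As a rough illustration of the Ordered policy, the sketch below walks the pool's declared names in order and returns the first one whose state is available. The function name and the plain name-to-state mapping are hypothetical; the real controller works on MachineInventory objects rather than strings.

```python
from typing import Mapping, Optional, Sequence


def pick_ordered(pool_names: Sequence[str], states: Mapping[str, str]) -> Optional[str]:
    """Return the first available inventory in declaration order, or None."""
    for name in pool_names:
        # A name missing from `states` is treated as not allocatable (assumption).
        if states.get(name) == "available":
            return name
    return None
```

For example, with pool order host-01, host-02, host-03 and host-01 already allocated, the call returns host-02 even if host-03 is also free.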
BaremetalMachine
The Cluster API infrastructure machine. Reconciles a single Machine against a single MachineInventory:
- Picks an Available inventory from the pool referenced by the owning template.
- Reads the owning Machine.spec.bootstrap.dataSecretName and resolves the elemental upgrade image for Machine.spec.version from the image catalog.
- Normalizes the bootstrap user-data (hostname, kubelet provider-id, criSocket) and writes the reprovision plan into the MachineInventory plan secret.
- Watches MachineInventory.status.plan.state until the plan reports Applied, then sets BaremetalMachine.status.providerID = baremetal:///<inventory-name>.
- On deletion, writes a clean plan and clears the owner annotations once the plan applies, returning the inventory to the pool.
Phase transitions: Pending → Allocated → Reprovisioning → Running; deletion: Running → Preparing → Deleted; failure: * → Failed.
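The phase transitions and the provider ID format can be written down as a small table. This is a sketch for reading the state machine, not the controller's code; the helper names are hypothetical, and "* → Failed" is taken literally as "any phase may move to Failed".

```python
# Forward transitions listed in the text above.
ALLOWED = {
    "Pending": {"Allocated"},
    "Allocated": {"Reprovisioning"},
    "Reprovisioning": {"Running"},
    "Running": {"Preparing"},
    "Preparing": {"Deleted"},
}


def can_transition(current: str, target: str) -> bool:
    """True when the documented phase graph permits current -> target."""
    if target == "Failed":
        # "* -> Failed": read literally, every phase may fail.
        return True
    return target in ALLOWED.get(current, set())


def provider_id(inventory_name: str) -> str:
    """Build the provider ID from the bound MachineInventory name."""
    return f"baremetal:///{inventory_name}"
```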
Image catalog
A cluster-scoped ConfigMap (default name elemental-image-catalog in cpaas-system) that maps Machine.spec.version to an elemental upgrade image. The bare-metal provider chart renders this ConfigMap from provider.imageCatalog.images (global.registry.address is prepended to each repository) and from provider.imageCatalog.data (for fully-qualified overrides such as digest-pinned images). The reconciler hot-reloads the ConfigMap; a missing key is a terminal Failed state, not a fallback to a default image.
The image catalog also drives SeedImage.spec.baseImage: the ISO variant is the same repository with -iso appended and the same tag/digest as the catalog entry.
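The two lookups described here, version to upgrade image and upgrade image to ISO variant, might be sketched as follows. The function names are hypothetical, as is any registry or repository path passed in; the only rules taken from the text are that a missing key is an error rather than a fallback, and that the ISO image is the same repository with -iso appended and the same tag or digest.

```python
def resolve_upgrade_image(catalog: dict, version: str) -> str:
    """Look up the elemental upgrade image for a Kubernetes version."""
    try:
        return catalog[version]
    except KeyError:
        # No default image: the caller is expected to treat this as terminal.
        raise LookupError(f"no image catalog entry for {version}") from None


def iso_base_image(upgrade_image: str) -> str:
    """Derive the ISO variant: same repository plus '-iso', same tag/digest."""
    name, at, digest = upgrade_image.partition("@")
    slash = name.rfind("/")
    colon = name.rfind(":")
    # A colon only marks a tag when it comes after the last slash;
    # otherwise it belongs to a registry host:port.
    if colon > slash:
        repo, tag = name[:colon], name[colon:]
    else:
        repo, tag = name, ""
    return f"{repo}-iso{tag}{at}{digest}"
```

Splitting on the last colon after the last slash keeps a registry port such as :5000 from being mistaken for a tag.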
clean and reprovision plans
The only two plan types the provider writes into MachineInventory.spec.plan (annotated with baremetal.alauda.io/plan.type=clean|reprovision):
- reprovision: runs on node attach. Writes NoCloud user-data/meta-data (and, when MachineInventory.spec.observedNetwork is non-empty, network-config v2), writes a cleanup marker, runs cloud-init clean --logs --seed, runs elemental upgrade --reboot=false --system <image>, and triggers a delayed reboot. After reboot, initramfs clears /var/lib/kubelet, /var/lib/containerd, /var/lib/etcd, /etc/kubernetes; cloud-init re-runs and performs kubeadm init/kubeadm join.
- clean: runs on node detach. Stops kubelet, clears CRI workload, stops containerd. It explicitly does not run kubeadm reset, cloud-init clean, or elemental upgrade, and it does not reboot. Real cleanup is deferred to the reprovision plan that runs when the host is re-allocated.
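To make the ordering of the two plans easier to compare, the sketch below lists their pre-reboot steps as plain strings. It is only a reading aid under stated assumptions: the function and constant names are hypothetical, the real plans are written into a plan secret in a format not shown on this page, and steps without a command in the text (cleanup marker, delayed reboot) are kept as descriptive labels.

```python
def reprovision_steps(image: str, has_observed_network: bool) -> list:
    """Ordered pre-reboot steps of a reprovision plan, as described above."""
    seed_files = ["user-data", "meta-data"]
    if has_observed_network:
        # network-config is only written when observedNetwork is non-empty.
        seed_files.append("network-config")
    return [
        "write NoCloud seed: " + ", ".join(seed_files),
        "write cleanup marker",
        "cloud-init clean --logs --seed",
        f"elemental upgrade --reboot=false --system {image}",
        "trigger delayed reboot",
    ]


# The clean plan is deliberately short: no kubeadm reset, no cloud-init clean,
# no elemental upgrade, and no reboot.
CLEAN_STEPS = ["stop kubelet", "clear CRI workload", "stop containerd"]
```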
The provider applies the plans through the upstream "single default plan secret" semantics; it does not use MachineInventorySelector or FleetBundle.
alive (control-plane HA)
alive is a set of static pods (keepalived + IPVS) plus a kube-lock Lease arbitrator that maintain the control-plane VIP described in BaremetalCluster.spec.controlPlaneLoadBalancer. The provider deploys alive as an AppRelease on the workload cluster once the first control-plane Node is Ready, and re-renders it whenever the control-plane membership changes. During the very first kubeadm init, the provider prepends a one-off ip addr add <vip>/32 dev eth0 command to the bootstrap so that the first node holds the VIP until alive takes over.
The VIP must live in the control-plane nodes' Layer-2 broadcast domain; vrid must be unique within that domain.
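A pre-flight check for these two constraints could look like the sketch below. It is an approximation, not something the provider is documented to run: sharing an IP subnet with a control-plane node is used as a stand-in for "same Layer-2 broadcast domain", the 1 to 255 range is the VRRP virtual router ID range, and the function name and the vrids_in_use input are hypothetical.

```python
import ipaddress


def validate_vip(vip: str, node_cidr: str, vrid: int, vrids_in_use: set) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # strict=False accepts a node address with prefix, e.g. 10.0.0.11/24.
    network = ipaddress.ip_network(node_cidr, strict=False)
    if ipaddress.ip_address(vip) not in network:
        problems.append(f"VIP {vip} is outside node subnet {network}")
    if not 1 <= vrid <= 255:
        problems.append(f"vrid {vrid} is outside the VRRP range 1-255")
    elif vrid in vrids_in_use:
        problems.append(f"vrid {vrid} is already used in this broadcast domain")
    return problems
```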
MachineInventory.spec.observedNetwork
Fork-only field populated by elemental-register from the live-ISO NIC state and reported back via the MsgObservedNetworkConfig registration message. It is consumed in two places:
- During the first elemental install, when MachineInventory.spec.network is empty, the registration server falls back to translating observedNetwork into an nmconnections NetworkConfig so the on-disk OS keeps the same address that the live ISO had.
- During every reprovision, the provider translates observedNetwork into cloud-init network-config v2 (a netplan subset) and writes it as the third NoCloud seed file.
Explicit spec.network takes precedence in both cases.
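The reprovision-time translation might be sketched as below. The schema of observedNetwork is not given on this page, so the input shape used here (an interfaces list with name, addresses, gateway, and nameservers) is purely hypothetical, as are the function names; only the output follows the cloud-init network-config v2 layout, and the default route is written for IPv4 only.

```python
from typing import Optional


def to_network_config_v2(observed: dict) -> dict:
    """Translate a (hypothetical) observedNetwork snapshot to network-config v2."""
    ethernets = {}
    for nic in observed.get("interfaces", []):
        entry = {"dhcp4": False, "addresses": list(nic.get("addresses", []))}
        if nic.get("gateway"):
            # IPv4 default route; an IPv6 gateway would need "::/0" instead.
            entry["routes"] = [{"to": "0.0.0.0/0", "via": nic["gateway"]}]
        if nic.get("nameservers"):
            entry["nameservers"] = {"addresses": list(nic["nameservers"])}
        ethernets[nic["name"]] = entry
    return {"version": 2, "ethernets": ethernets}


def network_source(spec_network: Optional[dict], observed: Optional[dict]) -> Optional[str]:
    """Explicit spec.network wins; observedNetwork is only the fallback."""
    if spec_network:
        return "spec.network"
    if observed:
        return "spec.observedNetwork"
    return None
```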
API Group
All bare-metal infrastructure resources belong to infrastructure.cluster.x-k8s.io/v1beta1. Elemental resources belong to elemental.cattle.io/v1beta1.
Supported Kubernetes Versions
The bare-metal provider supports the Kubernetes versions listed in its elemental-image-catalog. The default chart values ship two entries (the v1.33.7-2 and v1.34.5 baremetal-base-image releases); additional versions are introduced by appending entries to provider.imageCatalog.images (renders to global.registry.address/<repository>:<tag>) or by overriding with a full image reference under provider.imageCatalog.data. See OS Support Matrix for the matching component versions.
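How the two chart values combine into the catalog can be illustrated with a short sketch. The value shapes are assumptions (images as a version to repository/tag mapping, data as a version to full reference mapping), the function name is hypothetical, and letting data win on a conflicting version is an interpretation of "overriding"; the repository path and digest in the usage example are made up.

```python
def render_catalog(registry: str, images: dict, data: dict) -> dict:
    """Render version -> image reference from the two chart inputs."""
    catalog = {
        version: f"{registry}/{repository}:{tag}"
        for version, (repository, tag) in images.items()
    }
    # Fully-qualified references are taken as-is and replace rendered entries.
    catalog.update(data)
    return catalog


# Usage with made-up values:
example = render_catalog(
    "registry.example.com",
    {"v1.34.5": ("os/baremetal-base-image", "v1.34.5")},
    {"v1.33.7-2": "mirror.example.com/os/baremetal-base-image@sha256:0123abcd"},
)
```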
Requirements
- Physical hosts (or PXE-bootable VMs for lab use) with BIOS or UEFI access to mount the SeedImage ISO.
- A platform registry reachable from each host (used by both elemental install and elemental upgrade). Set global.registry.address and, when the registry is self-signed, leave global.registry.tlsVerify=false (chart default).
- A control-plane VIP, free port (typically 6443), and a vrid unique within the control-plane Layer-2 broadcast domain.
- One MachineInventoryPool per role (control plane, worker) sized to at least the target replica count plus headroom for upgrades.
- TPM available on production hosts. Lab and PoC hosts can keep MachineRegistration.spec.config.elemental.registration.emulate-tpm: true to bypass real TPM hardware.
Documentation
For detailed instructions on using the bare-metal provider, see: