Migrate Existing Huawei Cloud Stack Clusters to Pool-Managed Persistent Disks
Use this guide when you upgrade an existing Huawei Cloud Stack (HCS) cluster from the older HCSMachineTemplate data-volume layout to the pool-managed persistent-disk model.
In HCS provider v1.0.1 or later, disks that must survive node replacement are declared in HCSMachineConfigPool.spec.configs[].persistentDisks[]. This includes the platform-required /var/cpaas disk.
Version
Use this procedure when the cluster runs ACP v4.3.1 or later and the target HCS provider version is v1.0.1 or later.
TOC
- Overview
- Before You Start
- Inspect the Current Disk Layout
- Decide Which Disks to Preserve
- Create New Machine Templates
- Prepare the Rollout Strategy
- Update the Machine Configuration Pool
- Trigger Rolling Replacement
- Verify the Migration
- Failure Handling
- Limitations and Recovery Notes
- Related Topics
Overview
Older HCS clusters commonly placed /var/cpaas in HCSMachineTemplate.spec.template.spec.dataVolumes[]. That layout creates the data volumes together with the ECS, so they are template-owned. During rolling replacement, the old ECS and its template-owned data volumes may be deleted together.
The pool-managed model moves upgrade-preserved disks into HCSMachineConfigPool.spec.configs[].persistentDisks[]. Each persistent disk is bound to a fixed (hostname, slot) identity. During rolling replacement, the provider:
- Claims the existing EVS disk from the old ECS when it matches the pool declaration.
- Stops the old ECS.
- Detaches the EVS disk and waits until it is available.
- Deletes the old ECS.
- Creates the replacement ECS with the same EVS disk attached before first boot.
- Boots the replacement ECS, which mounts the existing file system without reformatting it.
Before You Start
Verify all of the following before you begin:
- The management cluster has HCS provider v1.0.1 or later.
- The workload cluster is healthy and all nodes are Ready.
- The cluster uses HCSMachineConfigPool to assign fixed hostnames and IP addresses.
- The preserved disks have non-empty mount paths in the old HCSMachineTemplate.spec.template.spec.dataVolumes[].
- The relevant rollout strategies use maxSurge: 0.
- You have a maintenance window for one-by-one node replacement.
- You have a verified backup of etcd and platform configuration.
Do not declare the same mount path in both HCSMachineTemplate.spec.template.spec.dataVolumes[] and HCSMachineConfigPool.spec.configs[].persistentDisks[]. The provider rejects this configuration to prevent data loss.
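The overlap rule above can be checked before anything is applied. The following is a minimal sketch, not the provider's own validation: it assumes the template and the pool have already been loaded into plain Python dictionaries (for example, from exported YAML), and the function name conflicting_mount_paths is hypothetical. Only the field paths named in this guide are used.

```python
def conflicting_mount_paths(template: dict, pool: dict) -> set:
    """Return mount paths declared in both the template and the pool."""
    machine_spec = template.get("spec", {}).get("template", {}).get("spec", {})
    # Paths declared in HCSMachineTemplate.spec.template.spec.dataVolumes[].
    template_paths = {
        volume["mountPath"]
        for volume in machine_spec.get("dataVolumes") or []
        if volume.get("mountPath")
    }
    # Paths declared in HCSMachineConfigPool.spec.configs[].persistentDisks[].
    pool_paths = set()
    for config in pool.get("spec", {}).get("configs") or []:
        for disk in config.get("persistentDisks") or []:
            if disk.get("mountPath"):
                pool_paths.add(disk["mountPath"])
    return template_paths & pool_paths
```

An empty result means the two objects do not declare the same path. A non-empty result lists the paths to remove from one side before you apply the configuration.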
Inspect the Current Disk Layout
Identify the management-cluster objects that control the cluster:
Inspect the current machine templates:
Record every dataVolumes[] entry that must be preserved. For each disk, record the size, type, mountPath, and format, because the pool declaration must match these values.
Decide Which Disks to Preserve
Move only the disks that must survive node replacement to the pool-managed model.
Use the following split:
For automatic migration, the provider claims an existing data volume by matching mountPath. If a preserved disk has no mountPath, automatic claim is not available. Use a supported operational procedure to record the existing EVS volumeID in status.persistentDiskStatus[], or migrate the data to a disk with a declared mount path before you trigger replacement.
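The mountPath rule above can be applied to the recorded dataVolumes[] entries to see which disks qualify for automatic claim. This is a minimal sketch under the assumption that each entry is a plain dictionary. The function name split_preserved_volumes is hypothetical and is not part of the provider.

```python
def split_preserved_volumes(preserved: list) -> tuple:
    """Split preserved data volumes into (auto_claimable, needs_manual_handling)."""
    auto_claimable, manual = [], []
    for volume in preserved:
        # Automatic claim matches on mountPath, so an empty or missing
        # mountPath means the disk needs a manual procedure.
        if (volume.get("mountPath") or "").strip():
            auto_claimable.append(volume)
        else:
            manual.append(volume)
    return auto_claimable, manual
```

Any disk in the second list needs one of the manual options described above before you trigger replacement.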
Create New Machine Templates
Create new HCSMachineTemplate resources for the replacement nodes. Do not edit the existing templates in place.
Copy the current template:
Edit new-template.yaml:
- Set metadata.name to a new template name.
- Remove server-generated metadata, such as resourceVersion, uid, creationTimestamp, managedFields, and status.
- Leave runtime identity fields unset, including spec.template.spec.providerID and spec.template.spec.serverId.
- Remove the preserved paths from spec.template.spec.dataVolumes[].
- Keep only temporary data volumes that may be recreated with each ECS.
- Update spec.template.spec.imageName and other upgrade fields when this migration is part of a Kubernetes or image upgrade.
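The edits listed above are mechanical, so they can be sketched as a small transformation on the exported template. The sketch assumes the template is loaded as a Python dictionary and that status sits at the top level of the object. The function name derive_template is hypothetical. Image and other upgrade fields are left for you to set by hand.

```python
import copy

SERVER_METADATA = ("resourceVersion", "uid", "creationTimestamp", "managedFields")
RUNTIME_IDENTITY = ("providerID", "serverId")


def derive_template(old: dict, new_name: str, preserved_paths: set) -> dict:
    """Return a cleaned copy of an exported template; the input is not modified."""
    new = copy.deepcopy(old)
    metadata = new.setdefault("metadata", {})
    metadata["name"] = new_name
    for key in SERVER_METADATA:
        metadata.pop(key, None)
    new.pop("status", None)  # assumption: status is a top-level field
    machine_spec = new.setdefault("spec", {}).setdefault("template", {}).setdefault("spec", {})
    for key in RUNTIME_IDENTITY:
        machine_spec.pop(key, None)
    # Keep only the temporary volumes; preserved paths move to the pool.
    machine_spec["dataVolumes"] = [
        volume
        for volume in machine_spec.get("dataVolumes") or []
        if volume.get("mountPath") not in preserved_paths
    ]
    return new
```

Working on a deep copy keeps the exported original available for comparison, which matches the advice not to edit existing templates in place.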
For example, after /var/cpaas moves to the pool, the template keeps temporary disks only:
Apply the new template:
Prepare the Rollout Strategy
Before you update the pool, confirm the rollout strategy for each controller that will use the updated pool. Skip the MachineDeployment command if the cluster has no worker pool in this migration.
Each returned value must be 0 for pools that use persistent disks.
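The same check can be scripted against exported controller objects. The sketch below assumes the Cluster API v1beta1 field layout, in which KubeadmControlPlane keeps the value under spec.rolloutStrategy.rollingUpdate.maxSurge and MachineDeployment keeps it under spec.strategy.rollingUpdate.maxSurge. Verify these paths against the API version that your management cluster serves. The function names are hypothetical.

```python
def max_surge(obj: dict):
    """Return the configured maxSurge, or None when it is not set."""
    spec = obj.get("spec", {})
    if obj.get("kind") == "KubeadmControlPlane":
        strategy = spec.get("rolloutStrategy") or {}
    else:  # treated as a MachineDeployment
        strategy = spec.get("strategy") or {}
    return (strategy.get("rollingUpdate") or {}).get("maxSurge")


def surge_is_zero(obj: dict) -> bool:
    """True only when maxSurge is explicitly 0 (integer or string form)."""
    value = max_surge(obj)
    # An unset value is reported as non-zero because a controller
    # default may apply instead of 0.
    return value is not None and str(value) == "0"
```

Treating an unset value as a failure is a deliberately conservative choice: it prompts you to set maxSurge: 0 explicitly rather than rely on a default.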
If any returned value is not 0, patch the affected rollout strategy before you update the pool:
Update the Machine Configuration Pool
Edit the HCSMachineConfigPool that both the old and the new HCSMachineTemplate reference in spec.template.spec.configPoolRef.name.
Add one persistentDisks[] entry under each hostname that must preserve the disk:
Use these rules when you edit the pool:
- Start slot at 0 for each hostname.
- Keep slots contiguous for each hostname.
- Set size, type, mountPath, and format to match the old data volume that you want to claim.
- Add cluster.x-k8s.io/cluster-name: <cluster-name> if the pool does not already have it.
- Keep the same persistent-disk declaration across all nodes that must preserve the same path.
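The slot rules above can be checked on the edited pool before you apply it. This is a minimal sketch that assumes the pool is loaded as a Python dictionary, that each entry in spec.configs[] carries its hostname under a key named hostname (an assumption, since this guide does not show the exact key), and that every disk has an integer slot. The function name slot_errors is hypothetical.

```python
def slot_errors(pool: dict) -> list:
    """Return one message per hostname whose slots are not 0, 1, 2, ... in sequence."""
    errors = []
    for config in pool.get("spec", {}).get("configs") or []:
        hostname = config.get("hostname", "<unknown>")  # key name is an assumption
        slots = sorted(disk["slot"] for disk in config.get("persistentDisks") or [])
        # Comparing with range() catches a non-zero start, gaps, and duplicates.
        if slots != list(range(len(slots))):
            errors.append(f"{hostname}: slots {slots} must start at 0 and be contiguous")
    return errors
```

An empty list means every hostname passes the slot rules. The check does not compare size, type, mountPath, or format with the old data volume.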
Apply the updated pool only after the replacement template exists, has removed the preserved paths from dataVolumes[], and the rollout strategy uses maxSurge: 0:
Trigger Rolling Replacement
After you apply the updated pool, immediately point the control plane or worker controller to the new template in the same maintenance window. Do not leave a pool that declares /var/cpaas persistent disks while the active rollout still points to an old template that also declares /var/cpaas in dataVolumes[].
For a control plane migration, point the KubeadmControlPlane to the new template:
For a worker migration, point the MachineDeployment to the new template:
If the template reference already points to the target template and you need to force a one-by-one replacement, set rolloutAfter:
Verify the Migration
Watch the rolling replacement:
Inspect the pool status:
Each migrated disk appears under status.persistentDiskStatus:
Confirm the replacement node can read data from the preserved path:
For a stronger data-retention check, write a marker before the rollout and read it after the replacement node becomes Ready:
Failure Handling
Use the pool status to decide the next action when a disk enters phase: Error.
Do not delete the old HCSMachine or force-remove finalizers while a persistent disk is in an unresolved error state. The provider blocks deletion to avoid deleting the old ECS before it can safely claim and detach the disk.
Limitations and Recovery Notes
- This procedure applies to clusters that use HCSMachineConfigPool with fixed hostnames and IP addresses.
- Pool-managed persistent disks require one-by-one replacement. Keep maxSurge: 0 for each control plane or worker rollout that uses persistent disks.
- The provider automatically claims existing data volumes by matching a non-empty mountPath.
- dataVolumes[] entries that are not declared in persistentDisks[] remain template-owned and may be deleted with the old ECS.
- After the provider accepts a persistent disk entry, treat slot, size, type, format, and mountPath as immutable.
- mountOptions can change, but the change takes effect only on a replacement VM.
- Single-control-plane HCS clusters are creation-only topologies in the documented upgrade workflow. Do not use this rolling migration procedure for a single-control-plane cluster.
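The immutability note above can be turned into a review aid for later pool edits. The sketch below compares an accepted persistent-disk entry with a proposed one and reports which of the fields named in this guide would change. It assumes both entries are plain dictionaries. The function name immutable_changes is hypothetical, and the check does not reproduce any validation that the provider itself performs.

```python
IMMUTABLE_FIELDS = ("slot", "size", "type", "format", "mountPath")


def immutable_changes(accepted: dict, proposed: dict) -> list:
    """Return the immutable fields whose values differ between two entries."""
    return [
        field
        for field in IMMUTABLE_FIELDS
        if accepted.get(field) != proposed.get(field)
    ]
```

A change to mountOptions alone returns an empty list, which is consistent with the note that mountOptions can change.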