Managing Nodes on Huawei Cloud Stack

This document explains how to manage worker nodes using Cluster API Machine resources on the Huawei Cloud Stack platform.


Prerequisites

WARNING

Important Prerequisites

The control plane must be deployed before performing node operations. See Create Cluster for setup instructions.
Ensure you have proper access to the HCS platform and required permissions.

When using the YAML examples in this document, replace only values enclosed in <> with environment-specific values. Preserve the remaining fields unless your cluster policy requires a different value.

Overview

Worker nodes are managed through Cluster API Machine resources, providing declarative and automated node lifecycle management. The deployment process involves:

Machine Configuration Pool - Network settings and pool-managed persistent disks for worker nodes
Machine Template - VM specifications
Bootstrap Configuration - Node initialization settings
Machine Deployment - Orchestration of node creation and management

Worker Node Deployment

Before you prepare worker YAML, complete the HCS input checklist in Infrastructure Resources for Huawei Cloud Stack. In particular, list every worker subnet in HCSCluster.spec.network.subnets, allocate worker IPs from planned free IP ranges, and collect the provider-recognized flavorName and availabilityZone API values. If you add a new worker subnet to an existing Ready cluster, patch HCSCluster.spec.network.subnets with the full subnet object instead of adding only the subnet name.

Step 1: Configure Machine Configuration Pool

The HCSMachineConfigPool defines the network configuration and any pool-managed persistent disks for worker node VMs. You must plan and configure the IP addresses, hostnames, persistent disk slots, and other parameters before deployment.

WARNING

Pool Size Requirement

The pool must include at least as many entries as the number of worker nodes you plan to deploy. Insufficient entries will prevent node deployment.
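The pool-size rule can be checked before you create or scale a MachineDeployment. The following is a minimal shell sketch, not part of the provider. The function names pool_has_capacity and count_pool_entries are hypothetical, and counting entries by hostname assumes that every configs[] entry sets hostname.

```shell
# Hypothetical helper: succeed only when the pool has at least as many
# entries as the number of worker nodes you plan to run.
pool_has_capacity() {
  pool_entries=$1
  desired_replicas=$2
  [ "$pool_entries" -ge "$desired_replicas" ]
}

# Hypothetical helper: count configs[] entries by counting their hostnames.
# Needs kubectl access to the cluster that stores the pool resource.
count_pool_entries() {
  kubectl get hcsmachineconfigpool "$1" -n cpaas-system \
    -o jsonpath='{.spec.configs[*].hostname}' | wc -w | tr -d ' '
}
```

For example, `pool_has_capacity "$(count_pool_entries <cluster-name>-worker-pool)" 5` returns a non-zero status when the pool holds fewer than five entries, which is a signal to extend the pool before scaling.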

Use one subnet selector per networks[] entry. For new manifests, set either subnetName or subnetId, but not both. Existing manifests may keep the deprecated subenetName field; if you also add subnetName while updating that manifest, its value must exactly match subenetName. Do not supply conflicting values across subenetName, subnetName, and subnetId.

If you use subnetName for worker nodes, include the same subnet name in the parent HCSCluster.spec.network.subnets list before you create or scale the worker pool. For an existing Ready cluster, append the full subnet object, including the subnet ID, instead of adding only the subnet name.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: HCSMachineConfigPool
metadata:
  name: <cluster-name>-worker-pool
  namespace: cpaas-system
  labels:
    cluster.x-k8s.io/cluster-name: <cluster-name>
spec:
  configs:
    - hostname: <worker-1-hostname>
      networks:
        - subnetName: <subnet-name>
          ipAddress: <worker-1-ip>
      persistentDisks:
        - slot: 0
          size: 100
          type: SSD
          mountPath: /var/cpaas
          format: xfs
    - hostname: <worker-2-hostname>
      networks:
        - subnetName: <subnet-name>
          ipAddress: <worker-2-ip>
      persistentDisks:
        - slot: 0
          size: 100
          type: SSD
          mountPath: /var/cpaas
          format: xfs
    - hostname: <worker-3-hostname>
      networks:
        - subnetName: <subnet-name>
          ipAddress: <worker-3-ip>
      persistentDisks:
        - slot: 0
          size: 100
          type: SSD
          mountPath: /var/cpaas
          format: xfs

Parameter	Type	Required	Description
`.spec.configs[]`	array	Yes	Non-empty list of worker node configurations
`.spec.configs[].hostname`	string	Yes	VM hostname. Use lowercase letters, numbers, hyphens (`-`), or dots (`.`); the value must start and end with a lowercase letter or number and must not exceed 253 characters
`.spec.configs[].networks[]`	array	Yes	Non-empty list of network configurations for the VM
`.spec.configs[].networks[].subnetName`	string	No*	Recommended subnet name field for new manifests
`.spec.configs[].networks[].subnetId`	string	No*	Subnet ID. Use this field instead of `subnetName` when the subnet name is ambiguous
`.spec.configs[].networks[].ipAddress`	string	Yes	Static IP address for the worker VM
`.spec.configs[].persistentDisks[]`	array	No	EVS disks that survive HCSMachine delete-recreate replacement
`.spec.configs[].persistentDisks[].slot`	int	Yes*	Disk slot within one machine configuration. Slots must be unique and contiguous from `0` for the same hostname
`.spec.configs[].persistentDisks[].size`	int	Yes*	EVS disk size in GB. For newly created EVS data disks, use `10` to `32768` GB. Existing claimed disks must match their current size
`.spec.configs[].persistentDisks[].type`	string	Yes*	EVS disk type name available in the target availability zone
`.spec.configs[].persistentDisks[].mountPath`	string	No	Guest mount path. Use `/var/cpaas` for platform state that must survive VM replacement
`.spec.configs[].persistentDisks[].format`	string	No	File system format. If omitted, the provider uses `xfs`
`.spec.configs[].persistentDisks[].mountOptions`	array	No	Mount options. If omitted, the provider uses `defaults,noatime`

*For new manifests, set either subnetName or subnetId. Existing manifests may continue to use subenetName, and may add subnetName only if both fields use the same value. Do not provide conflicting subnet selector values.

*The slot, size, and type fields are required for each entry when persistentDisks is specified.
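The hostname and disk-size constraints in the table can be checked locally before you apply a manifest. This is a rough sketch under stated assumptions: the function names are hypothetical, the checks only mirror the rules as written in the table, and the provider may enforce additional rules that are not covered here.

```shell
# Hypothetical helper: check a hostname against the documented rules.
is_valid_pool_hostname() {
  name=$1
  # Length must be between 1 and 253 characters.
  [ "${#name}" -ge 1 ] && [ "${#name}" -le 253 ] || return 1
  # Lowercase letters, digits, '-' and '.'; first and last character
  # must be a lowercase letter or digit. LC_ALL=C keeps [a-z] strict.
  printf '%s\n' "$name" | LC_ALL=C grep -Eq '^[a-z0-9]([a-z0-9.-]*[a-z0-9])?$'
}

# Hypothetical helper: check the size in GB of a newly created EVS data disk.
is_valid_new_disk_size() {
  size=$1
  # Reject empty values and anything that is not a plain integer.
  case $size in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$size" -ge 10 ] && [ "$size" -le 32768 ]
}
```

Both functions report the result through their exit status, so they can be combined in a pre-apply script, for example `is_valid_pool_hostname "$h" || echo "invalid hostname: $h"`.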

Use persistentDisks[] for node-local state that must survive worker replacement. Do not declare the same mount path in HCSMachineTemplate.spec.template.spec.dataVolumes[].

Note: The CRD schema lists subnetName, subenetName, and subnetId as optional fields and does not express their allowed combinations. Follow the provider-level rules above when writing manifests.
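Because the CRD schema does not enforce the selector combinations, a small local check can catch mistakes early. The sketch below is an assumption-laden illustration: check_subnet_selector is a hypothetical name, it treats setting both subnetName and subnetId as an error for every manifest, and it cannot tell whether a subnet name and a subnet ID refer to different subnets.

```shell
# Hypothetical helper: validate the selector fields of one networks[] entry.
# Arguments: subenetName subnetName subnetId (pass "" for an unset field).
check_subnet_selector() {
  legacy_name=$1
  name=$2
  id=$3
  # New manifests use exactly one of subnetName or subnetId.
  if [ -n "$name" ] && [ -n "$id" ]; then
    echo "set subnetName or subnetId, not both"
    return 1
  fi
  # The deprecated field may stay, but subnetName must repeat its value.
  if [ -n "$legacy_name" ] && [ -n "$name" ] && [ "$legacy_name" != "$name" ]; then
    echo "subnetName must match subenetName"
    return 1
  fi
  # At least one selector is needed to place the NIC.
  if [ -z "$legacy_name" ] && [ -z "$name" ] && [ -z "$id" ]; then
    echo "no subnet selector set"
    return 1
  fi
  echo "ok"
}
```

A call such as `check_subnet_selector "" "<subnet-name>" ""` prints `ok`, while a mismatched pair of subenetName and subnetName prints the reason and returns a non-zero status.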

Note: networks[] can contain more than one entry when a worker node needs multiple NICs. The current provider only uses each entry to attach a NIC with a subnet selector and static IP. It does not support per-NIC role declarations, default gateway selection, static routes, route metrics, or per-NIC DNS settings.

Pool-Managed Persistent Disks for Workers

Declare worker-node disks that must survive replacement in the matching HCSMachineConfigPool.spec.configs[].persistentDisks[] entry. Use this model for /var/cpaas and for any other node-local state that must be retained during rolling replacement.

Keep HCSMachineTemplate.spec.template.spec.dataVolumes[] for temporary disks that may be recreated with each ECS.
Keep slots unique and contiguous from 0 for each hostname. The provider uses (hostname, slot) as the persistent-disk identity.
Treat slot, size, type, format, and mountPath as immutable after the provider accepts the entry.
You can update mountOptions. The change takes effect after the worker is replaced.
You can append new persistentDisks[] entries. The provider creates or claims the EVS disk, but it does not hot-mount the disk into the running ECS. Trigger a rolling replacement with MachineDeployment.spec.strategy.rollingUpdate.maxSurge: 0 before you expect the new disk to be formatted and mounted inside the guest OS.
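The slot rule (unique and contiguous from 0 for each hostname) is easy to verify before you edit the pool. The following sketch uses a hypothetical function name, slots_are_contiguous, and assumes you pass it the slot numbers of a single hostname in any order. An empty list is accepted because persistentDisks[] is optional.

```shell
# Hypothetical helper: succeed when the given slot numbers are exactly
# 0, 1, ..., n-1 with no gaps and no duplicates.
slots_are_contiguous() {
  # No persistent disks declared: nothing to check.
  [ "$#" -eq 0 ] && return 0
  expected=0
  # Sort numerically so the caller may pass slots in any order.
  for slot in $(printf '%s\n' "$@" | sort -n); do
    [ "$slot" -eq "$expected" ] 2>/dev/null || return 1
    expected=$((expected + 1))
  done
}
```

For instance, `slots_are_contiguous 0 1 2` succeeds, while `slots_are_contiguous 0 2` fails because slot 1 is missing. When you append a disk to an existing entry, the new slot is therefore the current number of disks for that hostname.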

To inspect persistent-disk runtime state during worker operations, check the pool status:

kubectl get hcsmachineconfigpool <cluster-name>-worker-pool -n cpaas-system -o yaml

Step 2: Configure Machine Template

The HCSMachineTemplate defines the VM specifications for worker nodes.

Configure worker nodes with a system volume and temporary data volumes for paths that may be recreated with each ECS, such as /var/lib/kubelet and /var/lib/containerd. Put /var/cpaas in HCSMachineConfigPool.spec.configs[].persistentDisks[] when platform state must survive worker replacement.

Use the provider-recognized flavorName and availabilityZone API values when you prepare the worker template. These values are not the tenant UI display names.

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: HCSMachineTemplate
metadata:
  name: <cluster-name>-worker-template
  namespace: cpaas-system
spec:
  template:
    spec:
      imageName: <vm-image-name>
      flavorName: <instance-flavor>
      availabilityZone: <availability-zone>
      rootVolume:
        type: SSD
        size: 100
      configPoolRef:
        name: <cluster-name>-worker-pool
      dataVolumes:
        - size: 20
          type: SSD
          mountPath: /var/lib/kubelet
          format: xfs
        - size: 20
          type: SSD
          mountPath: /var/lib/containerd
          format: xfs

Parameter	Type	Required	Description
`.spec.template.spec.imageName`	string	Yes	VM image name
`.spec.template.spec.flavorName`	string	Yes	Provider-recognized HCS API value matched against `Flavor.Name`
`.spec.template.spec.availabilityZone`	string	No	Provider-recognized HCS API value matched against `ZoneName`
`.spec.template.spec.rootVolume.type`	string	Yes	Volume type
`.spec.template.spec.rootVolume.size`	int	Yes	System disk size in GB
`.spec.template.spec.configPoolRef.name`	string	Yes	Referenced HCSMachineConfigPool name
`.spec.template.spec.dataVolumes[]`	array	No	Data volume configurations
`.spec.template.spec.dataVolumes[].size`	int	Yes*	Disk size in GB
`.spec.template.spec.dataVolumes[].type`	string	Yes*	Volume type
`.spec.template.spec.dataVolumes[].mountPath`	string	Yes*	Mount path
`.spec.template.spec.dataVolumes[].format`	string	Yes*	File system format

*Required when dataVolumes is specified.

dataVolumes[] are recreated with the ECS. Do not use them for /var/cpaas or any other path that must survive rolling replacement.
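To catch a mount path that is declared both as a pool-managed persistent disk and as a template data volume, you can compare the two lists before applying the manifests. This is only an illustrative sketch: shared_mount_paths is a hypothetical helper, and it expects each list as a whitespace-separated string that you assemble from your own manifests.

```shell
# Hypothetical helper: print every mount path that appears in both lists.
# $1: mount paths from HCSMachineConfigPool persistentDisks[]
# $2: mount paths from HCSMachineTemplate dataVolumes[]
shared_mount_paths() {
  for pool_path in $1; do
    for template_path in $2; do
      [ "$pool_path" = "$template_path" ] && echo "$pool_path"
    done
  done
  return 0
}
```

With the example manifests in this document, `shared_mount_paths "/var/cpaas" "/var/lib/kubelet /var/lib/containerd"` prints nothing, which indicates that the two volume models do not overlap.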

Note: Do not set runtime identity fields such as providerID or serverId in HCSMachineTemplate manifests. The provider assigns these values when it creates HCS instances.

Note: Tenant administrators cannot retrieve the provider-recognized flavorName and availabilityZone values from the HCS UI. Get the exact values from the HCS administrator before you apply the manifest.

Step 3: Configure Bootstrap Template

The KubeadmConfigTemplate defines the bootstrap configuration for worker nodes.

apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: <cluster-name>-worker-kct
  namespace: cpaas-system
spec:
  template:
    spec:
      files:
        - path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
          owner: root:root
          permissions: "0644"
          content: |
            {
              "apiVersion": "kubelet.config.k8s.io/v1beta1",
              "kind": "KubeletConfiguration",
              "protectKernelDefaults": true,
              "staticPodPath": null,
              "tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
              "tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key",
              "streamingConnectionIdleTimeout": "5m",
              "clientCAFile": "/etc/kubernetes/pki/ca.crt"
            }
      postKubeadmCommands:
        - chmod 600 /var/lib/kubelet/config.yaml
      joinConfiguration:
        patches:
          directory: /etc/kubernetes/patches
        nodeRegistration:
          kubeletExtraArgs:
            volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"

The HCS controller injects /etc/kubernetes/pki/kubelet.crt and /etc/kubernetes/pki/kubelet.key while resolving worker cloud-init data. The kubelet patch above configures kubelet to use those controller-provided certificate files.

Step 4: Configure Machine Deployment

The MachineDeployment orchestrates the creation and management of worker nodes.

apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: <cluster-name>-md-0
  namespace: cpaas-system
spec:
  clusterName: <cluster-name>
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
  template:
    spec:
      clusterName: <cluster-name>
      version: <kubernetes-version>
      nodeDrainTimeout: 1m
      nodeDeletionTimeout: 5m
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: <cluster-name>-worker-kct
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: HCSMachineTemplate
        name: <cluster-name>-worker-template

Parameter	Type	Required	Description
`.spec.clusterName`	string	Yes	Target cluster name
`.spec.replicas`	int	Yes	Number of worker nodes
`.spec.template.spec.bootstrap.configRef`	object	Yes	Reference to KubeadmConfigTemplate
`.spec.template.spec.infrastructureRef`	object	Yes	Reference to HCSMachineTemplate
`.spec.template.spec.version`	string	Yes	Kubernetes version
`.spec.strategy.rollingUpdate.maxSurge`	int	No	Maximum nodes above desired during update
`.spec.strategy.rollingUpdate.maxUnavailable`	int	No	Maximum unavailable nodes during update
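The two rolling-update settings bound how many machines exist during a rollout: the total never exceeds replicas plus maxSurge, and the number of available machines never drops below replicas minus maxUnavailable. The sketch below works through that arithmetic. The function name rollout_bounds is hypothetical, and it assumes integer values as listed in the table.

```shell
# Hypothetical helper: print the lowest number of available machines and the
# highest total number of machines that a RollingUpdate rollout allows.
# Arguments: replicas maxSurge maxUnavailable (integers)
rollout_bounds() {
  replicas=$1
  max_surge=$2
  max_unavailable=$3
  echo "min_available=$((replicas - max_unavailable)) max_total=$((replicas + max_surge))"
}

# Values from the example MachineDeployment above.
rollout_bounds 3 0 1   # prints: min_available=2 max_total=3
```

With the example values, a rollout replaces one worker at a time and never creates a fourth machine, so plan workload replicas to tolerate one unavailable worker.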

Node Management Operations

This section covers common operational tasks for managing worker nodes.

Scaling Worker Nodes

Worker node scaling allows you to adjust cluster capacity based on workload demands.

Adding Worker Nodes

Increase the number of worker nodes to handle increased workload.

Procedure:

Check Current Node Status

# List all machines in the cluster
kubectl get machines -n cpaas-system

# List machines for a specific MachineDeployment
kubectl get machines -n cpaas-system -l cluster.x-k8s.io/deployment-name=<cluster-name>-md-0

Extend Configuration Pool

Add new machine configurations to the pool for the additional nodes. If the new workers need preserved node-local state such as /var/cpaas, include the matching persistentDisks[] entries in each new configuration.

Retrieve the current pool definition:

kubectl get hcsmachineconfigpool <cluster-name>-worker-pool -n cpaas-system -o yaml

Modify the pool to include the new configs[] entries (hostname, subnet selector, and IP address for each additional node), then apply:

kubectl apply -f <updated-pool-config.yaml>

When you edit the pool, keep all existing configs[] entries and their accepted persistentDisks[] entries unchanged unless you are intentionally appending a new disk slot.

Scale Up the MachineDeployment

Update the replicas field to the desired number of nodes:

kubectl patch machinedeployment <cluster-name>-md-0 -n cpaas-system \
  --type='json' -p='[{"op": "replace", "path": "/spec/replicas", "value": <new-replica-count>}]'
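If you script the scaling step, it helps to validate the replica count before it reaches the JSON patch. The following is a minimal sketch, assuming a hypothetical helper named replicas_patch. It only builds the patch document shown above and does not contact the cluster.

```shell
# Hypothetical helper: print the JSON patch for a new replica count and
# reject anything that is not a non-negative integer.
replicas_patch() {
  count=$1
  case $count in
    ''|*[!0-9]*)
      echo "replica count must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '[{"op": "replace", "path": "/spec/replicas", "value": %s}]\n' "$count"
}
```

The output can be passed to the patch command, for example `kubectl patch machinedeployment <cluster-name>-md-0 -n cpaas-system --type='json' -p="$(replicas_patch 5)"`.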

Monitor the Scaling Progress

# Watch machines being created
kubectl get machines -n cpaas-system -w

# Check MachineDeployment status
kubectl get machinedeployment <cluster-name>-md-0 -n cpaas-system

Removing Worker Nodes

Decrease the number of worker nodes to reduce cluster capacity.

WARNING

Data Loss Warning

Scaling down removes worker nodes and their ECS instances. Template-owned dataVolumes[] are not preserved. Pool-managed persistent disks declared in HCSMachineConfigPool.spec.configs[].persistentDisks[] remain tracked by the pool and can be reused while the corresponding hostname entry stays in the pool. Ensure:

Workloads can tolerate node loss through proper replication
No critical data is stored only on the nodes being removed
Applications are designed for horizontal scaling

Procedure:

Scale Down the MachineDeployment

kubectl patch machinedeployment <cluster-name>-md-0 -n cpaas-system \
  --type='json' -p='[{"op": "replace", "path": "/spec/replicas", "value": <new-replica-count>}]'

Monitor the Removal Progress

kubectl get machines -n cpaas-system -w

The Cluster API controller will:
- Drain the selected nodes (evict pods if possible)
- Delete the underlying VMs from the HCS platform
- Remove the machine resources

Upgrading Machine Infrastructure

To upgrade worker machine specifications (CPU, memory, disk, VM image), follow these steps:

Note: Worker infrastructure upgrades rely on Cluster API rolling replacement. HCS dataVolumes[] are not preserved during replacement. To preserve node-local state such as /var/cpaas, declare it in HCSMachineConfigPool.spec.configs[].persistentDisks[] before the rollout and keep MachineDeployment.spec.strategy.rollingUpdate.maxSurge: 0.

Create New Machine Template

Copy the existing HCSMachineTemplate and modify the required values:
- imageName - VM image
- flavorName - Instance type
- rootVolume.size - System disk size
- dataVolumes - Temporary data disk configurations

If you need to add a new pool-managed persistent disk, append it to the worker HCSMachineConfigPool first. The provider creates or claims the EVS disk, but the running ECS does not mount it until this rolling replacement creates a replacement worker.

Export the current template to a file:

kubectl get hcsmachinetemplate <current-template> -n cpaas-system -o yaml > new-template.yaml

Then edit new-template.yaml before applying:
- Change metadata.name to <new-template>
- Leave runtime identity fields unset, including spec.template.spec.providerID and spec.template.spec.serverId
- Remove server-generated fields such as:
  - metadata.resourceVersion
  - metadata.uid
  - metadata.creationTimestamp
  - metadata.managedFields
  - status

Deploy New Template

kubectl apply -f new-template.yaml -n cpaas-system

Update Machine Deployment

Modify the MachineDeployment to reference the new template:

kubectl patch machinedeployment <cluster-name>-md-0 -n cpaas-system \
  --type='merge' -p='{"spec":{"template":{"spec":{"infrastructureRef":{"name":"<new-template>"}}}}}'

Monitor Rolling Update

kubectl get machines -n cpaas-system -w

Upgrading Kubernetes Version

Kubernetes version upgrades require coordinated updates to both the MachineDeployment and the underlying VM template.

Note: Ensure the VM image referenced by the HCSMachineTemplate supports the Kubernetes version specified in the MachineDeployment. Mismatched versions will cause node join failures.

Procedure:

Update Machine Template

Create a new HCSMachineTemplate with an updated imageName that supports the target Kubernetes version.

Update MachineDeployment

Modify the following fields:

spec.template.spec.version - Target Kubernetes version

spec.template.spec.infrastructureRef.name - New machine template name

kubectl patch machinedeployment <cluster-name>-md-0 -n cpaas-system \
  --type='merge' -p='{"spec":{"template":{"spec":{"version":"<kubernetes-version>","infrastructureRef":{"name":"<new-template>"}}}}}'

Monitor Upgrade

Verify that new nodes join the cluster with the correct Kubernetes version:

kubectl get nodes
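For larger clusters, reading the version column by eye is error-prone, so a small filter can list only the nodes that still report another version. This is a sketch under assumptions: nodes_not_at_version is a hypothetical helper, v1.30.0 in the usage comment is only an example value, and the filter lists control plane nodes too if they run a different version.

```shell
# Hypothetical helper: read "name<TAB>kubeletVersion" lines on stdin and
# print the names of nodes whose version differs from the target.
nodes_not_at_version() {
  target=$1
  awk -F '\t' -v target="$target" '$2 != target { print $1 }'
}

# Usage against a live cluster (example target version):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | nodes_not_at_version v1.30.0
```

An empty result means every listed node reports the target kubelet version. Any printed name is a node that the rollout has not replaced yet or that failed to upgrade.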

Verification

After deploying worker nodes, verify the deployment:

# Check machine status
kubectl get machines -n cpaas-system

# Verify nodes are Ready
kubectl get nodes

# Check MachineDeployment status
kubectl get machinedeployment -n cpaas-system

Troubleshooting

Viewing Controller Logs

# View HCS controller logs
kubectl logs -n cpaas-system deployment/hcs-controller-manager

# View machine details
kubectl describe hcsmachine <machine-name> -n cpaas-system

Common Issues

Node fails to join cluster

Verify the VM template matches the Kubernetes version
Check network connectivity between nodes
Ensure the configuration pool has available entries

Machine stuck in provisioning

Check HCS platform for resource availability
Verify credentials and permissions
Review controller logs for error messages
