Creating Clusters on Huawei DCS

This document provides instructions for creating Kubernetes clusters on the Huawei DCS platform. YAML-based cluster creation is available through manifests. If Fleet Essentials is installed and Alauda Container Platform DCS Infrastructure Provider is 1.0.13 or later, you can also create clusters through the web UI. If the workflow relies on pool-managed persistent disks, use DCS provider v1.0.16 or later. In v1.0.16, the persistentDisk declaration on DCSIpHostnamePool is available through YAML only and is not exposed in the web UI.

INFO

The web UI provides a guided workflow with validation, while YAML offers more automation flexibility.

Prerequisites

Before creating clusters, ensure all of the following prerequisites are met:

1. Infrastructure Resources

Configure the following infrastructure resources before creating a cluster:

  • Cloud Credential - DCS platform access information
  • IP Pool - Network configuration for cluster nodes and any IP-slot persistent disks such as /var/cpaas
  • Machine Template - VM specifications for control plane and worker nodes, excluding pool-managed persistent disks

See Infrastructure Resources for Huawei DCS for detailed configuration instructions.

2. Required Plugin Installation

Install the following plugins on the global cluster:

  • Alauda Container Platform Kubeadm Provider
  • Alauda Container Platform DCS Infrastructure Provider

For detailed installation instructions, refer to the Installation Guide.

3. Virtual Machine Template Preparation

For Kubernetes installation, you must:

  • Upload the MicroOS image to the DCS platform
  • Create a virtual machine template based on this image
  • Ensure the template includes all necessary Kubernetes components
  • Use DCS VM templates 4.2.1 or later if you plan to use persistent disks, because safe shutdown and disk detach depend on guest tools
  • Use one-by-one replacement for any cluster that will rely on pool-managed persistent disks. Keep maxSurge: 0 on the control plane and on worker node pools.

For details on the Kubernetes components included in each VM image, see OS Support Matrix.

4. Network Connectivity

Global cluster nodes must be able to reach the DCS platform at two distinct destinations:

| From | To | Port | Purpose |
| --- | --- | --- | --- |
| Global cluster node | DCS VRM virtual IP | TCP/7443 | DCS REST API. Covers cluster lifecycle calls and the first step of file uploads (applyUpload). |
| Global cluster node | DCS physical host MGMT IP (every host that may receive a clone) | DCS-returned port; typically TCP/8443; confirm with the DCS administrator | File-stream upload of the Ignition ISO. The provider streams the file to the URL returned by the applyUpload response. |

File upload is a two-step flow: the provider calls applyUpload on the VRM virtual IP (TCP/7443); the DCS platform responds with a URL that points to a specific physical host's management IP and port; the provider then streams the file to that URL. Both destinations must be reachable end-to-end before cluster creation.

If global cluster nodes use a multi-NIC layout (for example, one NIC on the ACP cluster network and another NIC on the customer's management network where DCS is deployed), ensure that both destinations — the VRM virtual IP and every DCS physical host MGMT IP — are routable from the appropriate NIC.

Requirement: Connectivity to both destinations is mandatory for cluster creation and management.
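Both destinations can be probed from a global cluster node before you start. The following Python sketch is not part of the provider; it only attempts a plain TCP connection to each address. The function names and the sample addresses in the comment are hypothetical, and the upload port should be whatever your DCS administrator confirms.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


def unreachable(targets: dict[str, tuple[str, int]]) -> list[str]:
    """Return the labels of all targets that could not be reached."""
    return [label for label, (host, port) in targets.items()
            if not can_connect(host, port)]


# Example call (addresses are placeholders, not real endpoints):
# unreachable({
#     "VRM virtual IP": ("192.0.2.10", 7443),
#     "host-1 MGMT IP": ("192.0.2.21", 8443),
# })
```

A successful TCP connection shows only that the route and port are open. It does not prove that the DCS credentials or the upload flow work.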

5. LoadBalancer Configuration

Configure a LoadBalancer for the Kubernetes API Server before creating the cluster. The LoadBalancer distributes API server traffic across control plane nodes to ensure high availability.

The DCS provider does not create this load balancer. Provision it yourself before cluster creation and reference its address in DCSCluster.spec.controlPlaneLoadBalancer. For a single-control-plane deployment that has no load balancer in front of the API server, see Single-Control-Plane (No External LB) Layout.

6. Public Registry Configuration

Configure the public registry credentials. This includes:

  • Registry repository address configuration
  • Proper authentication credentials setup

Using the Web UI

WARNING

Fleet Essentials UI does not support ACP 4.3 cluster upgrades

The Fleet Essentials UI workflow has not been adapted to the Cluster Version Operator (CVO) mechanism introduced in ACP 4.3. Do not use the Fleet Essentials UI to upgrade DCS clusters on ACP 4.3.

Two supported alternatives:

Cluster creation and node-pool management through the Fleet Essentials UI are unaffected by this limitation.

Version requirement: This workflow requires Fleet Essentials and Alauda Container Platform DCS Infrastructure Provider 1.0.13 or later. If the provider version is earlier than 1.0.13, use YAML manifests. If you use pool-managed persistent disks, use DCS provider v1.0.16 or later. In v1.0.16, configure DCSIpHostnamePool.spec.pool[].persistentDisk through YAML because the web UI does not expose that field.

If the new cluster will rely on pool-managed persistent disks, create or update the backing DCSIpHostnamePool with YAML and then use the web UI for the rest of the cluster workflow.
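The version rules above reduce to two thresholds. The sketch below encodes them in Python for use in a pre-flight script; the function names and result keys are hypothetical, and the provider version string is assumed to be dotted integers with an optional leading "v".

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn 'v1.0.16' or '1.0.16' into (1, 0, 16) for numeric comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def available_workflows(provider_version: str, fleet_essentials: bool) -> dict[str, bool]:
    """Map the documented version requirements to simple yes/no answers."""
    version = parse_version(provider_version)
    return {
        # YAML manifests are always available.
        "yaml": True,
        # The web UI needs Fleet Essentials and provider 1.0.13 or later.
        "web_ui": fleet_essentials and version >= (1, 0, 13),
        # Pool-managed persistent disks need provider 1.0.16 or later.
        "pool_managed_persistent_disks": version >= (1, 0, 16),
    }
```

Comparing integer tuples avoids the string-comparison trap where "1.0.9" would sort after "1.0.13".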

Creation Workflow

The cluster creation follows a 5-step wizard:

  1. Basic Info
  2. Control Plane Node Pool
  3. Worker Node Pools
  4. Networking
  5. Review

Navigation: Clusters → Clusters → Create Cluster → Select Huawei DCS

Step 1: Basic Info

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Infrastructure Credential | dropdown | Yes | Select an existing Cloud Credential |
| Name | text | Yes | Unique cluster identifier (lowercase letters, numbers, hyphens) |
| Display Name | text | No | Custom description for easy identification |
| Distribution Version | readonly | - | ACP version (matches the global cluster) |
| Kubernetes Version | readonly | - | Determined by Distribution Version |
| Cluster API Address | text | Yes | Format: https://<load-balancer-address>:6443 |

Prerequisites Check:

Before creating a cluster, ensure:

  • DCS VM Templates exist in the DCS platform, and the MicroOS version matches the Kubernetes version
  • A LoadBalancer for the Kubernetes API Server has been set up

Version Constraint: New clusters can be created only with the latest Kubernetes version that the platform supports.

Step 2: Control Plane Node Pool

The control plane node pool is fixed at 3 replicas for high availability.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Machine Template | dropdown | Yes | Filter templates by Type: Control Plane and compatible Kubernetes version |
| Replicas | readonly | - | Fixed at 3 |
| SSH Authorized Keys | text | No | Add multiple SSH public keys for node access |

Validation: The associated IP Pool must have sufficient available IP addresses (≥ 3).

Step 3: Worker Node Pools

You can add multiple worker node pools. Each pool has the following configuration:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Pool Name | text | Yes | Unique identifier for this node pool |
| Machine Template | dropdown | Yes | Filter templates by Type: Worker Node and compatible Kubernetes version |
| Replicas | number | Yes | Default: 3 |
| Max Surge | number | No | Default: 0, must be ≥ 0. Keep this value at 0 if the node pool will use pool-managed persistent disks |
| Max Unavailable | number | No | Default: 1, must be ≥ 0. When maxSurge = 0, maxUnavailable must be > 0 and ≤ Replicas |
| SSH Authorized Keys | text | No | Add multiple SSH public keys |

Validation Rules:

  • Pool names must be unique within the cluster
  • IP Pool must have sufficient available IP addresses (≥ Replicas)
  • maxSurge and maxUnavailable must satisfy the constraint: if maxSurge = 0, then maxUnavailable > 0
  • If the cluster will rely on pool-managed persistent disks, keep maxSurge = 0 so nodes are replaced one by one during future upgrades
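The validation rules above can be written as one check. This is a minimal Python sketch of those rules, not the web UI's actual validator; the function name, parameter names, and message texts are hypothetical.

```python
from typing import Optional


def validate_worker_pool(replicas: int, max_surge: int = 0, max_unavailable: int = 1,
                         free_ips: Optional[int] = None,
                         uses_persistent_disks: bool = False) -> list[str]:
    """Return a list of rule violations; an empty list means the pool is valid."""
    errors = []
    if max_surge < 0:
        errors.append("maxSurge must be >= 0")
    if max_unavailable < 0:
        errors.append("maxUnavailable must be >= 0")
    # With no surge capacity, at least one node must be allowed to go down,
    # and never more than the pool contains.
    if max_surge == 0 and not (0 < max_unavailable <= replicas):
        errors.append("when maxSurge = 0, maxUnavailable must be > 0 and <= replicas")
    # Pool-managed persistent disks rely on one-by-one replacement.
    if uses_persistent_disks and max_surge != 0:
        errors.append("pool-managed persistent disks require maxSurge = 0")
    # free_ips is optional because the caller may not know the pool state.
    if free_ips is not None and free_ips < replicas:
        errors.append("IP pool needs at least as many free addresses as replicas")
    return errors
```

With the documented defaults (3 replicas, maxSurge 0, maxUnavailable 1) the function returns an empty list.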

Tip: Prefix the pool name with the cluster name followed by a hyphen (e.g., mycluster-worker-1) to avoid naming conflicts across different clusters.

Step 4: Networking

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| Pods CIDR | CIDR | Yes | Pod network address range |
| Services CIDR | CIDR | Yes | Service network address range |
| Join CIDR | CIDR | Yes | Kube-OVN join CIDR parameter |

Validation: Pods CIDR and Services CIDR must not overlap.
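You can check your planned ranges for overlap before entering them. The sketch below uses Python's standard ipaddress module; the function name and the sample ranges in the usage note are hypothetical, and it is independent of whatever validation the wizard performs.

```python
import ipaddress
from itertools import combinations


def overlapping_cidrs(cidrs: dict[str, str]) -> list[tuple[str, str]]:
    """Return every pair of named CIDRs whose address ranges overlap.

    ip_network() raises ValueError for malformed input or when host bits
    are set (for example 10.3.0.1/16), which is a useful check in itself.
    """
    nets = {name: ipaddress.ip_network(value) for name, value in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        # IPv4 and IPv6 ranges can never overlap, so compare like with like.
        if nets[a].version == nets[b].version and nets[a].overlaps(nets[b])
    ]
```

For example, passing {"pods": "10.3.0.0/16", "services": "10.4.0.0/16"} returns an empty list, while a services range of 10.3.128.0/20 would be reported as overlapping the pods range. Including the join CIDR in the same call costs nothing.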

Step 5: Review

Review all configuration settings before creating the cluster:

Basic Info:

  • Name, Display Name, Infrastructure Credential
  • Distribution Version, Kubernetes Version
  • Cluster API Address

Control Plane Node Pool:

  • Machine Template with VM Template Name, OS Version, Kubernetes Version
  • CPU, Memory, Replicas, SSH Keys

Worker Node Pools (list view):

  • Pool Name, Machine Template, Replicas
  • Max Surge, Max Unavailable, SSH Keys

If the cluster will rely on pool-managed persistent disks, keep Max Surge set to 0 for worker node pools.

Networking:

  • Pods CIDR, Services CIDR, Join CIDR

Click Create to start the cluster creation process.


Using YAML

Cluster Creation Workflow

When using YAML, you create Cluster API resources in the global cluster to provision infrastructure and bootstrap a functional Kubernetes cluster.

WARNING

Important Namespace Requirement

To integrate correctly as business clusters, all resources must be deployed in the cpaas-system namespace. Resources deployed in other namespaces may fail to integrate.

WARNING

Workload Cluster Naming

The workload cluster name (<cluster-name>) must not be global. That name is reserved for the global cluster, and reusing it causes the workload cluster's resources to collide with global cluster resources in cpaas-system. The global- prefix is also reserved, for resources owned by the global cluster's DR workflow; see Common Prerequisites. Do not use global- for workload-cluster resources, because failover operations can select those resources as if they belonged to the global cluster.

As a convention, keep the CAPI Cluster and provider cluster resource (DCSCluster) named exactly <cluster-name>, and prefix non-root CAPI and provider resources (KubeadmControlPlane, KubeadmConfigTemplate, MachineDeployment, machine templates, IP/hostname pools, etc.) with <cluster-name>-. For example, the example manifests name the KubeadmControlPlane <cluster-name>-kcp. This is a recommendation rather than a controller-enforced rule, but it prevents same-namespace collisions when multiple workload clusters live in cpaas-system and makes resource ownership obvious during operations.
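The naming rules can be checked before any manifest is applied. The following Python sketch assumes that "DNS-1123" here means the label form (lowercase alphanumerics and hyphens, at most 63 characters, starting and ending with an alphanumeric); the function names are hypothetical.

```python
import re

# DNS-1123 label: lowercase alphanumerics and '-', alphanumeric at both ends.
DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def cluster_name_problems(name: str) -> list[str]:
    """Return the reasons a workload cluster name should be rejected."""
    problems = []
    if len(name) > 63 or not DNS_1123_LABEL.fullmatch(name):
        problems.append("not a DNS-1123 label")
    if name == "global":
        problems.append("'global' is reserved for the global cluster")
    if name.startswith("global-"):
        problems.append("the 'global-' prefix is reserved")
    return problems


def child_name(cluster_name: str, suffix: str) -> str:
    """Build a prefixed resource name such as '<cluster-name>-kcp'."""
    return f"{cluster_name}-{suffix}"
```

For instance, child_name("mycluster", "kcp") yields mycluster-kcp, which matches the convention used by the example manifests.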

Configuration Workflow

Follow these steps in order to provision a functional cluster (control plane and worker nodes):

  1. Configure KubeadmControlPlane (control-plane spec and kubeadm bootstrap).
  2. Configure DCSCluster (infrastructure binding and load balancer reference).
  3. Create the Cluster resource (top-level CAPI object that links the two above).
  4. Configure worker resources: KubeadmConfigTemplate (worker bootstrap), worker DCSMachineTemplate, worker DCSIpHostnamePool, and MachineDeployment. The control plane alone is not a usable cluster. See Managing Nodes on Huawei DCS for the four worker YAML resources.

Note: Infrastructure resources (Secret, control-plane DCSIpHostnamePool, control-plane DCSMachineTemplate) should be configured separately. See Infrastructure Resources for Huawei DCS for instructions.

If you need any disk to survive rolling replacement, declare it in the matching DCSIpHostnamePool.spec.pool[].persistentDisk entry. This includes the platform-required /var/cpaas disk.

Resolving Placeholder Values

The example manifests below use <placeholder> syntax for values that are environment-specific. Several of these have a canonical source of truth that you should query rather than hand-pick:

| Placeholder | Source of truth | How to retrieve |
| --- | --- | --- |
| <control-plane-kubernetes-version> / <worker-kubernetes-version> | A cpaas.io/dcs-vm-template ConfigMap in the cpaas-system namespace, one per distribution version. | Run kubectl -n cpaas-system get cm -l cpaas.io/dcs-vm-template -o yaml and read data.kubernetesVersion. |
| <dns-image-tag> | Same ConfigMap, data.corednsTag. | Same kubectl get cm query as above. |
| <etcd-image-tag> | Same ConfigMap, data.etcdTag. | Same kubectl get cm query as above. |
| <vm-template-name> (in DCSMachineTemplate.spec.template.spec.vmTemplateName) | The ConfigMap's cpaas.io/dcs-vm-template label value (must match the VM template name registered in the DCS platform). | Same kubectl get cm query; read the label. |
| <base64-encoded-secret> in encryption-provider.conf | Generated locally; treat as a real cluster secret. | Run head -c 32 /dev/urandom \| base64, then store the value securely and reuse it across control-plane replicas of the same cluster. |
| <ssh-authorized-keys> | Operator-supplied. Each entry is a single-line OpenSSH-format public key (ssh-ed25519 AAAA… / ssh-rsa AAAA…). The field is required by the ignition validator and must be non-empty. For test or PoC clusters that do not need interactive SSH, supply any syntactically valid public key (you do not need to hold the matching private key); for production, use the operator team's signing key. | n/a (generated or sourced from the operator). |
| <auth-secret-name> | The Secret you authored in Infrastructure → Cloud Credentials. | kubectl -n cpaas-system get secret <name> |
| <cluster-name> | Operator-chosen; must satisfy DNS-1123 and must not be global (that name is reserved for the global cluster). Reused across Cluster, DCSCluster, and the cluster.x-k8s.io/cluster-name label on every Machine. The KubeadmControlPlane uses the prefixed form <cluster-name>-kcp. See Workload Cluster Naming for the full convention. | n/a |
| <load-balancer-ip-or-domain-name> | Operator-supplied: the IP / hostname clients use to reach the cluster API server. For single-control-plane clusters with no external LB, this is the IP of the sole master node (see Single-control-plane layout). | n/a |
| <pods-cidr> / <services-cidr> / <kube-ovn-join-cidr> | Operator-supplied. Must not overlap with the host network, the global cluster's CIDRs, or any other CAPI cluster's CIDRs you intend to interconnect. Leaving them empty falls back to the kube-ovn defaults shipped with the global cluster, which is not recommended for production: explicit CIDRs avoid silent collisions when multiple workload clusters live on the same global cluster. | n/a |
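If you script manifest generation, the <base64-encoded-secret> value can be produced without shelling out. This Python sketch is an equivalent of the head/base64 pipeline above, using the standard secrets module; the function name is hypothetical.

```python
import base64
import secrets


def new_encryption_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
```

Generate the value once per cluster and reuse it on every control-plane replica, as noted in the table. Treat the output like any other cluster secret and keep it out of logs and version control.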

Magic-Token Placeholders

A few values in the example manifests look like placeholders but are actually literal tokens substituted by the DCS Provider at machine-join time. Leave them exactly as written:

| Literal token | Meaning | Substituted by |
| --- | --- | --- |
| PROVIDER_ID | The per-Machine provider ID (e.g. dcs://<dcsmachine-name>). | DCS Provider. Overwritten in the generated kubelet config before kubeadm init / kubeadm join runs. |
| NODE_IP | The node IP allocated from the DCSIpHostnamePool entry attached to this Machine. | DCS Provider. Overwritten with the value of DCSIpHostnamePool.spec.pool[].ip. |

Replacing or quoting these tokens manually breaks the join flow and produces nodes that never register with the control plane.
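A simple text lint can catch an accidentally edited token before the manifest is applied. The Python sketch below assumes the tokens appear as the values of the provider-id and node-ip keys, as they do in the example manifests; the function name is hypothetical and the check is line-based rather than a real YAML parse.

```python
import re

# Key -> literal token that must remain untouched (taken from the examples).
MAGIC_TOKENS = {"provider-id": "PROVIDER_ID", "node-ip": "NODE_IP"}


def altered_magic_tokens(manifest_text: str) -> list[str]:
    """Report lines where a magic token was replaced, quoted, or decorated."""
    problems = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        match = re.match(r"\s*([\w-]+):\s*(.*?)\s*$", line)
        if match and match.group(1) in MAGIC_TOKENS:
            key, value = match.groups()
            expected = MAGIC_TOKENS[key]
            if value != expected:
                problems.append(f"line {number}: {key} should be {expected}, found {value}")
    return problems
```

Because the comparison is exact, a trailing comment on the same line is also flagged, which errs on the side of leaving these lines untouched.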

Network Planning and Load Balancer

Before creating control plane resources, plan the network architecture and deploy a load balancer for high availability.

Requirements:

  • Network segmentation: Plan IP address ranges for control plane nodes
  • Load balancer: Deploy and configure access to the API server
  • API server address: Prepare a stable VIP or load balancer address for the Kubernetes API Server
  • Connectivity: Ensure network connectivity between all components

Configure KubeadmControlPlane

The KubeadmControlPlane resource defines the control plane configuration including Kubernetes version, node specifications, and bootstrap settings.

TIP

Full Configuration Reference

The example below truncates long configuration files for readability. For the complete configuration (including default audit policies, admission controls, and file contents), refer to the Complete KubeadmControlPlane Configuration in the Appendix.

kubeadmcontrolplane.yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: <cluster-name>-kcp
  namespace: cpaas-system
  annotations:
    controlplane.cluster.x-k8s.io/skip-coredns: ""
    controlplane.cluster.x-k8s.io/skip-kube-proxy: ""
spec:
  rolloutStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0 # Required when the cluster relies on pool-managed persistent disks
  kubeadmConfigSpec:
    users:
    - name: boot
      sshAuthorizedKeys:
      - "<ssh-authorized-keys>"
    format: ignition
    files:
    - path: /etc/kubernetes/admission/psa-config.yaml
      owner: "root:root"
      permissions: "0644"
      content: |
        # ... (Admission Configuration Content) ...
    - path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
      owner: "root:root"
      permissions: "0644"
      content: |
        {
          "apiVersion": "kubelet.config.k8s.io/v1beta1",
          "kind": "KubeletConfiguration",
          "_comment": "... (Kubelet Configuration Content) ..."
        }
    # ... (other files) ...
    clusterConfiguration:
      imageRepository: cloud.alauda.io/alauda
      dns:
        imageTag: <dns-image-tag>
      etcd:
        local:
          imageTag: <etcd-image-tag>
      # ... (apiServer, controllerManager, scheduler) ...
    initConfiguration:
      patches:
        directory: /etc/kubernetes/patches
      nodeRegistration:
        kubeletExtraArgs:
          node-labels: "kube-ovn/role=master"
          provider-id: PROVIDER_ID
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          protect-kernel-defaults: "true"
    joinConfiguration:
      patches:
        directory: /etc/kubernetes/patches
      nodeRegistration:
        kubeletExtraArgs:
          node-ip: NODE_IP
          node-labels: "kube-ovn/role=master"
          provider-id: PROVIDER_ID
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          protect-kernel-defaults: "true"
  machineTemplate:
    nodeDrainTimeout: 1m
    nodeDeletionTimeout: 5m
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DCSMachineTemplate
      name: <cp-dcs-machine-template-name>
  replicas: 3
  version: <control-plane-kubernetes-version>

Parameter Descriptions:

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| .spec.kubeadmConfigSpec | object | kubeadm bootstrap provider startup parameters | Yes |
| .spec.machineTemplate.infrastructureRef | object | DCSMachineTemplate reference | Yes |
| .spec.replicas | int | Control plane replica count. Must satisfy 1 ≤ replicas ≤ IP Pool size. Set to 1 for development / PoC single-control-plane deployments (see Single-control-plane layout). Production usually uses 3 for HA. | Yes |
| .spec.version | string | Kubernetes version (must match the VM template; see Resolving Placeholder Values) | Yes |

For component versions (e.g., <dns-image-tag>, <etcd-image-tag>), refer to OS Support Matrix.

Configure DCSCluster

DCSCluster is the infrastructure cluster declaration that references the load balancer and DCS platform credentials.

dcscluster.yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DCSCluster
metadata:
  name: "<cluster-name>"
  namespace: cpaas-system
spec:
  controlPlaneLoadBalancer:
    host: <load-balancer-ip-or-domain-name>
    port: 6443
    type: external
  credentialSecretRef:
    name: <auth-secret-name>
  controlPlaneEndpoint:
    host: <load-balancer-ip-or-domain-name>
    port: 6443
  networkType: kube-ovn
  site: <site>

Parameter Descriptions:

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| .spec.controlPlaneLoadBalancer | object | Control plane API server exposure method | Yes |
| .spec.controlPlaneLoadBalancer.type | string | Currently only supports "external" | Yes |
| .spec.controlPlaneLoadBalancer.host | string | Load balancer IP or domain name | Yes |
| .spec.credentialSecretRef.name | string | DCS authentication Secret name. The Secret defines whether DCS Provider authenticates as an interface interconnection user (default) or a domain user; see Credential User Types. | Yes |
| .spec.networkType | string | Currently only supports "kube-ovn" | Yes |
| .spec.site | string | DCS platform site ID | Yes |

Single-Control-Plane (No External LB) Layout

For development, PoC, or any deployment where the control plane has only one replica (KubeadmControlPlane.spec.replicas: 1), there is no real load balancer in front of the API server. The following fields still require a value:

  • .spec.controlPlaneLoadBalancer.host and .spec.controlPlaneEndpoint.host — set both to the IP of the sole master node (the same IP allocated to the master in the control-plane DCSIpHostnamePool).
  • .spec.controlPlaneLoadBalancer.type — keep as external (the type field has no other supported value).

Concretely:

spec:
  controlPlaneLoadBalancer:
    host: 10.226.82.150     # same IP as the master node from the IP pool
    port: 6443
    type: external
  controlPlaneEndpoint:
    host: 10.226.82.150     # same as above
    port: 6443

This layout has no HA — losing the single master makes the cluster API unreachable until the master is recovered. For production, use replicas: 3 with a real load balancer.
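The field requirements for this layout are easy to get subtly wrong, for example by updating one host field and not the other. The Python sketch below checks a DCSCluster spec that has already been loaded into a plain dict (how you load the YAML is up to you); the function name and message texts are hypothetical.

```python
from typing import Optional


def endpoint_problems(spec: dict, sole_master_ip: Optional[str] = None) -> list[str]:
    """Check the endpoint fields of a DCSCluster spec given as a plain dict.

    Pass sole_master_ip only for the single-control-plane layout, where both
    host fields must equal the master's IP from the control-plane IP pool.
    """
    blocks = {
        "controlPlaneLoadBalancer": spec.get("controlPlaneLoadBalancer", {}),
        "controlPlaneEndpoint": spec.get("controlPlaneEndpoint", {}),
    }
    problems = []
    # "external" is the only supported load balancer type.
    if blocks["controlPlaneLoadBalancer"].get("type") != "external":
        problems.append("controlPlaneLoadBalancer.type must be external")
    for field, block in blocks.items():
        host = block.get("host")
        if not host:
            problems.append(f"{field}.host is required")
        elif sole_master_ip is not None and host != sole_master_ip:
            problems.append(f"{field}.host must be {sole_master_ip}")
    return problems
```

An empty result means the two stanzas are consistent with the layout described above; it says nothing about whether the IP is actually reachable.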

Configure Cluster

The Cluster resource declares the cluster and references the control plane and infrastructure resources.

cluster.yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  annotations:
    capi.cpaas.io/resource-group-version: infrastructure.cluster.x-k8s.io/v1beta1
    capi.cpaas.io/resource-kind: DCSCluster
    cpaas.io/kube-ovn-join-cidr: <kube-ovn-join-cidr>
  labels:
    cluster-type: DCS
  name: <cluster-name>
  namespace: cpaas-system
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - <pods-cidr>
    services:
      cidrBlocks:
      - <services-cidr>
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: <cluster-name>-kcp
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DCSCluster
    name: <cluster-name>

Parameter Descriptions:

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| .spec.clusterNetwork.pods.cidrBlocks | []string | Pod CIDR. Optional in the CAPI schema but recommended to set explicitly so multiple CAPI clusters can co-exist without overlap. If unset, kube-ovn falls back to a default that may conflict with another cluster on the same global. | Recommended |
| .spec.clusterNetwork.services.cidrBlocks | []string | Service CIDR. Same recommendation as the pod CIDR: set explicitly to avoid collisions across clusters. | Recommended |
| .spec.controlPlaneRef | object | Control plane reference | Yes |
| .spec.infrastructureRef | object | Infrastructure cluster reference | Yes |

Cluster Annotations:

The example above shows three annotations, but a complete Cluster resource carries a few more. The table below lists the annotations the operator authors (some others are written by ACP controllers and you should not pre-set them):

| Annotation | Required | Value source | Purpose |
| --- | --- | --- | --- |
| capi.cpaas.io/resource-group-version | Yes | Literal infrastructure.cluster.x-k8s.io/v1beta1 | Tells the CAPI infrastructure binding which API group to use. |
| capi.cpaas.io/resource-kind | Yes | Literal DCSCluster | Tells the CAPI infrastructure binding which kind to bind to. |
| capi.cpaas.io/kubernetes | Yes | Same value as KubeadmControlPlane.spec.version and MachineDeployment.spec.template.spec.version. | Display label, also consumed by some upgrade and inventory tooling. Source-of-truth is the cpaas.io/dcs-vm-template ConfigMap's kubernetesVersion. |
| cpaas.io/kube-ovn-join-cidr | Yes | Operator-chosen /16 CIDR, must not overlap with pod / service CIDRs or any other cluster's join CIDR. | Kube-OVN inter-node tunnel network. |
| cpaas.io/kube-ovn-version | Yes | The kube-ovn release the global cluster ships with. Read it from the cpaas.io/kube-ovn-version annotation on the global CAPI Cluster (kubectl get cluster global -n cpaas-system -o jsonpath='{.metadata.annotations.cpaas\.io/kube-ovn-version}'). The same value lives on every healthy workload cluster on the same global cluster, so you can also read it from any of those if they exist. | Pin the kube-ovn version installed into the workload cluster. |
| cpaas.io/registry-address | Yes | The image registry the global cluster uses (typically <registry-host>:11443). Read it from the cpaas.io/registry-address annotation on the global CAPI Cluster (kubectl get cluster global -n cpaas-system -o jsonpath='{.metadata.annotations.cpaas\.io/registry-address}'). | Workload cluster pulls platform images (CoreDNS, kube-proxy, kube-ovn) from this registry. |
| cpaas.io/nodes-mode | Yes | Literal self-managed for clusters that DCS Provider provisions. | Marks the cluster as having its node lifecycle managed by CAPI + this provider. |

ACP controllers may write additional read-only annotations (cpaas.io/cpu-cores-number, cpaas.io/memories, cpaas.io/nodes-number, and so on) after the cluster is up — these are computed and must not be pre-set in the YAML you apply.
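Because the short example manifest carries only three of the required annotations, it helps to check the full set before applying. The Python sketch below works on the metadata.annotations mapping as a plain dict. The annotation keys and literal values come from the table above; the function and constant names are hypothetical, and the list of controller-written keys covers only the three named in this section.

```python
# Annotations whose value is a fixed literal.
LITERAL_VALUES = {
    "capi.cpaas.io/resource-group-version": "infrastructure.cluster.x-k8s.io/v1beta1",
    "capi.cpaas.io/resource-kind": "DCSCluster",
    "cpaas.io/nodes-mode": "self-managed",
}
# Annotations the operator must fill in with an environment-specific value.
OPERATOR_VALUES = (
    "capi.cpaas.io/kubernetes",
    "cpaas.io/kube-ovn-join-cidr",
    "cpaas.io/kube-ovn-version",
    "cpaas.io/registry-address",
)
# Annotations computed by ACP controllers (only those named in this document).
COMPUTED = ("cpaas.io/cpu-cores-number", "cpaas.io/memories", "cpaas.io/nodes-number")


def annotation_problems(annotations: dict[str, str]) -> list[str]:
    """Return missing, wrong, or wrongly pre-set Cluster annotations."""
    problems = []
    for key, expected in LITERAL_VALUES.items():
        if annotations.get(key) != expected:
            problems.append(f"{key} must be {expected}")
    for key in OPERATOR_VALUES:
        if not annotations.get(key):
            problems.append(f"{key} is missing")
    for key in COMPUTED:
        if key in annotations:
            problems.append(f"{key} is controller-managed; do not pre-set")
    return problems
```

Run against the three annotations in the cluster.yaml example, the function reports four gaps: the nodes-mode literal plus the Kubernetes version, kube-ovn version, and registry address.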

Deploying Nodes

Refer to Managing Nodes on Huawei DCS for instructions on deploying worker nodes.


Cluster Verification

After deploying all cluster resources, verify that the cluster has been created successfully and is operational.

Using the Console

  1. Navigate to Clusters → Clusters
  2. Locate your newly created cluster in the cluster list
  3. Verify that the cluster status shows as Running
  4. Check that all control plane and worker nodes are Ready

Using kubectl

Alternatively, verify the cluster using kubectl commands:

# Check cluster status
kubectl get cluster -n cpaas-system <cluster-name>

# Verify control plane
kubectl get kubeadmcontrolplane -n cpaas-system <cluster-name>-kcp

# Check machine status
kubectl get machines -n cpaas-system

# Verify cluster deployment
kubectl get clustermodule <cluster-name> -o jsonpath='{.status.base.deployStatus}'

Expected Results

A successfully created cluster should show:

  • Cluster status: Running or Provisioned
  • All control plane machines: Running
  • All worker nodes (if deployed): Running
  • Kubernetes nodes: Ready
  • Cluster Module Status: Completed
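If you collect these values in a script (for example from the kubectl commands above), the expected results can be folded into one predicate. This Python sketch is a hypothetical helper; how the inputs are gathered is left to the caller, and the status strings are the ones listed above.

```python
def cluster_looks_healthy(cluster_phase: str, machine_phases: list[str],
                          nodes_ready: list[bool], deploy_status: str) -> bool:
    """Return True only if every expected result listed above is met."""
    return (
        cluster_phase in {"Running", "Provisioned"}
        # An empty list means nothing was observed, which is not healthy.
        and bool(machine_phases)
        and all(phase == "Running" for phase in machine_phases)
        and bool(nodes_ready)
        and all(nodes_ready)
        and deploy_status == "Completed"
    )
```

Requiring non-empty lists avoids reporting success when the queries returned no machines or nodes at all.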

Appendix

Complete KubeadmControlPlane Configuration

Below is the complete KubeadmControlPlane configuration, including all default audit policies, admission controls, and file contents.

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: <cluster-name>-kcp
  namespace: cpaas-system
  annotations:
    controlplane.cluster.x-k8s.io/skip-coredns: ""
    controlplane.cluster.x-k8s.io/skip-kube-proxy: ""
spec:
  rolloutStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0 # Required when the cluster relies on pool-managed persistent disks
  kubeadmConfigSpec:
    users:
    - name: boot
      sshAuthorizedKeys:
      - "<ssh-authorized-keys>"
    format: ignition
    files:
    - path: /etc/kubernetes/admission/psa-config.yaml
      owner: "root:root"
      permissions: "0644"
      content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: AdmissionConfiguration
        plugins:
        - name: PodSecurity
          configuration:
            apiVersion: pod-security.admission.config.k8s.io/v1
            kind: PodSecurityConfiguration
            defaults:
              enforce: "privileged"
              enforce-version: "latest"
              audit: "baseline"
              audit-version: "latest"
              warn: "baseline"
              warn-version: "latest"
            exemptions:
              usernames: []
              runtimeClasses: []
              namespaces:
              - kube-system
              - cpaas-system
    - path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
      owner: "root:root"
      permissions: "0644"
      content: |
        {
          "apiVersion": "kubelet.config.k8s.io/v1beta1",
          "kind": "KubeletConfiguration",
          "protectKernelDefaults": true,
          "tlsCertFile": "/etc/kubernetes/pki/kubelet.crt",
          "tlsPrivateKeyFile": "/etc/kubernetes/pki/kubelet.key",
          "streamingConnectionIdleTimeout": "5m",
          "clientCAFile": "/etc/kubernetes/pki/ca.crt"
        }
    - path: /etc/kubernetes/encryption-provider.conf
      owner: "root:root"
      append: false
      permissions: "0644"
      content: |
        apiVersion: apiserver.config.k8s.io/v1
        kind: EncryptionConfiguration
        resources:
        - resources:
          - secrets
          providers:
          - aescbc:
              keys:
              - name: key1
                secret: <base64-encoded-secret>
    - path: /etc/kubernetes/audit/policy.yaml
      owner: "root:root"
      append: false
      permissions: "0644"
      content: |
        apiVersion: audit.k8s.io/v1
        kind: Policy
        omitStages:
        - "RequestReceived"
        rules:
        - level: None
          users:
          - system:kube-controller-manager
          - system:kube-scheduler
          - system:serviceaccount:kube-system:endpoint-controller
          verbs: ["get", "update"]
          namespaces: ["kube-system"]
          resources:
          - group: ""
            resources: ["endpoints"]
        - level: None
          nonResourceURLs:
          - /healthz*
          - /version
          - /swagger*
        - level: None
          resources:
          - group: ""
            resources: ["events"]
        - level: None
          resources:
          - group: "devops.alauda.io"
        - level: None
          verbs: ["get", "list", "watch"]
        - level: None
          resources:
          - group: "coordination.k8s.io"
            resources: ["leases"]
        - level: None
          resources:
          - group: "authorization.k8s.io"
            resources: ["subjectaccessreviews", "selfsubjectaccessreviews"]
          - group: "authentication.k8s.io"
            resources: ["tokenreviews"]
        - level: None
          resources:
          - group: "app.alauda.io"
            resources: ["imagewhitelists"]
          - group: "k8s.io"
            resources: ["namespaceoverviews"]
        - level: Metadata
          resources:
          - group: ""
            resources: ["secrets", "configmaps"]
        - level: Metadata
          resources:
          - group: "operator.connectors.alauda.io"
            resources: ["installmanifests"]
          - group: "operators.katanomi.dev"
            resources: ["katanomis"]
        - level: RequestResponse
          resources:
          - group: ""
          - group: "aiops.alauda.io"
          - group: "apps"
          - group: "app.k8s.io"
          - group: "authentication.istio.io"
          - group: "auth.alauda.io"
          - group: "autoscaling"
          - group: "asm.alauda.io"
          - group: "clusterregistry.k8s.io"
          - group: "crd.alauda.io"
          - group: "infrastructure.alauda.io"
          - group: "monitoring.coreos.com"
          - group: "operators.coreos.com"
          - group: "networking.istio.io"
          - group: "extensions.istio.io"
          - group: "install.istio.io"
          - group: "security.istio.io"
          - group: "telemetry.istio.io"
          - group: "opentelemetry.io"
          - group: "networking.k8s.io"
          - group: "portal.alauda.io"
          - group: "rbac.authorization.k8s.io"
          - group: "storage.k8s.io"
          - group: "tke.cloud.tencent.com"
          - group: "devopsx.alauda.io"
          - group: "core.katanomi.dev"
          - group: "deliveries.katanomi.dev"
          - group: "integrations.katanomi.dev"
          - group: "artifacts.katanomi.dev"
          - group: "builds.katanomi.dev"
          - group: "versioning.katanomi.dev"
          - group: "sources.katanomi.dev"
          - group: "tekton.dev"
          - group: "operator.tekton.dev"
          - group: "eventing.knative.dev"
          - group: "flows.knative.dev"
          - group: "messaging.knative.dev"
          - group: "operator.knative.dev"
          - group: "sources.knative.dev"
          - group: "operator.devops.alauda.io"
          - group: "flagger.app"
          - group: "jaegertracing.io"
          - group: "velero.io"
            resources: ["deletebackuprequests"]
          - group: "connectors.alauda.io"
          - group: "operator.connectors.alauda.io"
            resources: ["connectorscores", "connectorsgits", "connectorsocis"]
        - level: Metadata
    preKubeadmCommands:
    - while ! ip route | grep -q "default via"; do sleep 1; done; echo "NetworkManager started"
    - mkdir -p /run/cluster-api && restorecon -Rv /run/cluster-api
    - if [ -f /etc/disk-setup.sh ]; then bash /etc/disk-setup.sh; fi
    postKubeadmCommands:
    - chmod 600 /var/lib/kubelet/config.yaml
    clusterConfiguration:
      imageRepository: cloud.alauda.io/alauda
      dns:
        imageTag: <dns-image-tag>
      etcd:
        local:
          imageTag: <etcd-image-tag>
      apiServer:
        extraArgs:
          audit-log-format: json
          audit-log-maxage: "30"
          audit-log-maxbackup: "10"
          audit-log-maxsize: "200"
          profiling: "false"
          audit-log-mode: batch
          audit-log-path: /etc/kubernetes/audit/audit.log
          audit-policy-file: /etc/kubernetes/audit/policy.yaml
          tls-cipher-suites: "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"
          encryption-provider-config: /etc/kubernetes/encryption-provider.conf
          admission-control-config-file: /etc/kubernetes/admission/psa-config.yaml
          tls-min-version: VersionTLS12
          kubelet-certificate-authority: /etc/kubernetes/pki/ca.crt
        extraVolumes:
        - name: vol-dir-0
          hostPath: /etc/kubernetes
          mountPath: /etc/kubernetes
          pathType: Directory
      controllerManager:
        extraArgs:
          bind-address: "::"
          profiling: "false"
          tls-min-version: VersionTLS12
          flex-volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
      scheduler:
        extraArgs:
          bind-address: "::"
          tls-min-version: VersionTLS12
          profiling: "false"
    initConfiguration:
      patches:
        directory: /etc/kubernetes/patches
      nodeRegistration:
        kubeletExtraArgs:
          node-labels: "kube-ovn/role=master"
          provider-id: PROVIDER_ID
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          protect-kernel-defaults: "true"
    joinConfiguration:
      patches:
        directory: /etc/kubernetes/patches
      nodeRegistration:
        kubeletExtraArgs:
          node-ip: NODE_IP
          node-labels: "kube-ovn/role=master"
          provider-id: PROVIDER_ID
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          protect-kernel-defaults: "true"
  machineTemplate:
    nodeDrainTimeout: 1m
    nodeDeletionTimeout: 5m
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DCSMachineTemplate
      name: <cp-dcs-machine-template-name>
  replicas: 3
  version: <control-plane-kubernetes-version>
TIP

Alternative: reference a centrally managed Secret instead of inline content

The Alauda Container Platform DCS Infrastructure Provider plugin ships a Secret named dcs-kubernetes-<kubernetes-major-minor>-files in the cpaas-system namespace (for example, dcs-kubernetes-1.33-files for Kubernetes 1.33). It contains the canonical content of psa-config.yaml, control-plane-kubelet-patch.json, and audit-policy.yaml, and is updated together with each release.

When that Secret is present, you can replace the three inline files entries with contentFrom.secret references. Inline and Secret-referenced forms are functionally equivalent; using the Secret keeps file content aligned with the installed plugin version and avoids manual updates on cluster upgrades.

files:
- contentFrom:
    secret:
      key: psa-config.yaml
      name: dcs-kubernetes-1.33-files
  owner: "root:root"
  path: /etc/kubernetes/admission/psa-config.yaml
  permissions: "0644"
- contentFrom:
    secret:
      key: control-plane-kubelet-patch.json
      name: dcs-kubernetes-1.33-files
  owner: "root:root"
  path: /etc/kubernetes/patches/kubeletconfiguration0+strategic.json
  permissions: "0644"
- contentFrom:
    secret:
      key: audit-policy.yaml
      name: dcs-kubernetes-1.33-files
  owner: "root:root"
  path: /etc/kubernetes/audit/policy.yaml
  permissions: "0644"
- path: /etc/kubernetes/encryption-provider.conf
  owner: "root:root"
  append: false
  permissions: "0644"
  content: |
    apiVersion: apiserver.config.k8s.io/v1
    kind: EncryptionConfiguration
    resources:
    - resources:
      - secrets
      providers:
      - aescbc:
          keys:
          - name: key1
            secret: <base64-encoded-secret>

encryption-provider.conf is not provided by the Secret. You can either keep it inline as shown above and supply your own <base64-encoded-secret>, or omit the inline entry and rely on the version already baked into the DCS VM template image. Both are valid; the latter is simpler when the VM template's default key is acceptable for your environment.
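If you keep encryption-provider.conf inline, you need a value for <base64-encoded-secret>. The following is a minimal sketch, not an official procedure from this guide: it assumes a Linux shell with head, base64, and /dev/urandom available, and the variable names ENCRYPTION_KEY and KEY_BYTES are illustrative only. It generates a random 32-byte key (a length the aescbc provider accepts) and confirms the decoded length before you paste the value into the manifest.

```shell
# Generate 32 random bytes and base64-encode them on a single line.
# ENCRYPTION_KEY is a hypothetical variable name used only in this sketch.
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64 | tr -d '\n')

# Sanity check: the decoded key must be exactly 32 bytes long.
KEY_BYTES=$(printf '%s' "$ENCRYPTION_KEY" | base64 -d | wc -c | tr -d ' ')
echo "decoded key length: ${KEY_BYTES} bytes"
```

Treat the generated value as sensitive: anyone who holds it can decrypt Secrets stored in etcd, so avoid committing it to version control alongside the manifest.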

Minimum plugin version: This Secret is shipped by the DCS Provider plugin starting from v1.0.13. On older plugin versions the Secret does not exist; keep the inline content: form in that case. To check whether the Secret is present on the target cluster before deciding which form to use, run:

# Replace <kubernetes-major-minor> with the value matching this cluster
# (for example, 1.33 for Kubernetes v1.33.x).
K8S_MM=<kubernetes-major-minor>
# Note: any kubectl failure (including missing permissions) is reported as "missing".
if kubectl -n cpaas-system get secret "dcs-kubernetes-${K8S_MM}-files" >/dev/null 2>&1; then
  echo "Secret present: contentFrom.secret form is supported"
else
  echo "Secret missing: use inline content form"
fi
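
The K8S_MM value used in the check can also be derived from a full Kubernetes version string instead of being typed by hand. This is a sketch under stated assumptions: K8S_VERSION is a hypothetical variable, and v1.33.4 is an example value standing in for whatever you set as <control-plane-kubernetes-version>.

```shell
# Hypothetical example value; substitute the version used in your manifest.
K8S_VERSION="v1.33.4"

# Strip an optional leading "v", then keep only the major and minor fields.
K8S_MM=$(printf '%s\n' "${K8S_VERSION#v}" | cut -d. -f1,2)

# Print the Secret name that the check looks up.
echo "dcs-kubernetes-${K8S_MM}-files"
```

With the example value this prints dcs-kubernetes-1.33-files, which matches the Secret name used in the contentFrom.secret entries above.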

Next Steps

After creating a cluster:

Troubleshooting

If the cluster reaches Provisioned but never becomes Ready — for example, workload nodes stay NotReady because the CNI is not deployed — see Troubleshoot a Workload Cluster Stuck in Provisioned.