Problem:

A telecommunications customer operating a production Kubernetes cluster deployed via Kubespray faced an infrastructure challenge. One of their original control-plane nodes (kz-bss-k8om01) had failed and been replaced with a new node named kz-bss-k8om04. The client later asked to rename this node back to its original FQDN (kz-bss-k8om01) and, ideally, to restore the original IP address as well. The change was essential for the consistency of their monitoring tools and automation workflows, many of which depended on node naming conventions.

However, Kubernetes ties a node's identity to its hostname and IP address: both appear in the node object, in TLS certificate subject alternative names, and in the etcd member registry. Renaming a node after deployment, especially a control-plane node, is highly risky and not officially supported, as it can jeopardize cluster stability.
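To see how deeply this identity is baked in, one can inspect the node object and the API server certificate on any control-plane node. A minimal sketch, assuming a kubeadm-style certificate layout (the path varies by installation; Kubespray may place certificates under /etc/kubernetes/ssl):

    # The node is registered in the API under its hostname and internal IP
    kubectl get nodes -o wide

    # The API server certificate pins those same names and IPs as SANs
    # (path assumes a kubeadm-style layout; adjust for your install)
    sudo openssl x509 -noout -text -in /etc/kubernetes/pki/apiserver.crt \
      | grep -A1 'Subject Alternative Name'

If either the hostname or the IP changes, the certificates and the etcd peer registration no longer match the node presenting them.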

Process:

Step 1: Initial Analysis

Our expert began by assessing the cluster setup and the risks of renaming a node that was already part of the control plane. None of the existing community workarounds offered a safe, supportable method for renaming a control-plane node in place: all carried a high risk of breaking etcd quorum, invalidating TLS certificates, or corrupting the node's registration.
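Before touching anything, it is worth capturing a baseline of cluster and etcd health to compare against after the change. A sketch using Kubespray's default etcd certificate paths (the paths and certificate names are assumptions; adjust to your environment):

    # Baseline: all nodes Ready, system pods healthy
    kubectl get nodes -o wide
    kubectl get pods -n kube-system -o wide

    # Baseline: three registered, healthy etcd members
    # (cert names follow Kubespray's admin-<hostname>.pem convention)
    sudo ETCDCTL_API=3 etcdctl member list \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/ssl/etcd/ssl/ca.pem \
      --cert=/etc/ssl/etcd/ssl/admin-$(hostname).pem \
      --key=/etc/ssl/etcd/ssl/admin-$(hostname)-key.pem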

Step 2: Infrastructure Review

The client shared details of the networking setup: the cluster uses an external load balancer (Radware Alteon), /etc/hosts is managed manually, and DNS resolution is handled by an internal DNS server. Based on this, the expert confirmed that the node’s removal and re-addition would be feasible as long as the remaining two control-plane nodes stayed operational during the process.
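Because /etc/hosts is maintained by hand and DNS is internal, resolution of the reclaimed hostname should be verified on every node before the re-addition. A quick check (the domain suffix and DNS server address below are placeholders, not the client's actual values):

    # /etc/hosts and NSS resolution must agree on the reclaimed name
    getent hosts kz-bss-k8om01

    # The internal DNS server must return the reclaimed IP
    # (example.internal and 10.0.0.53 are placeholders)
    dig +short kz-bss-k8om01.example.internal @10.0.0.53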

Step 3: Strategic Recommendation

The expert recommended a three-step process, sketched below:

  • Gracefully remove kz-bss-k8om04 from the cluster using Kubespray’s remove-node.yml playbook.
  • Update the inventory with the desired hostname and IP (kz-bss-k8om01, 10.172.224.22).
  • Re-add the node as a new control-plane member using Kubespray’s cluster.yml playbook.

This approach, sketched below, would ensure a clean configuration and avoid a direct rename, which is not supported.
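A condensed sketch of the flow, assuming a standard Kubespray checkout with an inventory at inventory/mycluster/hosts.yaml (the inventory path and flags are illustrative):

    # Step 1: gracefully drain, remove, and reset the old node
    ansible-playbook -i inventory/mycluster/hosts.yaml remove-node.yml \
      -b -e node=kz-bss-k8om04

    # Step 2: edit the inventory, replacing kz-bss-k8om04 with kz-bss-k8om01
    # (ansible_host: 10.172.224.22) in the kube_control_plane and etcd groups

    # Step 3: re-run the full playbook to join the node as a fresh
    # control-plane and etcd member under its original identity
    ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml -b

Running cluster.yml, rather than scale.yml (which only adds worker nodes), is what regenerates the certificates and etcd membership for the rejoining control-plane node.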

Step 4: Validating the Plan with the Client

The client expressed concerns about IP reuse. The expert confirmed that once the old node is removed and its configuration cleared, the same IP can be reused when the node rejoins. The client also asked about expected downtime and an appropriate maintenance-window duration; the expert confirmed that while no downtime was expected, a 4–6 hour maintenance window would be prudent.

Step 5: Defining Safeguards

To mitigate risks, the expert advised the following safeguards (illustrative commands follow the list):

  • Backing up etcd manually using etcdctl snapshot save.
  • Ensuring the load balancer marks the node being removed as offline before changes begin.
  • Verifying cluster health with kubectl after the node replacement.
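Illustrative commands for the first and third safeguards (the snapshot path and certificate locations are assumptions based on Kubespray defaults; the second safeguard is performed in the Alteon console and is not shown):

    # Safeguard 1: manual etcd snapshot before any change
    sudo ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-pre-replace.db \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/ssl/etcd/ssl/ca.pem \
      --cert=/etc/ssl/etcd/ssl/admin-$(hostname).pem \
      --key=/etc/ssl/etcd/ssl/admin-$(hostname)-key.pem

    # Verify the snapshot file is readable and non-empty
    sudo ETCDCTL_API=3 etcdctl snapshot status /var/backups/etcd-pre-replace.db

    # Safeguard 3: post-replacement health verification
    kubectl get nodes -o wide          # new node Ready under kz-bss-k8om01
    kubectl get pods -n kube-system    # control-plane pods Running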

Solution:

By following a replace-not-rename strategy, the expert enabled the client to safely reintroduce the control-plane node under its original hostname and IP. Kubespray ensured that the removal and re-addition processes were automated, auditable, and consistent with best practices. This method also preserved etcd stability and control-plane quorum.

The node was removed and re-added under its original hostname and IP, with no downtime or service disruption, and the cluster continued to operate as expected.

Conclusion:

Renaming a control-plane node in Kubernetes is inherently risky and not officially supported; manual attempts in production clusters commonly break etcd quorum or invalidate TLS certificates. By removing and cleanly re-adding the node with Kubespray, the expert helped the client achieve the desired outcome without compromising cluster integrity.