Updating Node CIDR for existing RKE1 k8s cluster

Purushotham Reddy
10 min read · Jun 25, 2024


Introduction

In this blog we are going to discuss how to update the Node CIDR for existing RKE1 clusters.

But why do we need to do that? Let me walk you through a scenario where this becomes necessary.

Scenario

When an RKE1 k8s cluster is provisioned, it uses default configuration values for most settings. One of those defaults is the ClusterCIDR.

By default the ClusterCIDR is 10.42.0.0/16, out of which a /24 CIDR is assigned to each node as its NodeCIDR (for example 10.42.0.0/24). That means each node gets 256 IP addresses, so each node can accommodate at most around 256 pods. This may be acceptable for dev and sandbox clusters, but it is not an ideal value for production environments.
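
To put those numbers in perspective, here is a quick back-of-the-envelope check using plain bash arithmetic (no cluster access needed); a few addresses in each range are reserved, so the usable pod count is slightly lower.

# Addresses available per node for a few common node CIDR sizes
for mask in 24 22 20; do
  echo "/$mask -> $(( 2 ** (32 - mask) )) addresses per node"
done

# /24 -> 256 addresses per node
# /22 -> 1024 addresses per node
# /20 -> 4096 addresses per node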

Prerequisites

  • RKE1 cluster
  • Rook Ceph storage solution (optional)
  • kubectl

Inspecting Existing RKE1 cluster

Let's verify the existing RKE1 cluster.

$ kubectl get nodes

NAME                 STATUS   ROLES                      AGE     VERSION
rke-controlplane-1   Ready    controlplane,etcd,worker   2m43s   v1.27.14
rke-controlplane-2   Ready    controlplane,etcd,worker   2m43s   v1.27.14
rke-controlplane-3   Ready    controlplane,etcd,worker   2m43s   v1.27.14
rke-worker-1         Ready    worker                     2m40s   v1.27.14
rke-worker-2         Ready    worker                     2m40s   v1.27.14
rke-worker-3         Ready    worker                     2m40s   v1.27.14

As shown in the above output, we have a 6-node k8s cluster (3 controlplane nodes and 3 worker nodes). The cluster.yml used to set up this cluster is here.

Let's describe each node and look for its node CIDR.

$ kubectl describe node rke-controlplane-1 | grep -i podcidr

PodCIDR: 10.42.0.0/20
PodCIDRs: 10.42.0.0/20

$ kubectl describe node rke-controlplane-2 | grep -i podcidr

PodCIDR: 10.42.32.0/20
PodCIDRs: 10.42.32.0/20

$ kubectl describe node rke-controlplane-3 | grep -i podcidr

PodCIDR: 10.42.16.0/20
PodCIDRs: 10.42.16.0/20

$ kubectl describe node rke-worker-1 | grep -i podcidr

PodCIDR: 10.42.48.0/20
PodCIDRs: 10.42.48.0/20

$ kubectl describe node rke-worker-2 | grep -i podcidr

PodCIDR: 10.42.64.0/20
PodCIDRs: 10.42.64.0/20

$ kubectl describe node rke-worker-3 | grep -i podcidr

PodCIDR: 10.42.80.0/20
PodCIDRs: 10.42.80.0/20

As shown in the above output, each node has a /20 CIDR (in my case it was already updated to /20; you will see /24 on a freshly created RKE1 cluster).
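
If you prefer a single command over describing each node, kubectl's custom-columns output (available in any recent kubectl) can print every node's pod CIDR at once:

kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR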

I am using the Rook Ceph storage solution (mentioned in the prerequisites; it is optional for this blog) for storage.

$ kubectl get pods -n rook-ceph

NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-4kw84 2/2 Running 0 16m
csi-cephfsplugin-bqvlg 2/2 Running 0 16m
csi-cephfsplugin-bxwnk 2/2 Running 0 16m
csi-cephfsplugin-dccs5 2/2 Running 0 16m
csi-cephfsplugin-kkcb7 2/2 Running 0 16m
csi-cephfsplugin-provisioner-65c6bccc4-dfflv 5/5 Running 0 16m
csi-cephfsplugin-provisioner-65c6bccc4-kbsk7 5/5 Running 0 16m
csi-cephfsplugin-wnj8q 2/2 Running 0 16m
csi-rbdplugin-8rjwz 2/2 Running 0 16m
csi-rbdplugin-dgdhv 2/2 Running 0 16m
csi-rbdplugin-gxbg7 2/2 Running 0 16m
csi-rbdplugin-jqrcf 2/2 Running 0 16m
csi-rbdplugin-provisioner-596ffd547f-dfw59 5/5 Running 0 16m
csi-rbdplugin-provisioner-596ffd547f-tgmwp 5/5 Running 0 16m
csi-rbdplugin-qw7sl 2/2 Running 0 16m
csi-rbdplugin-r2lfp 2/2 Running 0 16m
rook-ceph-crashcollector-rke-controlplane-1-6bd867bcd-6mskc 1/1 Running 0 15m
rook-ceph-crashcollector-rke-controlplane-2-55c868489c-n2xjb 1/1 Running 0 14m
rook-ceph-crashcollector-rke-controlplane-3-5fbf978988-bhl5n 1/1 Running 0 15m
rook-ceph-crashcollector-rke-worker-1-5645f8b599-fvnt5 1/1 Running 0 14m
rook-ceph-crashcollector-rke-worker-2-5975ddfbb8-ftgt2 1/1 Running 0 15m
rook-ceph-crashcollector-rke-worker-3-56464f59bc-8j4p5 1/1 Running 0 15m
rook-ceph-exporter-rke-controlplane-1-76c86fbf5f-66rxb 1/1 Running 0 15m
rook-ceph-exporter-rke-controlplane-2-65fbf7fc6-qz4sn 1/1 Running 0 14m
rook-ceph-exporter-rke-controlplane-3-5477b5bcd9-r4gt2 1/1 Running 0 15m
rook-ceph-exporter-rke-worker-1-6fc6db6bd6-7vdjw 1/1 Running 0 14m
rook-ceph-exporter-rke-worker-2-9d58c969-kkh7n 1/1 Running 0 15m
rook-ceph-exporter-rke-worker-3-6bcfd87758-dkz8c 1/1 Running 0 15m
rook-ceph-mds-myfs-a-b85499cbd-xjxth 2/2 Running 0 14m
rook-ceph-mds-myfs-b-5c5c8cdd46-hx8dz 2/2 Running 0 14m
rook-ceph-mgr-a-598bbb9954-7wqgp 3/3 Running 0 15m
rook-ceph-mgr-b-9f696f74c-9n94w 3/3 Running 0 15m
rook-ceph-mon-a-db957cbb5-cndxc 2/2 Running 0 16m
rook-ceph-mon-b-5b8dcd46f7-dbxzc 2/2 Running 0 16m
rook-ceph-mon-c-6547c779d6-gpsfw 2/2 Running 0 16m
rook-ceph-operator-7944d99f56-pbwzj 1/1 Running 0 18m
rook-ceph-osd-0-8594868cdd-4m4rc 2/2 Running 0 15m
rook-ceph-osd-1-574cd9949-vt297 2/2 Running 0 15m
rook-ceph-osd-2-76d95578c9-jtbj4 2/2 Running 0 15m
rook-ceph-osd-3-5d9b6d5f66-d7qjk 2/2 Running 0 15m
rook-ceph-osd-4-66dbf7cd99-c6qjr 2/2 Running 0 15m
rook-ceph-osd-5-55978d7548-9295c 2/2 Running 0 15m
rook-ceph-osd-prepare-rke-controlplane-1-54sts 0/1 Completed 0 14m
rook-ceph-osd-prepare-rke-controlplane-2-mz4mh 0/1 Completed 0 14m
rook-ceph-osd-prepare-rke-controlplane-3-2q7q9 0/1 Completed 0 14m
rook-ceph-osd-prepare-rke-worker-1-v7qx5 0/1 Completed 0 14m
rook-ceph-osd-prepare-rke-worker-2-7cpmk 0/1 Completed 0 14m
rook-ceph-osd-prepare-rke-worker-3-rs9j9 0/1 Completed 0 14m

Now let's deploy some StatefulSet pods to the existing cluster. We are deploying these so we can verify that the pods come back up (with all their data) even after our CIDR update. Execute the commands below.

git clone https://github.com/purushothamkdr453/rke1-learning.git

kubectl apply -f ./rke1-learning/cidr-update/sql-statefulset/sql-pass.yaml

kubectl apply -f ./rke1-learning/cidr-update/sql-statefulset/sts.yaml

List the StatefulSet pods.

$ kubectl get pods -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
mysql-set-0   1/1     Running   0          18m   10.42.64.17   rke-worker-2         <none>           <none>
mysql-set-1   1/1     Running   0          13m   10.42.80.10   rke-worker-3         <none>           <none>
mysql-set-2   1/1     Running   0          13m   10.42.16.11   rke-controlplane-3   <none>           <none>
mysql-set-3   1/1     Running   0          13m   10.42.48.19   rke-worker-1         <none>           <none>
mysql-set-4   1/1     Running   0          13m   10.42.32.14   rke-controlplane-2   <none>           <none>
mysql-set-5   1/1     Running   0          12m   10.42.0.12    rke-controlplane-1   <none>           <none>

Exec into any one of the pods (for example mysql-set-0, which is scheduled on the rke-worker-2 node) and run the commands below (refer to the commented sections in the output for better understanding).

$ kubectl exec -it mysql-set-0 bash

kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.

# Connect to Mysql

bash-4.2# mysql -u root -p # enter "password" when prompted
Enter password:

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.44 MySQL Community Server (GPL)

Copyright (c) 2000, 2023, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

# Creating new database

mysql> create database employees; # Creating new database employees
Query OK, 1 row affected (0.00 sec)

# Connecting to database

mysql> use employees; # Connect to employees database
Database changed
mysql> CREATE TABLE Persons (
-> PersonID int,
-> LastName varchar(255),
-> FirstName varchar(255),
-> Address varchar(255),
-> City varchar(255)
-> ); # Creating New table Named Persons
Query OK, 0 rows affected (0.14 sec)

# Inserting content into table

mysql> INSERT INTO Persons (PersonID, LastName, FirstName, Address, City) VALUES ('100','reddy','purushotham','Mallapur','Hyderabad');
Query OK, 1 row affected (0.05 sec) # Inserting content into table

# Printing table content

mysql> select * from Persons;
+----------+----------+-------------+----------+-----------+
| PersonID | LastName | FirstName | Address | City |
+----------+----------+-------------+----------+-----------+
| 100 | reddy | purushotham | Mallapur | Hyderabad |
+----------+----------+-------------+----------+-----------+
1 row in set (0.00 sec)

Now repeat the same process for the other replicas (StatefulSet pods) and insert some different content into the table.
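
If you would rather script this than exec into each replica by hand, a loop along the lines below works. It is only a sketch: it assumes the root password is "password" (as set via sql-pass.yaml) and that all six replicas are running; the marker rows it inserts are arbitrary.

# Insert a distinct marker row into each StatefulSet replica (hypothetical helper loop)
for i in 0 1 2 3 4 5; do
  kubectl exec mysql-set-$i -- mysql -u root -ppassword -e \
    "CREATE DATABASE IF NOT EXISTS employees;
     CREATE TABLE IF NOT EXISTS employees.Persons (PersonID int, LastName varchar(255), FirstName varchar(255), Address varchar(255), City varchar(255));
     INSERT INTO employees.Persons VALUES ($i, 'replica', 'mysql-set-$i', 'test-address', 'test-city');"
done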

CIDR Update Process

Changing the node CIDR for an existing RKE1 cluster involves 5 steps:

  • Add/update the node-cidr-mask-size attribute inside cluster.yml
  • Add the SubnetLen attribute to the canal configmap and restart the canal and kube-controller pods
  • Drain the node whose CIDR you want to update
  • Take the node out of the cluster, i.e. comment out the drained node's configuration inside cluster.yml and run rke up
  • Add the node back to the cluster, i.e. uncomment the drained node's configuration (refer to Step-3) inside cluster.yml and run rke up

Repeat Step-3, Step-4, and Step-5 for every node in the cluster. We will go through each step in detail below.

Step-1: Adding/Updating node-cidr-mask-size attribute inside cluster.yml

Add the below configuration to the existing cluster.yml file. Refer to the complete cluster.yml file here.


services:
  kube-controller:
    cluster_cidr: 10.42.0.0/16
    extra_args:
      # for an existing cluster we already have this value as 20, so changing it to 22
      node-cidr-mask-size: '22'
  kubelet:
    extra_args:
      max-pods: 4000

After making the above change, execute the command below.


rke up

The updated node-cidr-mask-size will be reflected under the args section of the kube-controller-manager container. This can be checked using the command below.

docker inspect kube-controller-manager --format json | jq '.[].Args'
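
If your Docker version does not accept --format json, an equivalent check with a Go template (run on a controlplane node) looks like this:

docker inspect kube-controller-manager \
  --format '{{range .Args}}{{println .}}{{end}}' | grep node-cidr-mask-size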

Step-2: Adding SubnetLen attribute to canal configmap and restarting canal and kube-controller pods

Edit the canal configmap and add the SubnetLen attribute with a value of 22 (the new node CIDR size) under the net-conf.json key. Execute the commands below.


kubectl -n kube-system edit configmap canal-config

net-conf.json: |
  {
    "Network": "10.42.0.0/16",
    "SubnetLen": 22,
    "Backend": {
      "Type": "vxlan"
    }
  }
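
To double-check the edit without reopening the editor, you can print the key back out (the backslash escapes the dot in the key name):

kubectl -n kube-system get configmap canal-config -o jsonpath='{.data.net-conf\.json}'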

Then restart the canal daemonset and the calico-kube-controllers deployment. Execute the commands below.


# Restarting Canal Daemonset

kubectl -n kube-system rollout restart daemonset/canal

# Restarting kube controller deployment

kubectl -n kube-system rollout restart deployment/calico-kube-controllers

Make sure all the canal daemonset pods are restarted and running before proceeding further.
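
A convenient way to wait for both rollouts to finish before moving on:

kubectl -n kube-system rollout status daemonset/canal
kubectl -n kube-system rollout status deployment/calico-kube-controllers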

Step-3: Drain the node for which you want to update the CIDR

I am going to drain the node named "rke-worker-2". You can choose any node; the reason I chose this one is that it is where the mysql-set-0 pod is scheduled, on which we created a database and a table and inserted some content (refer to the previous sections). Execute the command below to drain the node.


kubectl drain rke-worker-2 --ignore-daemonsets --delete-emptydir-data=true

Wait until the above command completes successfully.
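
You can confirm the node has been cordoned; the STATUS column should show Ready,SchedulingDisabled:

kubectl get node rke-worker-2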

Now list the pods.


kubectl get pods -o wide

As you will notice from the output, the mysql-set-0 pod is now scheduled on the rke-worker-1 node (it moved off rke-worker-2). The content must still be there; you can verify it by exec-ing into the pod and running the SQL queries.
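
A quick non-interactive way to re-check the data (again assuming the root password is "password"):

kubectl exec mysql-set-0 -- mysql -u root -ppassword -e "SELECT * FROM employees.Persons;"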

Step-4: Comment out the drained node configuration inside cluster.yml

Open the cluster.yml file (the one used for provisioning the RKE1 cluster) and comment out the drained node's configuration (in this case rke-worker-2).

After commenting it out, the nodes section should look something like this.
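
Since the screenshot is not reproduced here, a rough sketch of the commented-out node entry is shown below; the address and SSH user are placeholders, not the actual values from my cluster.yml.

nodes:
  # ... other node entries stay unchanged ...
  # - address: <rke-worker-2-ip>
  #   user: <ssh-user>
  #   role:
  #     - worker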

Now execute the rke up command.


rke up

Make sure the above command completes successfully.

Now if you list the nodes, you should see only 5 nodes in the cluster.

Step-5: Uncomment the drained node configuration inside cluster.yml

Now let's re-add the drained node (refer to Step-3) back to the cluster by uncommenting its configuration inside cluster.yml.

After uncommenting, run the rke up command.


rke up

Make sure the above command completes successfully.

Now if you list the nodes, you should see 6 nodes this time. The previously drained/commented node (rke-worker-2) has been added back to the cluster.

Now let's describe the node and look for its PodCIDR.


kubectl describe node rke-worker-2 | grep -i PodCIDRs

As you can see in the output, the CIDR has been updated to /22.

We have updated the CIDR for only one node, i.e. rke-worker-2. This process has to be repeated for all the other nodes: basically Step-3, Step-4, and Step-5 have to be repeated for each remaining node in the cluster, as sketched below.
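
As a rough checklist (a sketch, not a script to run blindly), the per-node rotation looks like this:

# 1. kubectl drain <node> --ignore-daemonsets --delete-emptydir-data=true
# 2. Comment out the node's entry in cluster.yml, then: rke up
# 3. Uncomment the entry, then: rke up
# 4. kubectl describe node <node> | grep -i podcidr   # should now show /22
# 5. kubectl get node <node>                          # confirm it is Ready and schedulable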

Now you can exec into the mysql-set-0 pod and look for the previously inserted content to make sure it was not lost while updating the CIDR.

Important things to consider:

  1. When you drain a node (Step-3) you may sometimes see warnings that a pod eviction is blocked by PDBs (PodDisruptionBudgets). Adjust the PDBs so the drain can proceed; a quick way to list them is shown below.
  2. Make sure all the k8s components are up and running (coredns, canal pods, apiserver, etc.).
  3. If you are using Rook, make sure the Rook operator pod is up and running and has connectivity to the mon pods.
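
A quick way to see which PodDisruptionBudgets exist (and might block a drain):

kubectl get pdb --all-namespaces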

By following the above steps we can easily update the CIDR of an existing RKE1 cluster. Try them out and let me know if you run into any issues. Feel free to comment if you have any questions.
