This hands-on tutorial walks you through building a high-availability (HA) Kubernetes cluster using K3s on Ubuntu Server 24.04. We’ll use three VMs (control-plane + embedded etcd) so the control plane is fault-tolerant and production-style — perfect for homelab and career practice.
You’ll do this:
- Create VM #1 and apply a clean Ubuntu baseline (swap off, NTP, firewall, kernel modules).
- (Important when cloning) Regenerate machine-id so clones get unique identity.
- Turn VM #1 into cp1 (cluster init).
- Join cp2 and cp3 as additional control-plane nodes.
- Verify nodes and etcd membership.
We’re using K3s because it’s upstream Kubernetes with a lighter footprint and simple HA (embedded etcd). Everything here works great on Proxmox VE, ESXi, or any other hypervisor.
Topology & IP Plan
| Node | Hostname | Role | IP |
|---|---|---|---|
| VM #1 | k3s-cp1 | control-plane + etcd (init) | 10.11.12.11 |
| VM #2 | k3s-cp2 | control-plane + etcd (join) | 10.11.12.12 |
| VM #3 | k3s-cp3 | control-plane + etcd (join) | 10.11.12.13 |
Adjust addresses for your LAN. Use static IPs.
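Static addressing on Ubuntu 24.04 is done with netplan. Here's a minimal sketch for cp1, assuming the NIC is named `ens18` and the gateway/DNS live at `10.11.12.1` (both are assumptions — check `ip link` and your router):

```yaml
# /etc/netplan/01-static.yaml — example for k3s-cp1
# Adjust the interface name, gateway, and DNS for your LAN.
network:
  version: 2
  ethernets:
    ens18:
      dhcp4: false
      addresses: [10.11.12.11/24]
      routes:
        - to: default
          via: 10.11.12.1
      nameservers:
        addresses: [10.11.12.1]
```

Apply with `netplan apply`, then repeat on cp2/cp3 with `.12`/`.13`.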
0) Create VM #1 and Install Ubuntu 24.04
In your hypervisor (e.g., Proxmox), create a VM with:
- 2 vCPU (min), 4–8 GB RAM (more is nicer), 20+ GB disk
- VirtIO SCSI disk, VirtIO NIC (or default)
- Enable the QEMU Guest Agent
Install Ubuntu Server 24.04 LTS (minimal is fine). After first boot, SSH in.
1) Base OS Baseline (Do this on each control-plane VM — cp1, cp2, cp3)
Start on VM #1 now; you’ll repeat later on VM #2 and VM #3.
sudo -s
# Keep the OS fresh
apt update && apt -y full-upgrade
# Handy & needed packages
apt -y install qemu-guest-agent curl ufw htop ca-certificates gnupg
systemctl enable --now qemu-guest-agent
# Disable swap (Kubernetes requirement)
swapoff -a
sed -i -E '/\sswap\s/ s/^/#/' /etc/fstab   # matches tab- or space-separated fstab entries
free -h # Swap should now be 0
# Enable NTP (TLS and etcd hate clock skew)
timedatectl set-ntp true
timedatectl status
# Kernel modules K8s expects
modprobe br_netfilter
modprobe overlay
tee /etc/modules-load.d/kubernetes.conf >/dev/null <<'EOF'
br_netfilter
overlay
EOF
# Netfilter settings
tee /etc/sysctl.d/99-kubernetes.conf >/dev/null <<'EOF'
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
EOF
sysctl --system
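If you're curious what the swap edit in the baseline actually does, here's the same idea run against a scratch copy of a typical fstab (sample content assumed), using a pattern (`\sswap\s`) that tolerates both tab- and space-separated entries:

```shell
# Scratch copy with a typical Ubuntu swap entry (tab-separated)
printf '/dev/sda1 / ext4 defaults 0 1\n/swap.img\tnone\tswap\tsw\t0\t0\n' > /tmp/fstab.demo
# Comment out any line whose "type" field is swap
sed -i -E '/\sswap\s/ s/^/#/' /tmp/fstab.demo
cat /tmp/fstab.demo   # the swap line is now commented; the root fs line is untouched
```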
UFW (Firewall) — Open the right ports between the nodes
Replace CIDR with your LAN if needed.
ufw allow OpenSSH
ufw allow from 10.11.12.0/24 to any port 6443 proto tcp # Kubernetes API
ufw allow from 10.11.12.0/24 to any port 2379:2380 proto tcp # etcd peer/client
ufw allow from 10.11.12.0/24 to any port 10250 proto tcp # kubelet
ufw allow from 10.11.12.0/24 to any port 8472 proto udp # flannel VXLAN (default CNI in K3s)
ufw --force enable
ufw status
2) (Cloning Tip) Regenerate machine-id Before You Clone
If you plan to clone VM #1 to create VM #2 and VM #3, you must give each clone a unique machine ID. Run the following on each clone right after its first boot (or empty /etc/machine-id in the template before cloning so systemd regenerates it automatically):
# Regenerate machine-id (run as root)
systemctl stop systemd-machine-id-commit.service 2>/dev/null || true
rm -f /etc/machine-id /var/lib/dbus/machine-id
systemd-machine-id-setup
ln -sf /etc/machine-id /var/lib/dbus/machine-id
cat /etc/machine-id
That’s all you need — it prevents “duplicate node name/identity” problems later.
(If you prefer reading more, check `man machine-id` on Ubuntu.)
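If you want a quick sanity check that regeneration produced a well-formed ID (a single 32-character lowercase hex string), something like this works — the value below is made up for illustration:

```shell
# Hypothetical machine-id; on a real node use: MID=$(cat /etc/machine-id)
MID='3f4a9c2b1d6e8f0a2b4c6d8e0f1a3b5c'
if printf '%s' "$MID" | grep -Eq '^[0-9a-f]{32}$'; then
  echo "machine-id format OK"
else
  echo "machine-id looks malformed"
fi
```

Comparing `cat /etc/machine-id` across cp1/cp2/cp3 should also show three different values.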
Now set each VM’s hostname and IP:
# Example on VM #1
hostnamectl set-hostname k3s-cp1
# On VM #2 use k3s-cp2, on VM #3 use k3s-cp3
# Ensure /etc/hosts has proper lines:
# (edit to reflect your network)
tee -a /etc/hosts >/dev/null <<'EOF'
10.11.12.11 k3s-cp1
10.11.12.12 k3s-cp2
10.11.12.13 k3s-cp3
EOF
Reboot if you changed hostnames/networking:
reboot
3) Configure cp1 (Cluster Init)
On k3s-cp1 (10.11.12.11):
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --cluster-init" sh -
systemctl status k3s --no-pager
Check node state:
k3s kubectl get nodes -o wide
You should see k3s-cp1 as Ready.
Export kubeconfig for your user (optional convenience):
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $USER:$USER ~/.kube/config
# Replace localhost with LAN IP so you can kubectl from your workstation if needed
sed -i 's/127\.0\.0\.1/10.11.12.11/' ~/.kube/config
kubectl get pods -A
Get the full join token (it includes the CA hash)
sudo cat /var/lib/rancher/k3s/server/node-token
Copy the entire string: it begins with the cluster CA hash followed by a `::` separator, so don't truncate it.
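A K3s server token normally has the shape `K10<ca-hash>::<user>:<password>`. Here's a quick shell check that what you pasted isn't truncated — the value below is a made-up placeholder, not a real token:

```shell
# Placeholder token for illustration; substitute the real value from node-token
TOKEN='K10deadbeefcafe0123::server:0123456789abcdef'
case "$TOKEN" in
  K10*::*:*) echo "token looks complete" ;;
  *)         echo "token looks truncated" ;;
esac
```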
4) Prepare cp2 and cp3 (Baseline + Hostname + Static IP)
Repeat Section 1 (baseline) on k3s-cp2 and k3s-cp3:
- Packages, swap off, NTP on, kernel modules/sysctl, UFW rules
- Set hostname (`k3s-cp2`, `k3s-cp3`)
- Confirm IPs (`10.11.12.12`, `10.11.12.13`)
- If cloned from cp1, regenerate machine-id (Section 2)
Sanity checks:
hostname
ip -4 a
free -h # swap 0
timedatectl # NTP synchronized: yes
ping -c2 10.11.12.11
curl -k https://10.11.12.11:6443/version # should return JSON with gitVersion
5) Join cp2 as an HA Control-Plane Node
On k3s-cp2 (10.11.12.12):
export K3S_URL="https://10.11.12.11:6443"
export K3S_TOKEN="<PASTE_FULL_TOKEN_FROM_CP1>"
curl -sfL https://get.k3s.io | \
INSTALL_K3S_EXEC="server --server ${K3S_URL} --node-name k3s-cp2" sh -
systemctl status k3s --no-pager
Verify from cp1:
k3s kubectl get nodes -o wide
Expected: k3s-cp1 and k3s-cp2 both Ready (roles include control-plane,etcd).
6) Join cp3 as an HA Control-Plane Node
On k3s-cp3 (10.11.12.13):
export K3S_URL="https://10.11.12.11:6443"
export K3S_TOKEN="<PASTE_FULL_TOKEN_FROM_CP1>"
curl -sfL https://get.k3s.io | \
INSTALL_K3S_EXEC="server --server ${K3S_URL} --node-name k3s-cp3" sh -
systemctl status k3s --no-pager
Verify from cp1:
k3s kubectl get nodes -o wide
Expected: all three nodes Ready with roles control-plane,etcd.
7) Deep Verification
a) Nodes
k3s kubectl get nodes -o wide
Should list:
NAME STATUS ROLES VERSION INTERNAL-IP
k3s-cp1 Ready control-plane,etcd,master v1.33.x+k3s1 10.11.12.11
k3s-cp2 Ready control-plane,etcd,master v1.33.x+k3s1 10.11.12.12
k3s-cp3 Ready control-plane,etcd,master v1.33.x+k3s1 10.11.12.13
b) Etcd members (from cp1)
K3s doesn't bundle a standalone `etcdctl` binary, so install one first (e.g. `apt -y install etcd-client`) and point it at the embedded etcd's certificates:
ETCDCTL_API=3 etcdctl --write-out=table \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/k3s/server/tls/etcd/client.crt \
  --key=/var/lib/rancher/k3s/server/tls/etcd/client.key \
  member list
You should see 3 members.
c) System pods
k3s kubectl get pods -n kube-system -o wide
All core components should be Running (coredns, local-path-provisioner, flannel, metrics-server, etc.)
8) Common Join Errors & Quick Fixes
- “not authorized” / CA not trusted
  - You probably used an incomplete token. Re-copy it from cp1 and make sure it includes the CA hash before the `::` separator.
- “duplicate node name found”
  - The clone kept old K3s state or the same hostname/machine-id. Fix on the joining node:

    /usr/local/bin/k3s-uninstall.sh 2>/dev/null || true
    rm -rf /var/lib/rancher/k3s /etc/rancher/k3s
    hostnamectl set-hostname <unique-name>
    # (If cloned) regenerate machine-id as shown earlier

    Then re-join with `--node-name <unique-name>`.
- API not reachable
  - Open the UFW ports (6443, 2379–2380, 10250, 8472/udp) and test with:

    curl -k https://10.11.12.11:6443/version
- TLS/etcd weirdness
  - Ensure time is in sync on all nodes: `timedatectl status` → NTP synchronized: yes
- Swap enabled
  - Must be off on every node (`free -h`).
To see errors quickly:
# on the failing node
journalctl -u k3s -n 200 --no-pager
9) (Optional) Quality-of-Life Installs Next
- MetalLB → Allocate EXTERNAL-IPs on your LAN for LoadBalancer services.
- Longhorn → Simple, powerful persistent storage with Web UI.
- Kubernetes Dashboard or Rancher → Full Web UIs for your cluster.
- cert-manager + ExternalDNS → Automatic HTTPS + DNS.
- Argo CD → GitOps.
- kube-prometheus-stack → Prometheus, Alertmanager, Grafana, Node Exporter.
If you want, I can extend this post with a clean “Part 2” covering MetalLB + Longhorn + Dashboard/Rancher setup.
10) Summary
You now have a production-style HA Kubernetes control plane running K3s + embedded etcd across three Ubuntu 24.04 VMs:
- Resilient control plane (3 etcd members)
- Clean baselines (swap off, NTP, firewall, kernel modules)
- Safe cloning process (machine-id regeneration)
- Verified nodes and etcd health