Kubernetes Cluster Automation with Ansible

Complete beginner's guide to automating HA Kubernetes cluster deployment

New to Ansible? Perfect!

This guide assumes you're new to Ansible and will teach you everything from basics to automating a complete Kubernetes HA cluster. By the end, you'll be able to deploy production-ready clusters with a single command!

What is Ansible?

Ansible is an automation tool that helps you configure servers, deploy applications, and manage infrastructure without writing complex scripts. Think of it as a recipe book for your servers!

Why Use Ansible for Kubernetes?

An HA cluster means repeating the same setup on many servers (eight of them in this guide). Ansible makes that work repeatable and consistent: you describe the desired state once, then apply it to every node with a single command.

Key Concept: How Ansible Works

Ansible connects to your servers via SSH, runs commands, and ensures everything is configured as specified. No agents or daemons required!
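
For example, once you can SSH to a server, a single ad-hoc command is enough for Ansible to manage it (the IP address and user below are placeholders):

# Ping one host over SSH; nothing needs to be installed on the target
ansible all -i "192.168.1.101," -u ubuntu -m ping

The trailing comma tells Ansible that the -i value is an inline host list rather than an inventory file.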

Ansible Basics for Beginners

Core Components

1. Inventory File

Lists all your servers (hosts) organized into groups. Example: masters, workers, load balancers.

2. Playbook

A YAML file containing tasks (instructions) to execute on your servers. Like a recipe with steps.

3. Tasks

Individual actions like "install package", "copy file", "restart service".

4. Roles

Organized collections of tasks, templates, and files for specific purposes (e.g., "kubernetes-master" role).

5. Variables

Dynamic values you can change without modifying the playbook (e.g., IP addresses, versions). A minimal example tying all five pieces together follows below.
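
Here is a minimal, self-contained sketch that connects the five pieces: one inventory group, one variable, and a playbook with a single task. The host, file names, and pkg_name value are purely illustrative.

# inventory/hosts.ini
[webservers]
web1 ansible_host=192.168.1.50 ansible_user=ubuntu

# playbooks/hello.yml
---
- name: Example playbook using a variable
  hosts: webservers
  become: yes
  vars:
    pkg_name: nginx          # a variable you can change without editing the task
  tasks:
    - name: Install a package (a single task)
      apt:
        name: "{{ pkg_name }}"
        state: present

Run it with: ansible-playbook -i inventory/hosts.ini playbooks/hello.yml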

Prerequisites

Control Node (Your Machine)

A Linux or macOS machine with Ansible installed (Step 1 below) and SSH access to every server.

Target Servers

Ubuntu servers for the cluster (3 masters, 3 workers, 2 load balancers), each reachable over SSH with Python 3 installed and a user that has sudo privileges.

1 Install Ansible on Control Node

Ubuntu/Debian:

sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository --yes --update ppa:ansible/ansible
sudo apt install -y ansible

macOS:

brew install ansible

Verify Installation:

ansible --version

2 Generate and Copy SSH Keys

Generate SSH Key (if you don't have one):

ssh-keygen -t rsa -b 4096 -C "ansible@k8s-cluster"

Copy SSH key to all servers:

# Replace with your server IPs
ssh-copy-id user@master1-ip
ssh-copy-id user@master2-ip
ssh-copy-id user@master3-ip
ssh-copy-id user@worker1-ip
ssh-copy-id user@worker2-ip
ssh-copy-id user@worker3-ip
ssh-copy-id user@lb1-ip
ssh-copy-id user@lb2-ip

Test SSH Connection:

ssh user@master1-ip

Project Structure

Create a directory structure for your Ansible project:

k8s-ansible/
├── inventory/
│   └── hosts.ini              # Server inventory
├── group_vars/
│   └── all.yml               # Global variables
├── roles/
│   ├── common/               # Common setup for all nodes
│   ├── haproxy/              # Load balancer configuration
│   ├── k8s-master/           # Master node setup
│   └── k8s-worker/           # Worker node setup
├── playbooks/
│   ├── site.yml              # Main playbook
│   ├── setup-haproxy.yml     # Load balancer playbook
│   ├── setup-masters.yml     # Master nodes playbook
│   └── setup-workers.yml     # Worker nodes playbook
└── ansible.cfg               # Ansible configuration

3 Create Project Structure

mkdir -p k8s-ansible/{inventory,group_vars,roles,playbooks}
cd k8s-ansible

Step 1: Create Inventory File

The inventory file tells Ansible which servers to manage.

inventory/hosts.ini
[load_balancers]
lb1 ansible_host=192.168.1.10 ansible_user=ubuntu
lb2 ansible_host=192.168.1.11 ansible_user=ubuntu

[masters]
master1 ansible_host=192.168.1.101 ansible_user=ubuntu
master2 ansible_host=192.168.1.102 ansible_user=ubuntu
master3 ansible_host=192.168.1.103 ansible_user=ubuntu

[workers]
worker1 ansible_host=192.168.1.201 ansible_user=ubuntu
worker2 ansible_host=192.168.1.202 ansible_user=ubuntu
worker3 ansible_host=192.168.1.203 ansible_user=ubuntu

[k8s_cluster:children]
masters
workers

[all:vars]
ansible_python_interpreter=/usr/bin/python3

Action Required

Replace the IP addresses with your actual server IPs and adjust ansible_user to match your SSH username.

Test Inventory:

ansible all -i inventory/hosts.ini -m ping

Step 2: Define Variables

group_vars/all.yml
---
# Kubernetes Configuration
k8s_version: "1.29.0-1.1"        # package revision format used by the pkgs.k8s.io repositories
pod_network_cidr: "10.244.0.0/16"
service_cidr: "10.96.0.0/12"

# Load Balancer Configuration
lb_vip: "192.168.1.100"  # Virtual IP for HA
lb_port: 6443

# Container Runtime
container_runtime: containerd

# CNI Plugin (calico or flannel)
cni_plugin: calico

# Node Configuration
disable_swap: true
enable_firewalld: false
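
Everything defined here is available to tasks and templates through Jinja2 "{{ }}" expressions. For example, the k8s_version value above is consumed by the package-install task you will write in Step 3 (shortened excerpt):

- name: Install Kubernetes packages
  apt:
    name:
      - kubelet={{ k8s_version }}    # resolves to kubelet=1.29.0-1.1
    state: present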

Step 3: Create Common Role

This role prepares all nodes with basic requirements.

Create Role Structure

mkdir -p roles/common/tasks
roles/common/tasks/main.yml
---
- name: Update apt cache
  apt:
    update_cache: yes
    cache_valid_time: 3600

- name: Disable swap
  shell: |
    swapoff -a
    sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
  when: disable_swap

- name: Load kernel modules
  copy:
    dest: /etc/modules-load.d/k8s.conf
    content: |
      overlay
      br_netfilter

- name: Enable kernel modules
  modprobe:
    name: "{{ item }}"
  loop:
    - overlay
    - br_netfilter

- name: Configure sysctl
  copy:
    dest: /etc/sysctl.d/k8s.conf
    content: |
      net.bridge.bridge-nf-call-iptables  = 1
      net.bridge.bridge-nf-call-ip6tables = 1
      net.ipv4.ip_forward                 = 1

- name: Apply sysctl
  command: sysctl --system

- name: Install containerd
  apt:
    name: containerd
    state: present

- name: Create containerd config directory
  file:
    path: /etc/containerd
    state: directory

- name: Generate containerd config
  shell: containerd config default > /etc/containerd/config.toml
  args:
    creates: /etc/containerd/config.toml

- name: Enable SystemdCgroup in containerd
  replace:
    path: /etc/containerd/config.toml
    regexp: 'SystemdCgroup = false'
    replace: 'SystemdCgroup = true'

- name: Restart containerd
  systemd:
    name: containerd
    state: restarted
    enabled: yes

- name: Add Kubernetes apt repository
  block:
    - name: Install prerequisites
      apt:
        name:
          - apt-transport-https
          - ca-certificates
          - curl
          - gpg
        state: present

    - name: Ensure the apt keyrings directory exists
      file:
        path: /etc/apt/keyrings
        state: directory
        mode: '0755'

    - name: Add Kubernetes GPG key
      shell: |
        curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
      args:
        creates: /etc/apt/keyrings/kubernetes-apt-keyring.gpg

    - name: Add Kubernetes repository
      apt_repository:
        repo: "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /"
        filename: kubernetes

- name: Install Kubernetes packages
  apt:
    name:
      - kubelet={{ k8s_version }}
      - kubeadm={{ k8s_version }}
      - kubectl={{ k8s_version }}
    state: present
    update_cache: yes

- name: Hold Kubernetes packages
  dpkg_selections:
    name: "{{ item }}"
    selection: hold
  loop:
    - kubelet
    - kubeadm
    - kubectl
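
The common role does nothing on its own until a play applies it (Step 6 wires it into site.yml). If you want to try it on one node before building the rest, a throwaway playbook plus --limit works well; the file name test-common.yml is just a suggestion:

# playbooks/test-common.yml
---
- name: Apply the common role to cluster nodes
  hosts: k8s_cluster
  become: yes
  roles:
    - common

Run it against a single host first:

ansible-playbook playbooks/test-common.yml --limit master1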

Step 4: Create HAProxy Role

Create Role Structure

mkdir -p roles/haproxy/{tasks,templates,handlers}
roles/haproxy/tasks/main.yml
---
- name: Install HAProxy
  apt:
    name: haproxy
    state: present

- name: Configure HAProxy
  template:
    src: haproxy.cfg.j2
    dest: /etc/haproxy/haproxy.cfg
  notify: Restart HAProxy

- name: Enable HAProxy
  systemd:
    name: haproxy
    enabled: yes
    state: started
roles/haproxy/templates/haproxy.cfg.j2
global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

defaults
    log global
    mode tcp
    option tcplog
    option dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend kubernetes_frontend
    bind *:{{ lb_port }}
    mode tcp
    option tcplog
    default_backend kubernetes_backend

backend kubernetes_backend
    mode tcp
    option tcp-check
    balance roundrobin
{% for host in groups['masters'] %}
    server {{ host }} {{ hostvars[host]['ansible_host'] }}:6443 check fall 3 rise 2
{% endfor %}
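
With the example inventory from Step 1, that Jinja2 loop renders the backend as:

backend kubernetes_backend
    mode tcp
    option tcp-check
    balance roundrobin
    server master1 192.168.1.101:6443 check fall 3 rise 2
    server master2 192.168.1.102:6443 check fall 3 rise 2
    server master3 192.168.1.103:6443 check fall 3 rise 2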
roles/haproxy/handlers/main.yml
---
- name: Restart HAProxy
  systemd:
    name: haproxy
    state: restarted

Step 5: Create K8s Master Role

Create Role Structure

mkdir -p roles/k8s-master/tasks
roles/k8s-master/tasks/main.yml
---
- name: Check if cluster is initialized
  stat:
    path: /etc/kubernetes/admin.conf
  register: k8s_initialized

- name: Initialize first master
  shell: |
    kubeadm init \
      --control-plane-endpoint="{{ lb_vip }}:{{ lb_port }}" \
      --upload-certs \
      --pod-network-cidr={{ pod_network_cidr }}
  when: 
    - inventory_hostname == groups['masters'][0]
    - not k8s_initialized.stat.exists
  register: kubeadm_init

- name: Save join commands
  copy:
    content: "{{ kubeadm_init.stdout }}"
    dest: /tmp/kubeadm-init-output.txt
  when: 
    - inventory_hostname == groups['masters'][0]
    - kubeadm_init.changed

- name: Create .kube directory
  file:
    path: "{{ ansible_env.HOME }}/.kube"
    state: directory
    mode: '0755'
  when: inventory_hostname == groups['masters'][0]

- name: Copy admin.conf
  copy:
    src: /etc/kubernetes/admin.conf
    dest: "{{ ansible_env.HOME }}/.kube/config"
    remote_src: yes
    mode: '0600'
  when: inventory_hostname == groups['masters'][0]

- name: Install Calico CNI
  shell: |
    kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
  when: 
    - inventory_hostname == groups['masters'][0]
    - cni_plugin == "calico"
    - kubeadm_init.changed

Step 6: Create Main Playbook

playbooks/site.yml
---
- name: Setup HAProxy Load Balancers
  hosts: load_balancers
  become: yes
  roles:
    - haproxy

- name: Setup all Kubernetes nodes
  hosts: k8s_cluster
  become: yes
  roles:
    - common

- name: Setup Kubernetes Master Nodes
  hosts: masters
  become: yes
  roles:
    - k8s-master

- name: Setup Kubernetes Worker Nodes
  hosts: workers
  become: yes
  tasks:
    - name: Message for manual join
      debug:
        msg: "Worker nodes need to be joined manually using the join command from /tmp/kubeadm-init-output.txt on master1"

Step 7: Configure Ansible

ansible.cfg
[defaults]
inventory = inventory/hosts.ini
# let playbooks in playbooks/ find the roles directory at the project root
roles_path = ./roles
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 3600

[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
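
Because ansible.cfg already points at the inventory and enables privilege escalation, commands run from the project directory no longer need the -i flag or --become:

# Run from inside k8s-ansible/
ansible all -m ping
ansible-playbook playbooks/site.yml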

Step 8: Deploy Your Cluster!

1 Dry Run (Check Mode)

Test without making changes:

ansible-playbook playbooks/site.yml --check

2 Execute Playbook

Deploy the cluster:

ansible-playbook playbooks/site.yml

3 Get Join Commands

SSH to master1 and retrieve join commands:

ssh user@master1-ip
cat /tmp/kubeadm-init-output.txt

4 Verify Cluster

kubectl get nodes
kubectl get pods -n kube-system

Congratulations!

You've successfully automated a Kubernetes HA cluster deployment using Ansible! 🎉

Advanced Tips

Using Ansible Vault for Secrets

Encrypt sensitive data like passwords:

# Create encrypted file
ansible-vault create group_vars/secrets.yml

# Edit encrypted file
ansible-vault edit group_vars/secrets.yml

# Run playbook with vault
ansible-playbook playbooks/site.yml --ask-vault-pass
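
To use the encrypted values, load the file in a play and reference its variables like any others. The path and the vault_registry_password name below are only illustrative:

- name: Example play loading vaulted variables
  hosts: all
  vars_files:
    - ../group_vars/secrets.yml    # decrypted at runtime with the vault password
  tasks:
    - name: Reference a vaulted value (hypothetical variable name)
      debug:
        msg: "Registry credentials loaded: {{ vault_registry_password is defined }}"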

Running Specific Tasks

Use tags to run only certain parts:

# Run only load balancer setup
ansible-playbook playbooks/site.yml --tags "haproxy"

# Skip certain tasks
ansible-playbook playbooks/site.yml --skip-tags "cni"
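
Note that tags only work once they are defined; the plays in site.yml above do not declare any yet. One way to tag the HAProxy play, for example:

- name: Setup HAProxy Load Balancers
  hosts: load_balancers
  become: yes
  roles:
    - role: haproxy
      tags: ['haproxy']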

Version Control

Store your playbooks in Git:

git init
git add .
git commit -m "Initial Kubernetes automation setup"
git remote add origin <your-repo-url>
git push -u origin main

Troubleshooting

Issue                           Solution
SSH connection failed           Check SSH keys with: ansible all -m ping
Permission denied               Ensure the user has passwordless sudo privileges
Task failed on certain hosts    Run with -vvv for verbose output
Playbook hangs                  Check firewall rules and network connectivity

Debug Commands:

# Test connectivity
ansible all -m ping

# Run with verbose output
ansible-playbook playbooks/site.yml -vvv

# Check syntax
ansible-playbook playbooks/site.yml --syntax-check

# List hosts
ansible all --list-hosts

Next Steps