Updating Docker hosts with Kestra
Patching is a balancing act. Some say you shouldn't touch it if it works, others say you should update but stay a certain amount of time behind, and some think the bleeding edge is the way to stay protected.
For me, I look at it like this: if I don't update, security issues will creep in, and when I do want to update it'll either be impossible, take too long or, because I haven't updated in quite a while, fail with minor errors that I could have corrected if I had updated semi-regularly.
Similarly, I don't necessarily want to be on the bleeding edge. I generally look for stable release streams which, by the time I see the update, have been through extensive testing - probably more than I'll put them through.
So, with this in mind, when it comes to my servers, I use the supported repos for updates, and update my servers around once a week. The caveat is that if a patch needs a more immediate fix - for example, a vulnerability in SSH on a server that has it exposed to the internet - I'll patch a lot sooner. When I used to be a sysadmin, I'd update the servers on a preset rotation, so I was never updating all the servers of type X at the same time, usually with a bit of soak time between them, depending on the update and type.
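As a rough sketch of that kind of rotation in Ansible itself (this is not the playbook used later in this post), `serial` limits how many hosts are updated per batch and a pause provides the soak time. The 30 minute pause is an assumed value, and the `docker` group name is simply borrowed from the playbook further down.

```yaml
---
# Sketch only: update one host at a time, with a soak period between hosts.
- name: "Rolling update with soak time"
  hosts: docker              # group name borrowed from the playbook in this post
  become: yes
  serial: 1                  # run the whole play on one host before starting the next
  tasks:
    - name: Update and upgrade apt packages
      ansible.builtin.apt:
        update_cache: yes
        upgrade: yes

    - name: Soak before moving on to the next host
      ansible.builtin.pause:
        minutes: 30          # assumed soak time, adjust per update and host type
```

With `serial: 1` the pause runs at the end of each single-host batch, so the next host only starts once the previous one has had its soak time.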
So, for Kestra, I want to stagger my jobs, and in this post I'll only be automating the update of my Docker hosts. I already have a playbook to do this, and it also checks a few things for me along the way. My playbook looks like this:
---
- name: "Server Playbook"
  hosts: docker
  become: yes
  become_method: sudo
  tasks:
    - name: Update and Upgrade Aptitude Packages
      apt:
        update_cache: yes
        upgrade: yes
        cache_valid_time: 86400 #One day
    - name: "install supporting packages"
      apt:
        pkg:
          - sudo
          - tcpdump
          - traceroute
          - net-tools
          - snmpd
          - lm-sensors
          - nano
          - nfs-common
        state: present
    - name: "Disable turnkey-init-fence.service"
      ansible.builtin.systemd:
        name: turnkey-init-fence
        state: stopped
        enabled: false
      ignore_errors: true # the unit only exists on Turnkey-based hosts
    - name: Enable and start service snmpd
      ansible.builtin.service:
        name: snmpd
        enabled: yes
        state: started
The "Disable turnkey-init-fence.service" task is required for me, as updating sometimes turns this service back on. It is only needed if your Docker hosts were built with an image from Turnkey.
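If the group mixes Turnkey and non-Turnkey hosts, one alternative to `ignore_errors` is to check whether the unit exists first. This is only a sketch of that approach using `ansible.builtin.service_facts`, not what the playbook above does; the two tasks would sit in the same `tasks:` list.

```yaml
# Sketch only: skip the task on hosts that have no turnkey-init-fence unit,
# rather than ignoring every error the task might raise.
- name: Gather service facts
  ansible.builtin.service_facts:

- name: "Disable turnkey-init-fence.service where it exists"
  ansible.builtin.systemd:
    name: turnkey-init-fence
    state: stopped
    enabled: false
  when: "'turnkey-init-fence.service' in ansible_facts.services"
```

The trade-off is an extra fact-gathering step per host, in exchange for real failures on Turnkey hosts no longer being silently swallowed.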
To run the playbook, I have this Kestra flow:
id: update_docker
namespace: ansible
description: Update docker hosts
labels:
  env: prod
  project: ansible
tasks:
  - id: docker_update
    type: io.kestra.plugin.core.flow.WorkingDirectory
    tasks:
      - id: ansible_task
        namespaceFiles:
          enabled: true
          include:
            - hosts
            - docker-initial.yaml
        type: io.kestra.plugin.ansible.cli.AnsibleCLI
        docker:
          image: cytopia/ansible:latest-tools
          pullPolicy: IF_NOT_PRESENT
        env:
          "ANSIBLE_HOST_KEY_CHECKING": "false"
        commands:
          - apk add sshpass
          - ansible-playbook -i hosts docker-initial.yaml
  - id: call_outputs_slack-notifer-webhook
    type: io.kestra.plugin.core.flow.Subflow
    namespace: outputs
    flowId: slack-notifier-webhook
    inputs:
      payload: "Update completed."
    wait: true
    transmitFailed: false
errors:
  - id: server_unreachable
    type: io.kestra.plugin.core.flow.Subflow
    namespace: outputs
    flowId: slack-notifier-webhook
    inputs:
      payload: "Update had an issue."
    wait: true
    transmitFailed: false
concurrency:
  behavior: CANCEL
  limit: 1
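The flow pulls two namespace files, `hosts` and `docker-initial.yaml`, and the playbook targets a `docker` group. The inventory itself isn't shown here, so the following is only a guess at its shape, written as a YAML inventory (the real file may just as well be INI). Every host name, address and user below is a placeholder.

```yaml
# hosts - hypothetical inventory; all names and addresses are placeholders
all:
  children:
    docker:                          # group targeted by "hosts: docker" in the playbook
      hosts:
        docker01:
          ansible_host: 192.0.2.11   # documentation-range address, not a real host
        docker02:
          ansible_host: 192.0.2.12
      vars:
        ansible_user: ansible        # assumed login user
```

The `apk add sshpass` step suggests password-based SSH logins, in which case the credentials would need to be supplied separately rather than committed in plain text alongside the inventory.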
Lastly, to trigger jobs just after 2am (the cron expression fires at 02:03 each day), I have this Kestra flow:
id: 2am_job_start
namespace: ansible
description: Start flows at 2am
labels:
  env: prod
  project: trigger-wrapper
tasks:
  - id: call_update_docker
    type: io.kestra.plugin.core.flow.Subflow
    namespace: ansible
    flowId: update_docker
    wait: true
    transmitFailed: false
triggers:
  - id: 1amStartTime # the id says 1am, but the schedule below is what counts
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "3 2 * * *" # minute 3, hour 2: daily at 02:03
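Since the aim is to stagger jobs, further groups of hosts could get their own wrapper flow with a later cron expression. The sketch below assumes a second update flow called `update_other_hosts` exists in the same namespace; that flow, the wrapper's id and the one hour offset are all hypothetical.

```yaml
id: 3am_job_start                  # hypothetical wrapper flow
namespace: ansible
description: Start a second batch of flows an hour after the docker hosts
tasks:
  - id: call_update_other_hosts
    type: io.kestra.plugin.core.flow.Subflow
    namespace: ansible
    flowId: update_other_hosts     # hypothetical flow, not defined in this post
    wait: true
    transmitFailed: false
triggers:
  - id: staggered_start
    type: io.kestra.plugin.core.trigger.Schedule
    cron: "3 3 * * *"              # daily at 03:03, one hour after the docker hosts
```

Keeping the schedule in a thin wrapper means the update flows themselves stay runnable by hand, without a trigger attached.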
Next time, I'll take a look at how I keep the rest of my servers and my Raspberry Pis up to date.