pez-infra/ansible
Rasmus Wejlgaard 0ba6d6daff Ansible-manage docker-log-cleanup script and cron (PESO-142)
docker-log-cleanup.sh lived in the repo but nothing deployed it — the
script and monthly cron on nuremberg-a were set up by hand and got wiped
when the host was reinstalled. Fold both into the docker role so every
docker_hosts member gets the script in /usr/local/bin and a monthly cron,
and it survives a rebuild.
2026-06-08 18:36:47 +01:00

Ansible — Deploy & Maintain

One-command deploy playbook for rebuilding hosts from repo state.

Quick Start

cd ansible/

# Install dependencies
make deps

# Dry run — see what would change
make deploy-check

# Deploy everything
make deploy

# Deploy a single host
make deploy-host HOST=helsinki-a

Playbooks

| Playbook | Purpose | Usage |
|---|---|---|
| deploy.yml | Full host rebuild from repo | make deploy or --limit <host> |
| playbooks/update-all.yml | OS package updates (all hosts, apt) | make update-all |
| playbooks/update-linux.yml | Alias for update-all (apt) | make update-linux |
| playbooks/docker-status.yml | Show running containers | make docker-status |
| playbooks/reboot.yml | Safe reboot with pre-flight | make reboot HOST=<host> |
| playbooks/zfs.yml | ZFS scrub scheduling (london-b) | ansible-playbook playbooks/zfs.yml |

Deploy Stages

The deploy playbook runs in stages, each independently taggable (see deploy.yml):

  1. common / baseline — Baseline packages, SSH hardening, fish shell, dotfiles
  2. docker — Docker engine on container hosts (docker_hosts group)
  3. services — Per-host service deployment:
    • helsinki-a: Caddy + status-page + custom systemd units
    • docker_hosts: Docker Compose stacks from services/
    • nuremberg-a: poste.io mail (Docker)
    • london-b: media_stack + backup (rclone to B2)
    • copenhagen-a: MaNGOS systemd units + MariaDB
    • london-a: proxmox_ve (apt repo, nag patch, CIFS storage)
    • zfs_hosts: ZFS scrub scheduling

Observability (node_exporter, systemd_exporter, Grafana Alloy) is part of the common baseline — every host gets it.

Run a single stage: ansible-playbook deploy.yml --tags docker
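
As a rough illustration of how per-stage tagging can work, the sketch below shows two plays that each carry their own tag. It is a hypothetical excerpt, not a copy of deploy.yml: the play names and the `services` tag are assumptions, while the `docker` tag, the `docker_hosts` group, and the role names come from this README.

```yaml
# Hypothetical excerpt; the real deploy.yml may be organised differently.
- name: Docker engine on container hosts
  hosts: docker_hosts
  tags: [docker]            # selected by: ansible-playbook deploy.yml --tags docker
  roles:
    - role: docker

- name: Compose stacks from services/
  hosts: docker_hosts
  tags: [services]          # assumed tag name, based on the stage name
  roles:
    - role: docker_services
```

With tags placed at play level, every task in the play inherits the tag, so `--tags docker` runs the whole stage and skips the others.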

Roles

| Role | Description |
|---|---|
| common | Base packages, SSH hardening, fish shell, exporters, Alloy |
| dotfiles | Shell config from dotfiles/ |
| docker | Docker engine install and setup + monthly log-cleanup cron |
| docker_services | Deploy compose files from services/ |
| caddy | Caddy reverse proxy (helsinki-a) |
| status_page | status.pez.sh generator script + cron |
| systemd_services | Custom systemd units from services/ |
| media_stack | *Arr stack, Plex/Jellyfin, Samba, Syncthing on london-b |
| backup | rclone-to-B2 cron job on london-b |
| mariadb | Native MariaDB (used by MaNGOS on copenhagen-a) |
| proxmox_ve | PVE no-subscription repo, UI lockdown, CIFS storage |
| zfs | Weekly scrub cron on ZFS hosts |
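
For the docker role's log-cleanup piece, the tasks might look roughly like the sketch below. This is an assumption-laden illustration rather than the role's actual contents: the task names, the `src` location, the installed filename, and the cron entry name are all hypothetical. Only the /usr/local/bin destination directory and the monthly schedule are stated in this repo.

```yaml
# Illustrative tasks only; the real docker role may differ.
- name: Install docker-log-cleanup script
  ansible.builtin.copy:
    src: docker-log-cleanup.sh          # assumed to ship in the role's files/ directory
    dest: /usr/local/bin/docker-log-cleanup.sh   # filename under /usr/local/bin is assumed
    owner: root
    group: root
    mode: "0755"

- name: Schedule monthly Docker log cleanup
  ansible.builtin.cron:
    name: docker-log-cleanup            # hypothetical entry name; keeps the task idempotent
    special_time: monthly
    user: root
    job: /usr/local/bin/docker-log-cleanup.sh
```

Keeping both the script and the cron entry in the role is what lets them survive a host reinstall, since a fresh deploy recreates them.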

Inventory

Hosts are grouped by OS and role. All hosts are reached over their Tailscale IPs, with SSH as root. Per-host variables live in inventory/host_vars/<hostname>.yml.
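
A per-host variables file could look something like the sketch below. The address is a placeholder from the Tailscale CGNAT range, not a real host IP, and the actual files may define other variables. `ansible_host` and `ansible_user` are standard Ansible connection variables.

```yaml
# inventory/host_vars/helsinki-a.yml (hypothetical contents)
ansible_host: 100.64.0.10   # placeholder Tailscale IP, not the real address
ansible_user: root          # SSH as root, as described above
```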

Safety Notes

  • london-b: Reboot playbook requires interactive confirmation (critical storage)
  • copenhagen-a: Reboot includes netplan pre-flight check (static IP verification)
  • All playbooks use ignore_unreachable: true for fleet operations
  • --check --diff is your friend — always dry-run first on production
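
The interactive confirmation for london-b could be implemented along the lines of the sketch below. This is one possible approach under stated assumptions, not the contents of playbooks/reboot.yml: the task names, prompt text, and timeout are hypothetical. `ansible.builtin.pause`, `ansible.builtin.assert`, and `ansible.builtin.reboot` are standard modules.

```yaml
# Illustrative only; the real reboot playbook may take a different approach.
- name: Reboot london-b with confirmation
  hosts: london-b
  ignore_unreachable: true        # mirrors the fleet-wide setting noted above
  tasks:
    - name: Ask before touching the storage host
      ansible.builtin.pause:
        prompt: "Type 'yes' to reboot {{ inventory_hostname }}"
      register: confirm

    - name: Stop unless the operator confirmed
      ansible.builtin.assert:
        that:
          - confirm.user_input == 'yes'
        fail_msg: Reboot not confirmed

    - name: Reboot and wait for the host to return
      ansible.builtin.reboot:
        reboot_timeout: 600       # assumed value, in seconds
```

Failing the assert stops the play for that host before the reboot task is reached, so anything other than an explicit `yes` leaves the machine untouched.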