mirror of https://github.com/RWejlgaard/pez-infra.git synced 2026-07-04 15:46:16 +00:00

Rasmus "Pez" Wejlgaard 9815f44b84	2026-06-04 18:41:24 +01:00

fix: stop masking failed service deploys; trim dead config (#119)

The docker_services and systemd_services roles ran their "start the service" tasks with `failed_when: false`, so a container or unit that failed to come up still reported the deploy as green. Drop it from both start tasks so a broken deploy actually fails CI. The compose/unit copy tasks keep `failed_when: false`; that's load-bearing for the `item is not failed` filter that skips services without a compose/unit file.

Also:
- Remove a duplicate "Template service .env files" task in docker_services (second copy used a hardcoded path and didn't register; first one is the one the start task reads).
- Don't trigger a full fleet deploy on docs/markdown/workflow-only pushes to main: add docs/, /.md and .github/* to paths-ignore.
- Drop the dangling `update-freebsd` Make target (playbook doesn't exist; fleet has no FreeBSD hosts).
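
To make the effect of this change concrete, here is a minimal sketch of a start task written without `failed_when: false`. It is not the task from the roles themselves: the task name and the `systemd_units` variable are hypothetical, and only the `ansible.builtin.systemd` module and its parameters are standard Ansible.

```yaml
# Hypothetical start task; `systemd_units` is an assumed variable name.
- name: Start and enable custom systemd units
  ansible.builtin.systemd:
    name: "{{ item }}"
    state: started
    enabled: true
    daemon_reload: true
  loop: "{{ systemd_units | default([]) }}"
  # Deliberately no `failed_when: false` here: if a unit cannot be
  # started, the task fails and the play (and the CI run) fails with it.
```

With `failed_when: false` added back, the same task would report success even when a unit failed to start, which is the masking the commit removes.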
dotfiles	remove pr-test.yml	2026-03-28 13:11:34 +00:00
group_vars/all	fix: add smb mount (#107)	2026-05-14 20:49:25 +01:00
inventory	fix: update config for london-a for new proxmox install (#101)	2026-05-09 19:22:34 +01:00
playbooks	fix: cleanup freebsd and alpine stuff (#105)	2026-05-12 22:43:12 +01:00
roles	fix: stop masking failed service deploys; trim dead config (#119)	2026-06-04 18:41:24 +01:00
scripts	only send email if something went wrong with backups (#60)	2026-04-06 18:33:07 +01:00
services	fix: update octopus exporter (#113)	2026-05-26 20:56:07 +01:00
.ansible-lint	fix: actually decomission nextcloud and TWDNE (#72)	2026-04-25 18:19:16 +01:00
.yamllint	ignore all SOPS-encrypted files in yamllint	2026-03-28 18:50:08 +00:00
ansible.cfg	adding london-c (#66)	2026-04-20 20:52:19 +01:00
deploy.yml	fix: cleanup deploy.yml and share workflow (#108)	2026-05-15 20:17:28 +01:00
Makefile	fix: stop masking failed service deploys; trim dead config (#119)	2026-06-04 18:41:24 +01:00
README.md	fix: Documentation overhaul (#112)	2026-05-19 18:49:21 +01:00
requirements.yml	initial commit	2026-03-28 12:39:41 +00:00

README.md

Ansible — Deploy & Maintain

One-command deploy playbook for rebuilding hosts from repo state.

Quick Start

cd ansible/

# Install dependencies
make deps

# Dry run — see what would change
make deploy-check

# Deploy everything
make deploy

# Deploy a single host
make deploy-host HOST=helsinki-a

Playbooks

Playbook	Purpose	Usage
`deploy.yml`	Full host rebuild from repo	`make deploy`, or `ansible-playbook deploy.yml --limit <host>` for one host
`playbooks/update-all.yml`	OS package updates (all hosts, apt)	`make update-all`
`playbooks/update-linux.yml`	Alias for update-all (apt)	`make update-linux`
`playbooks/docker-status.yml`	Show running containers	`make docker-status`
`playbooks/reboot.yml`	Safe reboot with pre-flight	`make reboot HOST=<host>`
`playbooks/zfs.yml`	ZFS scrub scheduling (london-b)	`ansible-playbook playbooks/zfs.yml`

Deploy Stages

The deploy playbook runs in stages, each independently taggable (see deploy.yml):

- common / baseline: baseline packages, SSH hardening, fish shell, dotfiles
- docker: Docker engine on container hosts (docker_hosts group)
- services: per-host service deployment:
  - helsinki-a: Caddy + status-page + custom systemd units
  - docker_hosts: Docker Compose stacks from services/
  - nuremberg-a: poste.io mail (Docker)
  - london-b: media_stack + backup (rclone to B2)
  - copenhagen-a: MaNGOS systemd units + MariaDB
  - london-a: proxmox_ve (apt repo, nag patch, CIFS storage)
  - zfs_hosts: ZFS scrub scheduling

Observability (node_exporter, systemd_exporter, Grafana Alloy) is part of the common baseline — every host gets it.

Run a single stage: `ansible-playbook deploy.yml --tags docker`
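
The staged, taggable layout described in this section could look roughly like the outline below. This is a sketch under assumptions, not a copy of `deploy.yml`: the play names and the exact tag spelling are guesses, while the role names and the `docker_hosts` group come from this README.

```yaml
# Hypothetical outline of a staged deploy playbook.
- name: Baseline on every host
  hosts: all
  tags: [common, baseline]
  roles:
    - common
    - dotfiles

- name: Docker engine on container hosts
  hosts: docker_hosts
  tags: [docker]
  roles:
    - docker

- name: Compose stacks on container hosts
  hosts: docker_hosts
  tags: [services]
  roles:
    - docker_services
```

Because tags are set at play level, every task in a play inherits them, so `--tags docker` runs only the second play. Tags can be combined with `--limit <host>` to run one stage on one host.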

Roles

Role	Description
`common`	Base packages, SSH hardening, fish shell, exporters, Alloy
`dotfiles`	Shell config from `dotfiles/`
`docker`	Docker engine install and setup
`docker_services`	Deploy compose files from `services/`
`caddy`	Caddy reverse proxy (helsinki-a)
`status_page`	status.pez.sh generator script + cron
`systemd_services`	Custom systemd units from `services/`
`media_stack`	*Arr stack, Plex/Jellyfin, Samba, Syncthing on london-b
`backup`	rclone-to-B2 cron job on london-b
`mariadb`	Native MariaDB (used by MaNGOS on copenhagen-a)
`proxmox_ve`	PVE no-subscription repo, UI lockdown, CIFS storage
`zfs`	Weekly scrub cron on ZFS hosts
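
As an illustration of how small such a role can be, a weekly scrub schedule like the one the `zfs` role provides might be expressed with the standard `ansible.builtin.cron` module. The `zfs_pools` variable, the pool name `tank` and the `zpool` path are assumptions made for this sketch, not values from the repo.

```yaml
# Hypothetical tasks/main.yml for a scrub-scheduling role.
# `zfs_pools`, the default pool name and the zpool path are assumed.
- name: Schedule weekly ZFS scrub
  ansible.builtin.cron:
    name: "zfs scrub {{ item }}"
    special_time: weekly
    user: root
    job: "/usr/sbin/zpool scrub {{ item }}"
  loop: "{{ zfs_pools | default(['tank']) }}"
```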

Inventory

Hosts are grouped by OS and role. All hosts are reached over their Tailscale IPs, with SSH as root. Per-host variables live in inventory/host_vars/<hostname>.yml.
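
A per-host variables file following this convention might look like the sketch below. The address is a placeholder, not a real host IP; `ansible_host` and `ansible_user` are standard Ansible connection variables.

```yaml
# inventory/host_vars/helsinki-a.yml (illustrative values only)
ansible_host: 100.64.0.10   # placeholder in the 100.64.0.0/10 range Tailscale uses
ansible_user: root
```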

Safety Notes

- london-b: Reboot playbook requires interactive confirmation (critical storage)
- copenhagen-a: Reboot includes netplan pre-flight check (static IP verification)
- All playbooks use `ignore_unreachable: true` for fleet operations
- `--check --diff` is your friend: always dry-run first on production
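
The confirmation gate and the `ignore_unreachable` setting mentioned above can be sketched as follows. This is not the contents of `playbooks/reboot.yml`: the prompt wording, the timeout and the way the host is selected (via `--limit`) are assumptions, while `pause`, `fail` and `reboot` are standard `ansible.builtin` modules.

```yaml
# Hypothetical reboot playbook; run with `--limit <host>`.
- name: Reboot with a confirmation gate for critical storage
  hosts: all
  gather_facts: false
  ignore_unreachable: true   # an offline host does not abort the run
  tasks:
    - name: Ask for confirmation before rebooting london-b
      ansible.builtin.pause:
        prompt: "Type 'yes' to reboot {{ inventory_hostname }}"
      register: confirm
      when: inventory_hostname == 'london-b'

    - name: Abort unless the operator confirmed
      ansible.builtin.fail:
        msg: Reboot not confirmed
      when:
        - inventory_hostname == 'london-b'
        - confirm.user_input | default('') != 'yes'

    - name: Reboot and wait for the host to come back
      ansible.builtin.reboot:
        reboot_timeout: 600
```

Running any playbook with `--check --diff` first shows what would change without applying it, although tasks that reboot or prompt are usually skipped or of limited value in check mode.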