* Grafana Cloud migration, adding dashboards, fleet, alloy and synthetics
* modulize stuff now that we have multiple substantial things in here
* provider updates and new secrets
* remove grafana and prometheus from ansible
* fix: actually decomission nextcloud and TWDNE
* ignore spaces in lint and remove dns for the services
* linting on the linting config wasn't linting the lints
node_exporter is deployed by the dedicated node_exporter Ansible role
using distro packages (prometheus-node-exporter). Having it in
systemd_services causes the systemd_services role to look for a
non-existent services/node_exporter/node_exporter.service file,
producing errors during deploy.
Resolves PESO-135
Sonarr is running on london-b as an apt-managed systemd service
but was the only *arr service without a services/ directory in the
repo. Add services/sonarr/README.md documenting the install method,
data paths, and how it differs from the other *arr services.
Closes PESO-133
Cloudflared tunnels are no longer used. All traffic now routes through
Cloudflare DNS to Caddy on helsinki-a over Tailscale.
- Remove cloudflared systemd unit files (copenhagen-a, london-b)
- Remove cloudflared from media_stack role and copenhagen-a host_vars
- Remove cloudflared references from services README and host docs
- Remove cloudflared deploy trigger from CI workflow
Live service on london-b stopped and disabled. copenhagen-a was
unreachable but the tunnel is unused regardless.
The lint-docker-compose workflow was swallowing all validation errors with
|| true, meaning broken compose files would never fail the check.
- Remove || true and let validation failures propagate
- Add a pre-step that creates empty stubs for referenced env_file entries
(e.g. bitwarden/settings.env) so docker compose config can validate
structure without needing real secrets
- Track per-file pass/fail and exit non-zero if any file fails
Closes PESO-130
grafana.ini on london-a sets provisioning = /usr/local/etc/grafana/provisioning
but grafana_provisioning_dir pointed at /usr/local/share/grafana/conf/provisioning.
This meant deploy.yml synced alerting rules, dashboards provisioning, and
datasources to a path Grafana never reads — a from-scratch deploy would have
broken alerting entirely.
Fixes PESO-131
- Add copenhagen-a to [docker_hosts] inventory group so the docker role
runs on it in Stage 2
- Add docker_services: [minecraft] to copenhagen-a host_vars
- Add docker_services role to Stage 4d (copenhagen-a) in deploy.yml
- Update deploy-on-merge scope mapping to include copenhagen-a for
docker role changes
Closes PESO-132
cloudflared has been replaced by Caddy + Authelia. Removed:
- cloudflared service config (services/cloudflared/london-a/)
- tunnel ID from london-a host_vars
- cloudflared_enable from rc.conf
Also synced rc.conf with live server state (disabled services
from PESO-113, added node_exporter_listen_address).
Live server: stopped service, removed from rc.conf, uninstalled pkg.
* Add systemd_exporter Ansible role and Prometheus scrape config
- Create systemd_exporter role (download binary, create user, deploy service)
- Add scrape job for london-b:9558 and copenhagen-a:9558
- Add systemd_exporter_hosts inventory group
- Add stage 3b to deploy.yml
- Map role to deploy-on-merge scope
Closes PESO-120
* Fix line length lint violations in systemd_exporter tasks
* Fix var-naming lint: use systemd_exporter_ prefix for role variables