RWejlgaard/pez-infra

mirror of https://github.com/RWejlgaard/pez-infra.git synced 2026-07-04 15:46:16 +00:00

Author	SHA1	Message	Date
Rasmus "Pez" Wejlgaard	9ac179dbec	Make Alloy resilient to transient failures; remove leftover Grafana (PESO-149) (#126). copenhagen-c stopped reporting to Grafana Cloud on 2026-05-20: a transient TLS failure to fleet-management tripped systemd's default start rate-limit, systemd gave up, and the host sat silently unmonitored for ~2.5 weeks. Add a 10-resilience.conf systemd drop-in for alloy.service on every host (StartLimitIntervalSec=0, Restart=always, RestartSec=30) so a momentary upstream/TLS blip can no longer permanently kill the collector. Also drop the old self-hosted Grafana package that was left enabled and failing on copenhagen-c after the move to Grafana Cloud.	2026-06-07 14:30:08 +01:00
Rasmus "Pez" Wejlgaard	81efa1b717	Remove stale cloudflared service from copenhagen-a (PESO-138) (#125). cloudflared was retired in #56 when Caddy + Authelia replaced Cloudflare Tunnels, but copenhagen-a was unreachable at the time so its cloudflared.service was never stopped and is still running. Add a cleanup task to the common role that stops, disables and purges cloudflared wherever the unit lingers. Gated on the unit file existing so it self-targets copenhagen-a and is a no-op everywhere else, and explicitly excludes copenhagen-c, which legitimately runs a hand-configured tunnel.	2026-06-07 11:45:35 +01:00
Rasmus "Pez" Wejlgaard	4554dec7d2	Remove unused Prometheus alerting config (#10). * Configure UFW firewall rules in common Ansible role: add UFW configuration to the common role for Debian hosts (default deny incoming, allow outgoing; allow all traffic on the tailscale0 interface for mesh comms; allow SSH port 22 as a safety net; per-host allowed ports via the ufw_allowed_ports variable; enable UFW after rules are applied). helsinki-a gets ports 80/443 for reverse proxy traffic; other Debian hosts only need Tailscale + SSH. Closes PESO-79. * Remove unused alerting and rule_files from prometheus.yml: alerting is handled by Grafana, not Prometheus Alertmanager, so the empty alertmanagers and rule_files sections were just noise. Resolves PESO-74.	2026-03-29 10:37:25 +01:00
Rasmus Wejlgaard	737d6e0bc1	initial commit	2026-03-28 12:39:41 +00:00

4 commits
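
For reference, the systemd drop-in described in commit 9ac179dbec might look roughly like the sketch below. Only the file name (10-resilience.conf), the unit name (alloy.service), and the three settings come from the commit message. The directory path is an assumption based on the standard systemd drop-in location, and the actual file in the repository may differ.

```ini
# Assumed path: /etc/systemd/system/alloy.service.d/10-resilience.conf
# (the directory is a guess; only the file name appears in the commit message)

[Unit]
# 0 disables start rate-limiting, so repeated failures never leave the unit
# stuck in a "start request repeated too quickly" failed state.
StartLimitIntervalSec=0

[Service]
# Restart the collector whenever it exits, whatever the exit status.
Restart=always
# Wait 30 seconds between restart attempts to avoid a tight retry loop.
RestartSec=30
```

StartLimitIntervalSec is a [Unit] setting, while Restart and RestartSec belong in [Service]. A drop-in like this typically takes effect after `systemctl daemon-reload` and a restart of the unit.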