mirror of
https://github.com/RWejlgaard/pez-infra.git
synced 2026-07-04 15:46:16 +00:00
copenhagen-c stopped reporting to Grafana Cloud on 2026-05-20: a transient TLS failure to fleet-management tripped systemd's default start rate-limit, systemd gave up, and the host sat silently unmonitored for ~2.5 weeks. Add a 10-resilience.conf systemd drop-in for alloy.service on every host (StartLimitIntervalSec=0, Restart=always, RestartSec=30) so a momentary upstream/TLS blip can no longer permanently kill the collector. Also drop the old self-hosted Grafana package that was left enabled and failing on copenhagen-c after the move to Grafana Cloud.
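The drop-in described above could be installed with tasks along these lines. This is only a sketch, not the repository's actual tasks: the filename 10-resilience.conf and the three unit settings come from the commit message, while the task names, the /etc/systemd/system/alloy.service.d path, and the file modes are assumptions. It notifies the existing "Restart alloy" handler, whose daemon_reload makes systemd pick up the new drop-in.

```yaml
# Hypothetical tasks: path, modes, and task names are assumptions.
# Only the filename and the three settings are taken from the commit message.
- name: Ensure alloy drop-in directory exists
  ansible.builtin.file:
    path: /etc/systemd/system/alloy.service.d
    state: directory
    mode: "0755"

- name: Install alloy resilience drop-in
  ansible.builtin.copy:
    dest: /etc/systemd/system/alloy.service.d/10-resilience.conf
    mode: "0644"
    content: |
      [Unit]
      # Disable start rate-limiting so systemd never gives up on the unit
      StartLimitIntervalSec=0

      [Service]
      # Always restart, waiting 30 seconds between attempts
      Restart=always
      RestartSec=30
  notify: Restart alloy
```

StartLimitIntervalSec belongs in the [Unit] section, while Restart and RestartSec belong in [Service]; putting them in one section would cause systemd to ignore the misplaced keys.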
19 lines
346 B
YAML
---
- name: Restart sshd
  ansible.builtin.service:
    name: sshd
    state: restarted

- name: Reload ufw
  community.general.ufw:
    state: reloaded

- name: Reload systemd daemon
  ansible.builtin.systemd:
    daemon_reload: true

- name: Restart alloy
  ansible.builtin.systemd:
    name: alloy
    state: restarted
    daemon_reload: true