pez-docs/workloads/monitoring/README.md
Pez 8e7269611d Update docs to reflect current setup (March 2026)
- Add Hetzner Cloud location (helsinki-a, nuremberg-a)
- Update london-a to FreeBSD, london-b ZFS layout to 3x raidz1
- Note offline servers (london-c, copenhagen-b)
- Update Plex docs with accurate ZFS and exporter behaviour
- Add workload docs: Nextcloud AIO, Navidrome, slskd, Monitoring,
  Auth (Authelia/LLDAP/Bitwarden), Mail (poste.io), Gaming (Minecraft/MaNGOS)
- Update README/intro with current service and location index
2026-03-04 09:09:08 +00:00

Monitoring

Stack

The monitoring stack runs on london-a, a FreeBSD machine dedicated to observability. FreeBSD is a deliberate choice here: it's lightweight, stable, and well suited to a machine whose only job is to sit there and watch things.

  • Prometheus — scrapes metrics from all servers and services
  • Grafana — dashboards and visualisation
  • node_exporter — system metrics on each Linux/FreeBSD server
  • smartctl_exporter — disk health metrics from london-b (Docker)
  • prom-plex-exporter — Plex session and library metrics from london-b (Docker)

What Gets Scraped

Every server in the homelab runs node_exporter. Prometheus reaches each target over the Tailscale network, so no exporter needs a public port.
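One way to keep the target list manageable is Prometheus's file-based service discovery, which reads target groups from a JSON file. The sketch below generates such a file. It is an illustration only: the hostnames are assumed to resolve over Tailscale under those names, the `site` label is hypothetical, and 9100 is simply node_exporter's default port, not something confirmed for this setup.

```python
import json

# Assumed Tailscale hostnames; substitute the real names or 100.x addresses.
HOSTS = ["london-a", "london-b", "helsinki-a", "nuremberg-a"]
NODE_EXPORTER_PORT = 9100  # node_exporter's default listen port


def build_file_sd(hosts, port=NODE_EXPORTER_PORT):
    """Return file_sd target groups: one group per host, labelled by site."""
    groups = []
    for host in hosts:
        # Hypothetical labelling scheme: "london-b" -> site "london".
        site = host.rsplit("-", 1)[0]
        groups.append({
            "targets": [f"{host}:{port}"],
            "labels": {"site": site},
        })
    return groups


if __name__ == "__main__":
    # Redirect this output to a file that a file_sd_configs entry points at.
    print(json.dumps(build_file_sd(HOSTS), indent=2))
```

Prometheus picks up changes to a file_sd file without a restart, so adding a server would only mean regenerating the file.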

Dashboards

Grafana is accessible through a Cloudflare Tunnel, with Authelia providing SSO. A refurbished tablet mounted on the fridge in the living room also shows a few key dashboards, which gives a quick way to check that everything is healthy without opening a browser.

Alerting

Not yet configured. This is a gap worth filling.
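Until proper alerting exists, a small script could serve as a stopgap by asking Prometheus which targets are down. The sketch below is not part of the current setup. It assumes Prometheus listens on its default port 9090 at a hypothetical address `http://london-a:9090`, and it uses the standard HTTP API instant query for the built-in `up` metric, which is 0 when a scrape fails.

```python
import json
import urllib.parse
import urllib.request

# Assumed address; 9090 is Prometheus's default port.
PROMETHEUS_URL = "http://london-a:9090"


def down_targets(payload):
    """Return sorted (job, instance) pairs whose `up` sample is 0."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the sample value arrives as a string
        if float(value) == 0:
            labels = series["metric"]
            down.append((labels.get("job", "?"), labels.get("instance", "?")))
    return sorted(down)


def query_up(base_url=PROMETHEUS_URL):
    """Run the instant query `up` against the Prometheus HTTP API."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": "up"})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

Calling `down_targets(query_up())` from a cron job and forwarding any non-empty result to a notification channel would cover the most basic case. Alertmanager remains the more complete answer.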