pez-docs/workloads/monitoring/README.md
Pez 8e7269611d Update docs to reflect current setup (March 2026)
- Add Hetzner Cloud location (helsinki-a, nuremberg-a)
- Update london-a to FreeBSD, london-b ZFS layout to 3x raidz1
- Note offline servers (london-c, copenhagen-b)
- Update Plex docs with accurate ZFS and exporter behaviour
- Add workload docs: Nextcloud AIO, Navidrome, slskd, Monitoring,
  Auth (Authelia/LLDAP/Bitwarden), Mail (poste.io), Gaming (Minecraft/MaNGOS)
- Update README/intro with current service and location index
2026-03-04 09:09:08 +00:00


# Monitoring
## Stack
The monitoring stack runs on `london-a`, a FreeBSD machine dedicated to observability. FreeBSD is a deliberate choice: it's lightweight, stable, and well suited to a machine whose only job is to sit there and watch things. The components, including the exporters that run on other machines, are:
- **Prometheus** — scrapes metrics from all servers and services
- **Grafana** — dashboards and visualisation
- **node_exporter** — system metrics on each Linux/FreeBSD server
- **smartctl_exporter** — disk health metrics from `london-b` (Docker)
- **prom-plex-exporter** — Plex session and library metrics from `london-b` (Docker)
## What Gets Scraped
All servers in the homelab run `node_exporter`. Prometheus scrapes each target over the Tailscale network, so no exporter needs a public port.
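
As a rough sketch of how the target list could be maintained, the snippet below generates a target file in the JSON format that Prometheus file-based service discovery (`file_sd_configs`) reads. The host list, the assumption that each machine resolves by its short name over Tailscale, and the `network` label are all illustrative and not taken from the actual setup; port 9100 is the `node_exporter` default.

```python
import json

# Hypothetical host list: the names appear elsewhere in these docs, but the
# exact Tailscale hostnames and which machines are online are assumptions.
HOSTS = ["london-a", "london-b", "helsinki-a", "nuremberg-a"]

NODE_EXPORTER_PORT = 9100  # node_exporter's default listen port


def build_file_sd(hosts, port=NODE_EXPORTER_PORT):
    """Return a list of target groups in Prometheus file_sd format."""
    return [
        {
            "targets": [f"{host}:{port}" for host in hosts],
            # Illustrative label only; not part of the documented setup.
            "labels": {"network": "tailscale"},
        }
    ]


def render_file_sd(hosts):
    """Serialise the target groups as JSON for a file_sd target file."""
    return json.dumps(build_file_sd(hosts), indent=2)


if __name__ == "__main__":
    print(render_file_sd(HOSTS))
```

The output would be written to a file referenced from a `file_sd_configs` entry in the Prometheus configuration. Adding a server then means editing one list rather than the scrape config itself.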
## Dashboards
Grafana is exposed through a Cloudflare Tunnel, with Authelia providing SSO. A refurbished tablet mounted on the fridge in the living room also shows a few key dashboards, which makes it quick to check that everything is healthy without opening a browser.
## Alerting
Not yet configured. This is a gap worth filling.
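
Until proper alerting exists, one possible stopgap is a small script that asks Prometheus which targets are down. The sketch below runs the instant query `up` against the Prometheus HTTP API (`/api/v1/query`) and lists every target whose value is 0. The base URL is an assumption (the default port 9090 on `london-a`, reached over Tailscale), and this is not a substitute for Alertmanager.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Prometheus listens on its default port on london-a and is
# reachable by that name over Tailscale.
PROMETHEUS_URL = "http://london-a:9090"


def down_targets(query_response):
    """Return sorted (job, instance) pairs whose `up` sample is 0."""
    if query_response.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    down = []
    for sample in query_response["data"]["result"]:
        # Each instant-vector sample carries a [timestamp, "value"] pair.
        _timestamp, value = sample["value"]
        if float(value) == 0:
            labels = sample["metric"]
            down.append((labels.get("job", ""), labels.get("instance", "")))
    return sorted(down)


def fetch_up(base_url=PROMETHEUS_URL, timeout=10):
    """Run the instant query `up` against the Prometheus HTTP API."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": "up"})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for job, instance in down_targets(fetch_up()):
        print(f"DOWN: {job} {instance}")
```

Run from cron on any machine on the tailnet, this would at least surface dead exporters. It cannot report that Prometheus itself is down, which is one reason real alerting is still worth setting up.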