mirror of
https://github.com/RWejlgaard/pez-infra.git
synced 2026-07-04 15:46:16 +00:00
# Architecture

## Overview

The infrastructure spans three locations (London, Copenhagen, and Hetzner Cloud) connected by a Tailscale mesh network. All public traffic enters through a single Hetzner Cloud VPS (helsinki-a) running Caddy as a reverse proxy, which forwards requests over Tailscale to backend services running on physical servers in London and Copenhagen.

The setup is almost entirely self-hosted; the exceptions are the Hetzner Cloud VPSs, Hetzner DNS, and Grafana Cloud for observability. Most physical servers are old personal computers repurposed for server duty: cheaper than cloud, and I get a rack cabinet that doubles as a bedroom white noise machine.

## Network Topology
```mermaid
graph TD
    DNS["<b>DNS</b><br/>Hetzner DNS: *.pez.sh<br/>Cloudflare: *.pez.solutions"]
    DNS -->|HTTPS| HEL

    HEL["<b>helsinki-a</b><br/>Hetzner Cloud VPS<br/><br/>Caddy (reverse proxy)<br/>Authelia (SSO)<br/>LLDAP (Authelia backend)<br/>Bitwarden (Vaultwarden)<br/>Forgejo"]

    HEL --> TS["<b>Tailscale Mesh</b><br/>WireGuard-based VPN"]

    TS --> LB["<b>london-b</b><br/>Storage / Media<br/>*arr stack, Plex, Jellyfin<br/>(Threadripper, 87T ZFS)"]
    TS --> LA["<b>london-a</b><br/>Proxmox VE hypervisor<br/>(Debian 13)"]
    TS --> LC["<b>london-c</b><br/>Raspberry Pi<br/>Octopus Energy exporter"]
    TS --> NA["<b>nuremberg-a</b><br/>Mail<br/>poste.io"]
    TS --> CA["<b>copenhagen-a</b><br/>Gaming<br/>Minecraft / WoW (MaNGOS)"]
    TS --> CC["<b>copenhagen-c</b><br/>Raspberry Pi<br/>cloudflared, idle"]

    TS -.->|Alloy| GC["<b>Grafana Cloud</b><br/>metrics, logs, traces<br/>synthetic checks"]

    style CC stroke-dasharray: 5 5
```
## Traffic Flow

All public-facing services follow the same pattern:

```
User → DNS (Hetzner DNS) → helsinki-a (Caddy, TLS) → Backend (over Tailscale)
```
1. DNS for `pez.sh` is managed by Hetzner DNS (provisioned via Terraform, `terraform/hetzner/dns.tf`); `pez.solutions` still resolves via Cloudflare (dashboard-managed, outside Terraform)
2. Records point directly at helsinki-a's public IP, with no CDN or proxy in front
3. Caddy on helsinki-a terminates TLS (Let's Encrypt) and routes to the correct backend
4. For protected services, Caddy calls Authelia first (`forward_auth`)
5. If authenticated (or no auth required), traffic is proxied over Tailscale to the backend
```mermaid
graph LR
    subgraph "helsinki-a (Caddy)"
        A1["forward_auth → Authelia"]
        A2["(no auth)"]
        A3["forward_auth → Authelia"]
        A4["(local)"]
    end

    R["radarr.pez.sh"] --> A1 --> LB1["london-b:7878"]
    J["jellyfin.pez.sh"] --> A2 --> LB2["london-b:8096"]
    G["git.pez.sh"] --> A3 --> LO3["localhost:3000 (Forgejo)"]
    AU["auth.pez.sh"] --> A4 --> LO["localhost:9091 (Authelia)"]
```
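
The routing decision in the steps and diagram above can be sketched as a small lookup. This is an illustrative Python model, not the actual Caddy configuration: the `Route` type, the `handle` function, and the result strings are hypothetical names, and the behaviour for unknown hosts and unauthenticated requests (reject, redirect to `auth.pez.sh`) is an assumption. The hosts, backends, and auth flags are taken from the diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    backend: str       # where Caddy proxies the request
    needs_auth: bool   # True if Caddy calls Authelia (forward_auth) first


# Hosts, backends, and auth flags copied from the diagram above.
ROUTES = {
    "radarr.pez.sh": Route("london-b:7878", needs_auth=True),
    "jellyfin.pez.sh": Route("london-b:8096", needs_auth=False),
    "git.pez.sh": Route("localhost:3000", needs_auth=True),
    "auth.pez.sh": Route("localhost:9091", needs_auth=False),
}


def handle(host: str, authenticated: bool) -> str:
    """Describe what the gateway would do with a request for `host`."""
    route = ROUTES.get(host)
    if route is None:
        # Assumption: hosts without a route are refused.
        return "reject: unknown host"
    if route.needs_auth and not authenticated:
        # Assumption: unauthenticated users are sent to the Authelia portal.
        return "redirect to auth.pez.sh"
    # Backends on helsinki-a itself are reached locally, the rest over Tailscale.
    hop = "local" if route.backend.startswith("localhost:") else "tailscale"
    return f"proxy to {route.backend} ({hop})"
```

For example, `handle("radarr.pez.sh", authenticated=False)` yields the redirect, while `handle("jellyfin.pez.sh", authenticated=False)` goes straight to the backend because Jellyfin handles its own login.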
## Auth Architecture

```mermaid
graph TD
    Caddy["<b>Caddy</b><br/>forward_auth"] --> Authelia["<b>Authelia</b><br/>SSO<br/>auth.pez.sh"]
    Authelia --> LLDAP["<b>LLDAP</b><br/>User directory<br/>(Authelia backend only)"]
    Authelia --> MariaDB["<b>MariaDB</b><br/>Authelia session/state"]
```
Authelia authenticates against LLDAP and uses MariaDB for session/state. All three run as Docker containers on helsinki-a. LLDAP is **not** wired into other apps; it is purely Authelia's user backend. Services that sit behind Authelia inherit users from LLDAP via the Caddy `forward_auth` flow; services with their own auth (Bitwarden, Plex, Jellyfin, Navidrome, Jellyseerr, Forgejo, poste.io) maintain their own user databases.
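
To make the `forward_auth` flow concrete, here is a minimal Python model of the decision Caddy makes once the auth subrequest returns: a 2xx response lets the request through with selected identity headers copied onto it, and anything else is relayed back to the client. The function name and return shape are hypothetical. `Remote-User` and `Remote-Groups` are header names Authelia emits, but which headers this repo's Caddyfile actually copies is an assumption.

```python
def forward_auth_outcome(
    status: int,
    auth_headers: dict[str, str],
    copy_headers: tuple[str, ...] = ("Remote-User", "Remote-Groups"),  # assumed set
):
    """Model what happens after the auth service answers the subrequest."""
    if 200 <= status < 300:
        # Allowed: continue to the backend, passing along only the
        # identity headers listed in copy_headers.
        passed = {k: v for k, v in auth_headers.items() if k in copy_headers}
        return ("proxy", passed)
    # Denied or not logged in: the auth response (for example a redirect
    # to the login portal, or a 401) goes back to the client unchanged.
    return ("respond", status)
```

This is why services behind Authelia need no user database of their own: the backend only ever sees requests that already carry an identity vouched for by Authelia.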
## Observability

Metrics, logs, and traces ship to **Grafana Cloud** from every host via **Grafana Alloy**. The Alloy collectors are registered in Grafana Fleet Management (configured in `terraform/grafana/`). Synthetic uptime checks for the public sites run from Grafana Cloud probes, and PagerDuty handles alert delivery.

> **History:** Monitoring used to run locally on london-a (FreeBSD, with Prometheus + Grafana). london-a has since been wiped and reinstalled as Proxmox VE; the local stack was retired in favour of Grafana Cloud. See [monitoring.md](monitoring.md) for the current setup.
## Design Principles

- **Self-hosted first.** Cloud VPSs only where they make sense (public gateway, mail with clean IP reputation). Everything else runs on physical hardware I own.
- **Tailscale as the backbone.** No ports exposed on residential IPs. All inter-server communication goes over the mesh.
- **Ansible for everything.** If a server dies, reinstall the OS, install Tailscale, run `make deploy`. Roughly 30 minutes to full recovery.
- **Terraform for cloud + DNS.** Hetzner servers, DNS records, Grafana Cloud configuration, and PagerDuty are all in code. No clicking around in dashboards (the one exception is `pez.solutions`, which is still dashboard-managed in Cloudflare).
- **Cattle, not pets (as much as possible).** The servers are technically pets (old hardware in specific locations), but the configs are cattle. Everything is reproducible from this repo.