Mono-repo for my server stack
Rasmus Wejlgaard 89b21fd6fc fix: stop masking failed service deploys; trim dead config
The docker_services and systemd_services roles ran their "start the
service" tasks with `failed_when: false`, so a container or unit that
failed to come up still reported the deploy as green. Drop it from both
start tasks so a broken deploy actually fails CI. The compose/unit *copy*
tasks keep `failed_when: false` — that's load-bearing for the
`item is not failed` filter that skips services without a compose/unit file.

Also:
- Remove a duplicate "Template service .env files" task in docker_services
  (second copy used a hardcoded path and didn't register; first one is the
  one the start task reads).
- Don't trigger a full fleet deploy on docs/markdown/workflow-only pushes
  to main — add docs/**, **/*.md and .github/** to paths-ignore.
- Drop the dangling `update-freebsd` Make target (playbook doesn't exist;
  fleet has no FreeBSD hosts).
2026-06-04 18:37:37 +01:00
.github fix: stop masking failed service deploys; trim dead config 2026-06-04 18:37:37 +01:00
ansible fix: stop masking failed service deploys; trim dead config 2026-06-04 18:37:37 +01:00
docs fix: Documentation overhaul (#112) 2026-05-19 18:49:21 +01:00
terraform ci: serialize terraform and deploy runs with concurrency guards (#114) 2026-06-02 19:39:13 +01:00
.gitignore update readme 2026-03-28 21:06:14 +00:00
.sops.yaml initial commit 2026-03-28 12:39:41 +00:00
Makefile initial commit 2026-03-28 12:39:41 +00:00
README.md fix: Documentation overhaul (#112) 2026-05-19 18:49:21 +01:00

pez-infra

Infrastructure-as-code monorepo for managing my homelab and cloud server fleet. It contains everything needed to rebuild, configure, and maintain the entire infrastructure from scratch — including server provisioning, service deployment, DNS, monitoring, and secrets management.

What's in this repo

  • Ansible — Playbooks, roles, and inventory for configuring servers, deploying Docker-based services, and managing dotfiles
  • Terraform — OpenTofu/Terraform configs for cloud resources (Hetzner Cloud, Cloudflare DNS, Grafana Cloud, PagerDuty)
  • Services — Docker Compose definitions and config files for each self-hosted service
  • Documentation — Architecture decisions, networking topology, and operational guides

Architecture Overview

graph TD
    CF[Cloudflare<br/>DNS + CDN] --> HEL[helsinki-a<br/>Caddy proxy + SSO<br/><i>Hetzner Cloud</i>]
    HEL --> TS{Tailscale mesh}
    TS --> LB[london-b<br/>Storage, media<br/>Docker + systemd]
    TS --> LA[london-a<br/>Proxmox VE hypervisor]
    TS --> LC[london-c<br/>Raspberry Pi<br/>Octopus Energy exporter]
    TS --> CA[copenhagen-a<br/>Gaming<br/>Minecraft, WoW MaNGOS]
    TS --> NUR[nuremberg-a<br/>Mail, poste.io]
    TS --> CC[copenhagen-c<br/>Raspberry Pi<br/>cloudflared, idle]
    TS -.-> GC[Grafana Cloud<br/>metrics, logs, traces]

Traffic enters via Cloudflare DNS, hits the Caddy reverse proxy on helsinki-a (a Hetzner Cloud instance), and is forwarded to backend services on the other hosts over the Tailscale mesh network. Authentication for protected services is handled by Authelia with an LLDAP backend. Metrics, logs, and traces are shipped from every host to Grafana Cloud via Grafana Alloy.

Hosts

| Host | Location | OS | Role |
|------|----------|----|------|
| helsinki-a | Hetzner Cloud (Helsinki) | Debian 13 | Reverse proxy (Caddy), SSO (Authelia + LLDAP), Bitwarden, Forgejo |
| london-b | London | Ubuntu 24.04 | Primary storage (ZFS), media servers, *arr stack |
| london-a | London | Debian 13 / Proxmox VE | Hypervisor (currently runs a Mac VM; platform for future VMs) |
| london-c | London | Debian 13 (Raspberry Pi) | Octopus Energy exporter, edge utility box |
| nuremberg-a | Hetzner Cloud (Nuremberg) | Debian 13 | Mail server (poste.io) |
| copenhagen-a | Copenhagen | Ubuntu 22.04 | Gaming servers (Minecraft, WoW/MaNGOS) |
| copenhagen-c | Copenhagen | Debian 12 (Raspberry Pi) | cloudflared tunnel, idle/available |

Directory Structure

├── ansible/        # Ansible playbooks, roles, inventory, and all managed files
│   ├── roles/      # Ansible roles (caddy, docker, media_stack, proxmox_ve, etc.)
│   ├── services/   # Docker Compose definitions and service configs
│   ├── dotfiles/   # Shell config (fish, nvim, tmux, git, etc.)
│   ├── playbooks/  # One-off playbooks (updates, reboots, status)
│   └── scripts/    # Utility and maintenance scripts
├── terraform/      # Terraform/OpenTofu for Hetzner, Cloudflare, Grafana Cloud, PagerDuty
└── docs/           # Architecture, networking, services, monitoring, and per-host docs

Getting Started

Prerequisites

  • SSH access to the hosts over Tailscale (connections to every host are made as root)
  • ansible for configuration management
  • tofu (OpenTofu) or terraform for infrastructure provisioning
  • sops + age for editing encrypted secrets
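
A quick way to confirm the tools listed above are on your PATH is a small shell check. This is only a sketch and not part of the repo: the function name check_prereqs is hypothetical, and it assumes your Ansible install exposes an ansible binary.

```shell
# Hypothetical helper: report which prerequisite tools are missing from PATH.
check_prereqs() {
    missing=""
    for tool in ansible sops age; do
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Either OpenTofu (tofu) or Terraform satisfies the provisioning requirement.
    if ! command -v tofu >/dev/null 2>&1 && ! command -v terraform >/dev/null 2>&1; then
        missing="$missing tofu-or-terraform"
    fi
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "all prerequisites found"
}
```

Running check_prereqs prints either the list of missing tools or a confirmation line, and its exit status can gate a setup script.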

Usage

  1. Clone: git clone git@github.com:RWejlgaard/pez-infra.git
  2. Services: Each service has its own directory under ansible/services/ with a docker-compose.yml and config files
  3. Deploy: run cd ansible && make deploy to apply the unified deploy.yml playbook to the whole fleet, or make deploy-host HOST=<name> to target a single host
  4. Infrastructure: Terraform configs in terraform/ manage Hetzner servers, Cloudflare DNS, Grafana Cloud, and PagerDuty
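
The deploy step can be wrapped in a small shell helper. This is a minimal sketch, assuming it is run from the repository root; the deploy function name is hypothetical, while the make targets are the ones named in step 3. It uses make -C ansible, which is equivalent to changing into ansible/ before invoking make.

```shell
# Hypothetical wrapper around the Makefile targets in ansible/.
# No argument: deploy to the whole fleet. One argument: deploy to that host.
deploy() {
    if [ "$#" -eq 0 ]; then
        make -C ansible deploy
    else
        make -C ansible deploy-host HOST="$1"
    fi
}
```

With this loaded, deploy runs the fleet-wide target and deploy london-b passes HOST=london-b to the deploy-host target.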

Secrets

Secrets are encrypted in-repo using SOPS + age. Encrypted files carry .enc. in their filename, before the final extension (e.g. secrets.enc.yml). See Secrets Management for full setup and usage instructions.
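
The naming convention makes encrypted files easy to detect from a script. The sketch below is illustrative only: is_encrypted and view_secret are hypothetical helper names, and it assumes your age key is already available to sops. The Secrets Management doc remains the authoritative reference for the real workflow.

```shell
# Hypothetical helper: succeed if the filename contains ".enc." before its extension.
is_encrypted() {
    case "$(basename "$1")" in
        *.enc.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: print the decrypted contents of an encrypted file to stdout.
view_secret() {
    if ! is_encrypted "$1"; then
        echo "refusing: $1 does not follow the .enc. naming convention" >&2
        return 1
    fi
    sops --decrypt "$1"
}
```

For example, view_secret secrets.enc.yml decrypts to the terminal without writing plaintext to disk, while view_secret secrets.yml is rejected before sops is called.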

Documentation

Detailed documentation lives in docs/:

  • Architecture — Network topology, traffic flow, design principles
  • Networking — Tailscale mesh, DNS flow, physical networking
  • Services — Complete service map with ports, auth, and deployment info
  • Monitoring — Grafana Cloud, Alloy, synthetic checks, PagerDuty
  • Hosts — Per-host detail (hardware, services, quirks)
  • Getting Started — How to work with this repo