cloudflared was retired in #56 when Caddy + Authelia replaced Cloudflare
Tunnels, but copenhagen-a was unreachable at the time so its
cloudflared.service was never stopped and is still running.
Add a cleanup task to the common role that stops, disables and purges
cloudflared wherever the unit lingers. Gated on the unit file existing so
it self-targets copenhagen-a and is a no-op everywhere else, and explicitly
excludes copenhagen-c, which legitimately runs a hand-configured tunnel.
Samba on london-b was allowed on 445/tcp from anywhere via UFW, exposing
SMB/CIFS to the public internet. Tailscale already reaches it through the
tailscale0 allow-all rule, so scope the explicit rule to the local London
LAN (192.168.1.0/24) instead of the world.
The common UFW task only ever adds allow rules, so it gained support for an
optional per-port from_ip, plus a follow-up task that deletes the superseded
world-open variant of any source-restricted port — otherwise the old
'445 ALLOW Anywhere' rule would linger on the host and defeat the change.
PESO-145
Bookshelf (PR #122) is a Readarr revival and now owns port 8787 on
london-b, so the old custom Readarr systemd unit is removed:
- drop readarr from the media_stack role's unit-deploy and enable loops,
and add an idempotent decommission task (stop, disable, remove unit)
so the host tears it down via Ansible rather than ad-hoc SSH
- delete services/readarr/readarr.service
- update docs (services, london-b host, service inventory) to describe
bookshelf as a Docker service instead of a custom systemd unit
The public readarr.pez.sh hostname is kept and now reverse-proxies to
bookshelf on :8787 — DNS, Caddy and Authelia (pez_readarr_users group)
are unchanged.
Bookshelf (a Readarr revival) for managing the ebook/audiobook library.
Runs on london-b with config at /root/bookshelf and the library at
/hdd/books mounted into the container at the same path.
The .terraform.lock.hcl was gitignored while providers use floating
~> constraints, so every CI 'tofu init' resolved provider versions
fresh and could drift from what was tested locally, with no checksum
verification on the providers.
Track the lock file instead, with hashes for linux_amd64 (CI) plus
darwin_arm64/amd64 (local). Dependabot's terraform updates now surface
exact provider version bumps as reviewable, hash-pinned changes.
The nightly job runs 'rclone sync', which permanently deletes or overwrites
objects at the B2 destination. That means an accidental deletion or a
ransomware encryption on /hdd propagates straight to the backup on the next
run, leaving no clean copy.
Add --backup-dir so every superseded version is moved into a dated folder
under _versions/ rather than thrown away, and prune that folder after 30
days so it doesn't grow unbounded.
The docker_services and systemd_services roles ran their "start the
service" tasks with `failed_when: false`, so a container or unit that
failed to come up still reported the deploy as green. Drop it from both
start tasks so a broken deploy actually fails CI. The compose/unit *copy*
tasks keep `failed_when: false` — that's load-bearing for the
`item is not failed` filter that skips services without a compose/unit file.
Also:
- Remove a duplicate "Template service .env files" task in docker_services
(second copy used a hardcoded path and didn't register; first one is the
one the start task reads).
- Don't trigger a full fleet deploy on docs/markdown/workflow-only pushes
to main — add docs/**, **/*.md and .github/** to paths-ignore.
- Drop the dangling `update-freebsd` Make target (playbook doesn't exist;
fleet has no FreeBSD hosts).
* chore: add dependabot config
Add Dependabot for the three supported ecosystems in this repo:
GitHub Actions, Terraform (root + grafana/hetzner/pagerduty modules),
and Docker (service compose files + dotfile Dockerfiles). Weekly
schedule with per-ecosystem grouping to keep PR noise down.
* ci: make terraform validation work on dependabot PRs
Dependabot PRs run with no access to repository secrets and a read-only
token, so the SOPS decrypt step (and the PR-comment step) fail. Give
Dependabot a secret-free path: stub the secrets.yaml keys it reads and
run init -backend=false + validate, skipping decrypt/plan/comment. Human
PRs are unchanged and still get a full plan.
Add Dependabot for the three supported ecosystems in this repo:
GitHub Actions, Terraform (root + grafana/hetzner/pagerduty modules),
and Docker (service compose files + dotfile Dockerfiles). Weekly
schedule with per-ecosystem grouping to keep PR noise down.
* ci: serialize infra runs and enable terraform state locking
Add concurrency guards to the terraform and deploy-on-merge workflows so
two merges in quick succession can't run against the same state or the
same hosts at once (queue, never cancel an in-flight run).
Enable native S3 state locking (use_lockfile) on the Backblaze B2 backend,
which needs OpenTofu 1.10+, so bump the CI tofu version 1.9.0 -> 1.10.10
and the required_version constraint to >= 1.10.0.
* ci: bump tofu to 1.10.10 in the validate workflow too
Missed this one in the last commit — the PR-time validate still pinned
1.9.0, which trips the new required_version >= 1.10.0 constraint.
* ci: drop use_lockfile — Backblaze B2 can't do native state locking
B2's S3 API returns 501 NotImplemented for the conditional PutObject that
use_lockfile relies on, so tofu plan/apply fails to acquire the lock.
Revert the lockfile and the 1.10 version bump it required; rely on the
concurrency guard to serialize applies instead. Left a note in the
backend block so this isn't re-attempted.
* Grafana Cloud migration, adding dashboards, fleet, alloy and synthetics
* modulize stuff now that we have multiple substantial things in here
* provider updates and new secrets
* remove grafana and prometheus from ansible