Part 2 was entirely headwork: microservice boundaries, RAG strategy, tag dimensions, storage path templates — all decisions made before a single line of code was written. Phase 3 is different. Phase 3 is the moment you press Apply and something moves on real hardware.
Part 3 of the Building BuchhalterPython series. Part 1 covers agentic infrastructure and golden standards. Part 2 covers the five architectural decisions made before writing any business logic.
Concretely: four new LXC containers on a Proxmox host, provisioned via OpenTofu and configured via Ansible. Redis as the Celery broker, a Harvester with FastAPI and Celery, Classifier and Patcher as pure workers. Sounds like two hours of work. It was more.
Here is what actually happened.
The Good Decisions
Redis on Alpine
Redis gets Alpine 3.23. Not Debian, not Ubuntu.
- Alpine image: roughly 5 MB. Debian image: roughly 120 MB.
- Alpine in LXC without overhead: 256 MB RAM, 4 GB NVMe disk. Done.
- Attack surface correspondingly small.
- Redis 7.x is a first-class citizen in Alpine’s package manager: apk add redis.
Nothing is sacrificed. Everything is gained. The decision took thirty seconds.
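With 256 MB for the whole container, it is worth capping Redis itself so the broker never fights the OS for memory. A hypothetical sizing fragment, not the project's actual configuration:

```conf
# /etc/redis.conf — illustrative sizing for a 256 MB container
maxmemory 192mb
maxmemory-policy noeviction   # a Celery broker must not silently evict tasks
save 900 1                    # optional RDB snapshot: every 15 min if dirty
```

The noeviction policy matters for broker workloads: under memory pressure Redis refuses writes loudly instead of dropping queued tasks quietly.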
Storage: NVMe for the Broker, Slow Storage for Everything Else
This decision looks petty at first glance and sensible at the second.
Redis needs IOPS. It is an in-memory store with optional RDB snapshots. Every Celery task reads and writes to the queue. That is exactly the workload NVMe was built for.
The Python services — Harvester, Classifier, Patcher — are RAM-bound and network-bound. They write code files once at deployment. They read them at startup. After that, every relevant operation is an HTTP call outbound (Paperless API, Mistral API, ChromaDB) or Celery queue communication over Redis. The disk is nearly irrelevant.
slow_storage is an old SATA SSD sitting in the same server alongside the NVMe — not fast, but perfectly adequate for workloads that are barely disk-bound. Routing the Python service containers there protects the NVMe from unnecessary write cycles. NVMe wear in a homelab is real, and nobody wants to rebuild a Proxmox host because a storage device wore out prematurely on workloads that would have run fine on a five-year-old SSD.
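In OpenTofu terms the split is a one-line difference per container. A sketch only: resource and attribute names here assume a Proxmox LXC provider and vary between providers; storage IDs are illustrative.

```hcl
resource "proxmox_lxc" "bhp_redis" {
  hostname = "bhp-redis"
  memory   = 256
  rootfs {
    storage = "local-nvme"    # broker: IOPS-heavy, belongs on NVMe
    size    = "4G"
  }
}

resource "proxmox_lxc" "bhp_harvester" {
  hostname = "bhp-harvester"
  rootfs {
    storage = "slow_storage"  # RAM/network-bound service: old SATA SSD is fine
    size    = "8G"
  }
}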
Celery and FastAPI in the Same Container
This is the decision that requires explanation — not because it is wrong, but because it looks counterintuitive.
Classical microservice logic says: separate everything. FastAPI in container A, Celery worker in container B.
The counter-question: why?
Horizontal scaling of the Harvester is not a planned goal. The Harvester API does exactly one thing: receive Paperless webhooks and push tasks into the Celery queue. That is a trivial FastAPI function. The real compute work sits in the Celery worker underneath it: OCR via Mistral, embedding, ChromaDB push.
Both in the same LXC container: one container, one process manager (supervisor), one virtual environment. No network hop between API and worker. No second deployment target. No additional monitoring surface.
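Under supervisor, the one-container layout is two program blocks in one file. Paths and program names are illustrative:

```ini
; /etc/supervisor/conf.d/bhp-harvester.conf — hypothetical sketch
[program:harvester-api]
command=/opt/bhp/venv/bin/uvicorn harvester.api:app --host 0.0.0.0 --port 8000
autorestart=true

[program:harvester-worker]
command=/opt/bhp/venv/bin/celery -A harvester.tasks worker --loglevel=info
autorestart=true
```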
Classifier and Patcher are different: they have no API. They only listen on their Celery queue. No FastAPI needed. So: pure worker containers, no uvicorn.
The principle is minimal separation where no real benefit from separation exists. Not separation for its own sake.
Naming Convention: bhp- as Project Namespace
The containers were initially planned with simple names: harvester, classifier, patcher, redis.
The problem is subtle and only reveals itself when more than one project runs on the same Proxmox host. These names are generic. On a homelab host with twelve containers, redis is a disaster as a name. Which Redis? For which project?
Three letters solve the problem: bhp-. BuchhalterPython. DNS-safe (no underscores), immediately recognisable, consistent throughout.
bhp-redis → LXC 151
bhp-harvester → LXC 152
bhp-classifier → LXC 153
bhp-patcher → LXC 154
This prefix appears consistently everywhere: the Proxmox UI, the Ansible inventory, SSH aliases, the Caddy reverse proxy configuration. No guesswork.
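The DNS-safety claim is easy to verify mechanically. A stdlib-only sketch using the RFC 1123 hostname-label rule (lowercase alphanumerics and hyphens, no underscores):

```python
import re

# RFC 1123 label: starts/ends alphanumeric, hyphens allowed inside, max 63 chars.
DNS_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")

names = ["bhp-redis", "bhp-harvester", "bhp-classifier", "bhp-patcher"]
assert all(DNS_LABEL.match(n) for n in names)
assert not DNS_LABEL.match("bhp_redis")  # underscores are not DNS-safe
```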
Naming conventions are not bureaucracy. They are infrastructure for human brains.
The Hallucinated Anthropic Key
The Infra Agent generated Ansible Vault templates. Correct structure, correct Ansible syntax, correct Vault variables. And then, in the middle of it:
# VAULT — do not check in as plaintext
redis_password: "..."
mistral_api_key: "..."
anthropic_api_key: "..." # ← not in this project. not planned. does not exist.
paperless_api_token: "..."
anthropic_api_key. In a project that uses Mistral exclusively. I had provided the full project documentation as context. The tech stack was documented. The agent still added a key for a service I am not using.
What happened: the agent pattern-matched. AI project plus Vault template equals, in its training distribution, something that frequently involves Anthropic. So it added it. Automatically. Without checking the context I had explicitly provided.
This sounds like a small mistake. It is a large, recurring pattern.
AI agents do not only hallucinate facts. They fall back into default patterns even when the correct answer is in the context window.
That is the uncomfortable part. The documentation was there. The constraints were implicit in it. The agent ignored them and generated what it expected a project like this to look like — not what this project actually is. I caught it on review. But I should not have had to: the information needed to get this right was available.
The Lernreise series covers this same failure mode from a different angle: the agent is not reasoning. It is interpolating. The difference matters when you are building something real.
The workaround: add an explicit directive to the agent specification itself. Not general project documentation. A direct line that cannot be pattern-matched away:
# AI platform: Mistral (OCR, Embedding, LLM)
MISTRAL_API_KEY={{ mistral_api_key }}
# No Anthropic key — this project uses Mistral exclusively
This is frustrating. It should not be necessary. The information was already there. But it is necessary, and pretending otherwise means the next template will have the same problem.
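Catching this class of error does not have to rely on human review alone. A hypothetical guardrail, not part of the project: a few lines that fail fast if a rendered secrets template mentions an AI provider outside the documented stack.

```python
# Hypothetical lint: reject templates that reference providers this
# project does not use. The forbidden list is an illustrative assumption.
FORBIDDEN = ("anthropic", "openai")

def check_template(text: str) -> list[str]:
    """Return forbidden provider names found in the template text."""
    lowered = text.lower()
    return [name for name in FORBIDDEN if name in lowered]

template = 'mistral_api_key: "..."\nanthropic_api_key: "..."\n'
assert check_template(template) == ["anthropic"]
```

Run against every agent-generated Vault template before commit, this turns a pattern-matching failure into a loud CI error instead of a silent extra key.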
Operational Reality: DHCP and Other Surprises
The workflow for new containers on this Proxmox setup is: first DHCP, read the MAC address, reserve a static IP in the Fritz!Box, then update OpenTofu with the static IPs, then tofu apply again.
In theory: clean. In practice: DHCP assigned different IPs than planned.
The addresses .184 through .187 were intended, checked as free in the existing network inventory before the Apply. What was forgotten: the Fritz!Box has a DHCP pool that follows its own assignment logic. The desired IPs were free, but DHCP temporarily assigned different ones. The consequence: after the first Apply, note the MAC addresses, assign the correct IPs statically in the Fritz!Box interface, then tofu apply again with the static addresses.
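The two passes differ only in the network stanza. A sketch assuming a Proxmox LXC provider's network block; attribute names vary by provider, and the gateway address is an assumption:

```hcl
# Pass 1: boot with DHCP, read the MAC, reserve the IP in the Fritz!Box.
network {
  name   = "eth0"
  bridge = "vmbr0"
  ip     = "dhcp"
}

# Pass 2: re-apply with the reserved static address.
network {
  name   = "eth0"
  bridge = "vmbr0"
  ip     = "192.168.177.184/24"
  gw     = "192.168.177.1"
}
```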
This is not a bug in OpenTofu. It is the feature: DHCP-based bootstrapping is intentionally built so containers start with DHCP and then become static. But it requires a manual step in the Fritz!Box web interface that was neither automatable nor documented in the original plan.
Lesson: the Fritz!Box step belongs explicitly in the execution sequence, with a warning that DHCP addresses may differ from planned ones. This is exactly the kind of infrastructure-as-code subtlety that does not appear in tutorials.
The RAM Upgrade That Required a Reboot
dhost-slow — the VM running Drone CI/CD — had 1 GB RAM. For Python build jobs with pip install and pytest against a growing codebase: insufficient.
Upgrade to 1.5 GB.
The question beforehand: does this VM have balloon support? Balloon memory enables hot-plug — a RAM change without a reboot. Answer: no. balloon: 0 in the Proxmox configuration.
So: qm set 201 --memory 1536, then qm reboot 201. Drone CI/CD was offline for approximately thirty seconds. Nobody noticed because it happened at night. Still: this is exactly the moment that belongs in a rollback strategy.
Lesson: enable balloon support for new VMs from the start. One parameter in the Proxmox configuration prevents reboots for RAM changes. Worth it every time.
What Did Not Work with the PM Agent
The honest part.
Phase 3 issues were too large. “Provision all containers” as a single issue is big-bang thinking. What happens in practice: the agent (or the human) sees a large issue, starts on it, realises after thirty minutes that sub-steps are their own problems, and then splits the issue retrospectively — under pressure, with less clarity.
The correct pattern for Phase 4: atomic issues. One issue per container. One issue for the naming convention. One issue for the RAM upgrade. Each small enough to close in a single session, with clear acceptance criteria and no ambiguity.
This is the same lesson the Lernreise series taught about n8n workflows: big-bang complexity does not fail loudly. It accumulates quietly until it collapses. The discipline of small, closeable units of work is not a methodology preference. It is a survival strategy for agentic development.
Current State: Containers Running, Ansible Pending
192.168.177.184 → LXC 151: bhp-redis ✅ running
192.168.177.186 → LXC 152: bhp-harvester ✅ running
192.168.177.187 → LXC 153: bhp-classifier ✅ running
192.168.177.185 → LXC 154: bhp-patcher ✅ running
OpenTofu created the containers. Ansible provisioning — Redis configuration, Python virtual environments, supervisor setup, code deployment — is documented and waiting for execution.
This is not a backlog item. It is a principle: plan completely, document before executing, establish rollback strategy, then Apply.
What Phase 4 Will Do Differently
Three things.
Agent constraints will be more explicit. Not implicit through project documentation that an agent might overlook. A direct directive in each agent specification: “This project uses Mistral exclusively as its AI platform.” No ambiguity.
Issues will be atomic. No “provision all containers.” Instead: four issues, four PRs, four commits. Smaller blast radius when something goes wrong.
Balloon support will be enabled by default on new VMs. One parameter in the Proxmox configuration prevents future reboots for RAM changes. The cost is zero. The benefit is operational flexibility.
Takeaway
Working with AI on infrastructure is not difficult because AI is bad at infrastructure. It is difficult because AI confidently fills in blanks you did not know you left.
I provided full project documentation. The agent still added a key for a service I am not using. The DHCP plan was documented. OpenTofu still needed two passes because a manual step was not explicit enough. The PM agent issues were documented in detail. They were still too coarse to be useful.
In each case: the documentation existed. The agent did something adjacent to what was needed. Not wrong enough to be obviously wrong. Just wrong enough to need catching.
That is the development experience this series is actually documenting. Not “AI makes everything faster” — sometimes it does — but the specific, repeatable ways AI assistance introduces errors that require active human correction. You do not get to switch off. You get a faster tool that requires more careful oversight than the slower tool it replaces.
Next in this series: Types, Tests, and a Hallucinated API Key (Part 4) — Pydantic models, TDD workflow, and why we rejected LangChain.