Stefan Pauleweit

Notes from between the keyboard and the chair.

A personal blog on learning journeys, technical craft, and working with AI. Honest accounts from the middle of the work, not the polished end of it.

3 min read

Anthropic Tightened the Limits and Called It a Feature

New blog posts have been thin on the ground here lately. The BuchhalterPython series has stalled. Both of those things have the same cause: Anthropic quietly made Claude Pro nearly useless for me, then told me that was the plan.

What I Actually Use

Claude Pro. €20 a month. Sonnet 4.6 only. No Opus. No 1M context sessions. I am the most basic paying customer they have.

In February I started using the paid plan. I had no illusions: €20 is not a lot of money. Hitting session limits when working across three projects at once was expected. But I could work for three to four hours before a limit kicked in. That was enough to get real things done.

claude anthropic rant tools
5 min read

Building BuchhalterPython: Types, Tests, and a Hallucinated API Key (Part 4)

After Part 3 left four LXC containers running and the src/ directories empty, Phase 4 starts writing actual code. The first day of implementation produces two closed issues, one rejected framework, and one bug that had already been caught once.

Part 4 of the Building BuchhalterPython series. Part 3 covers provisioning four LXC containers with OpenTofu and Ansible.

The Bug That Appeared Twice

Issue #12 had already documented a type error in the Pydantic model for the Paperless API: tags: list[str] instead of list[int]. Paperless returns tag IDs as integers. The model had used strings. That bug was fixed, documented in llm/domain-knowledge/01-paperless-api-reference.md, and noted in ERROR_PATTERNS.md.
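To make the tag bug concrete, here is a minimal sketch of the idea. It is not the project's actual model: it uses a standard-library dataclass as a dependency-free stand-in for the Pydantic model, and the class name PaperlessDocument, the helper parse_document, and the payload fields id and title are assumptions for illustration. Only the tags field and its list[int] type come from the post.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PaperlessDocument:
    """Hypothetical, trimmed-down stand-in for the Paperless document model."""

    id: int
    title: str
    # Tag IDs as integers. Annotating this as list[str] was the original bug.
    tags: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Dataclasses do not enforce annotations, so check the element type
        # explicitly. bool is excluded because it is a subclass of int.
        for tag in self.tags:
            if isinstance(tag, bool) or not isinstance(tag, int):
                raise TypeError(f"tag IDs must be int, got {tag!r}")


def parse_document(payload: dict) -> PaperlessDocument:
    """Build the model from a decoded JSON payload (field names are assumed)."""
    return PaperlessDocument(
        id=payload["id"],
        title=payload["title"],
        tags=list(payload.get("tags", [])),
    )


# Integer tag IDs are accepted as-is.
doc = parse_document({"id": 1, "title": "Invoice", "tags": [3, 7]})
```

A payload with string tags, such as {"id": 2, "title": "Receipt", "tags": ["3"]}, raises TypeError in this sketch. A test that feeds the parser a sample payload with integer tag IDs is one way to stop the same mismatch from slipping back in.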

pydantic tdd python agentic-coding langchain microservices domain-knowledge testing
9 min read

Building BuchhalterPython: Provisioning Infrastructure (Part 3)

Part 2 was entirely headwork: microservice boundaries, RAG strategy, tag dimensions, storage path templates — all decisions made before a single line of code was written. Phase 3 is different. Phase 3 is the moment you press Apply and something moves on real hardware. Part 3 of the Building BuchhalterPython series. Part 1 covers agentic infrastructure and golden standards. Part 2 covers the five architectural decisions made before writing any business logic.

infrastructure opentofu ansible proxmox homelab agentic-coding celery redis
8 min read

Building BuchhalterPython: Architecture Before the First Commit (Part 2)

Part 2 of the BuchhalterPython series. Part 1 covers how we set up agentic infrastructure — six specialised agents, golden standards, and token optimisation — before writing any application code. We spent a full day on architecture before writing a single line of business logic. No features, no endpoints, no database schemas. Just decisions. By the end of that day, we had five architectural choices that each prevented at least one production failure. Three of those failures would have been silent.

agentic-ai architecture rag document-processing microservices
4 min read

Building BuchhalterPython: How We Set Up Agentic Infrastructure Before Writing Code

Before writing the first line of application code for BuchhalterPython, we invested a few hours designing and building agentic infrastructure. That’s one of the things AI fundamentally changes: what used to take weeks of scaffolding and deliberation now takes an afternoon. This wasn’t bureaucracy dressed up as planning. It was the difference between a project that ships and one that collapses under its own ambition. The temptation is always the same: start coding immediately. You know what you want to build. You’ve sketched the data model on a whiteboard. Why not just start? The honest answer: because infrastructure becomes infinitely more expensive to change after code depends on it.

agentic-ai infrastructure tdd microservices
5 min read

Lernreise 7/7: n8n, a Dead ThinkPad, and What's Next

Every project like this ends with a set of opinions you did not have before. Here are mine.

n8n

I will be charitable and say: n8n is excellent for linear workflows of three to five nodes. Trigger, action, done. For anything more complex, it becomes something I would describe, with some restraint, as binary toxic waste.

Visual workflow tools have an inherent problem: the visual representation is the code. You cannot refactor it the way you refactor code. You cannot diff it sensibly. You cannot review it in a pull request. When a node is producing wrong output and you need to understand why, you are clicking through a canvas, unfolding nested expressions, reading JavaScript embedded in a UI field that was not designed to hold much JavaScript.

lernreise ai n8n lessons-learned homelab

Error between keyboard and chair.

EBKAC is a tongue-in-cheek nod to the classic support desk error code. This site documents what I learn, build, and break — in IT, AI, and everything adjacent.
