Lernreise 5/7: Day 3: Fifty Nodes and a Burning Budget

By day three, the main workflow had fifty-two nodes.

I want you to sit with that number for a moment. Fifty-two nodes in n8n. Conditional branches, error handlers, HTTP request nodes, code nodes with JavaScript doing things that code nodes in a visual workflow tool were never meant to do. Sub-expressions referencing field names from nodes seventeen steps back. The canvas was a tangle of lines that looked, from a distance, like someone had dropped a bowl of spaghetti on a circuit board and decided to ship it.

The AI had built most of it. I had been letting it run, intervening only to correct errors and run tests. This is the mode I had wanted: AI does the construction, I do the architecture and review. The theory was sound.

In practice, the token consumption was extraordinary. Every time I fed the full workflow state into context for debugging, it ate tokens. Every correction required re-reading the entire structure. Every test failure triggered another round of analysis.


Then the Anthropic Pro Plan said, politely but firmly: that is enough for this week.

Not those words exactly. But the rate limiting was clear. I had hit the ceiling, and the ceiling was not going to move until the following week.

I topped up the budget. Not ideal for a fifty-euro experiment, but I was too far in to stop. And I had learned something from the spending limit that I should have learned earlier: this workflow was not manageable at this size.

The fix was obvious once I accepted it. Divide and conquer. One workflow of fifty-plus nodes becomes three workflows of fifteen to twenty nodes each:

  1. A collection workflow. Fetches documents from Paperless, runs OCR if needed, produces clean text.
  2. A RAG workflow. Embeds the text, queries ChromaDB, makes the similarity decision, calls the LLM with appropriate context.
  3. A patch workflow. Takes the metadata decisions and writes them back to Paperless.

Three workflows, each manageable. Each testable in isolation. Each small enough that the AI could hold the entire structure in context without getting confused.
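
To make the handoff concrete, here is a sketch of the contract between the three workflows, written as plain JavaScript functions rather than the actual n8n graphs. Every name and stub body below is invented for illustration; the real point is the shape. Each stage takes and returns an array of { json: ... } items, which is the form n8n moves between workflows.

```javascript
// Stand-ins for the three workflows. The stubs fake the work; only the
// item shape flowing between stages is meant to be accurate.

async function collect(ids) {
  // Workflow 1 stand-in: fetch from Paperless, OCR if needed.
  return ids.map((id) => ({ json: { id, text: `clean text of document ${id}` } }));
}

async function enrich(items) {
  // Workflow 2 stand-in: embed, query ChromaDB, call the LLM.
  return items.map(({ json }) => ({
    json: { ...json, tags: ["Household"], documentType: "Invoice" },
  }));
}

async function patch(items) {
  // Workflow 3 stand-in: write the metadata decisions back to Paperless.
  for (const { json } of items) {
    console.log(`would PATCH document ${json.id} with`, json);
  }
}

const collected = await collect([101, 102]);
const enriched = await enrich(collected);
await patch(enriched);
```

Because each stage only ever sees that array, each one can be run and tested on its own, which is the entire benefit of the split.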


The confusion, when it happened, was spectacular.

At a certain point, with enough complexity in context and enough back-and-forth correction, the model started losing track of its own previous reasoning. It would propose a fix that contradicted a fix it had suggested four messages earlier. It would re-introduce an error it had just resolved. The context window, handling a large workflow state plus extensive conversation history, had simply run out of room. The solution was the same as for the workflow: decompose.


With the three-workflow architecture in place, progress resumed. By day five, the main workflow was running. New documents were being processed. The OCR, the RAG query, the LLM call, the metadata patch: all working, in sequence, on real documents.

Then I tried to retag the existing 850 documents.

The workflow fired all 850 requests against Paperless in rapid succession. Paperless, which is running in an LXC container on a MiniPC with an N150 processor, did not enjoy this. It responded in the way that small services respond to machine-gun API load: it fell over. Quietly. The container was still up. The API just stopped responding meaningfully.

Adding rate limiting to the workflow took longer than it should have. Not because it is technically complex, but because n8n’s approach to rate limiting is, charitably, unintuitive. There are several ways to do it, none of them obviously correct, and the one that looked most sensible was not the one that worked.
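
The principle of the fix is easier to show in plain JavaScript than in n8n terms (inside n8n, one common pattern is a Loop Over Items node feeding a Wait node). The URL, token, delay, and document shape below are all placeholders, not my actual setup:

```javascript
const PAPERLESS_URL = "http://paperless.local:8000"; // placeholder
const TOKEN = "paperless-api-token"; // placeholder
const DELAY_MS = 500; // roughly two requests per second

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function patchDocument(id, body) {
  const res = await fetch(`${PAPERLESS_URL}/api/documents/${id}/`, {
    method: "PATCH",
    headers: {
      Authorization: `Token ${TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`PATCH ${id} failed: ${res.status}`);
}

async function retagAll(documents) {
  // Strictly sequential: one request in flight at a time, plus a pause,
  // so the API never sees 850 requests at once again.
  for (const doc of documents) {
    await patchDocument(doc.id, { tags: doc.tagIds });
    await sleep(DELAY_MS);
  }
}
```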


The Paperless metadata was its own category of problem. Paperless uses integer IDs for document types, tags, and correspondents. It does not accept string names in the API. The AI, when asked to patch metadata, would return the human-readable names it had been given as context: “Invoice”, “Household”, “Deutsche Telekom”. The Paperless API wants 14, 7, and 23.

The AI guessed. Confidently, wrongly, every time.

Sorting out the ID resolution required reading the Paperless-NGX API documentation properly and building a lookup step into the workflow. That part got fixed.
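
The lookup step amounts to pulling the name-to-ID tables down once and resolving names locally before every patch. A sketch, assuming the standard Paperless-NGX list endpoints (/api/tags/, /api/document_types/, /api/correspondents/), which return paginated { results: [{ id, name }, ...] } payloads; URL and token are placeholders:

```javascript
const PAPERLESS_URL = "http://paperless.local:8000"; // placeholder
const TOKEN = "paperless-api-token"; // placeholder

async function buildNameToIdMap(resource) {
  const map = new Map();
  let url = `${PAPERLESS_URL}/api/${resource}/?page_size=100`;
  while (url) {
    const res = await fetch(url, {
      headers: { Authorization: `Token ${TOKEN}` },
    });
    const page = await res.json();
    for (const { id, name } of page.results) map.set(name, id);
    url = page.next; // follow pagination until exhausted
  }
  return map;
}

const typeIds = await buildNameToIdMap("document_types");
// "Invoice" resolves to its integer ID, or undefined if the LLM
// invented a name that does not exist in Paperless.
const invoiceId = typeIds.get("Invoice");
```

The undefined case is the useful part: a name the model made up can be caught and routed to review instead of being guessed into an ID.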

Then there is the storage path. Paperless uses Jinja templates to determine where a document is physically filed. All the relevant Paperless metadata had been discussed. The AI simply forgot this part. Never touched it. The workflow was writing tags and types and correspondents and assuming the filing would take care of itself. It does not.
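
For concreteness, here is roughly what the missing piece looks like at the API level. Storage paths in Paperless-NGX are their own objects whose path field holds the template, and a document references one by integer ID, just like tags and types. The endpoint body, the template, and the constants below are assumptions for illustration, and the sketch covers only the assignment step, not the template-per-document logic that makes the retrofit expensive:

```javascript
const PAPERLESS_URL = "http://paperless.local:8000"; // placeholder
const TOKEN = "paperless-api-token"; // placeholder

const res = await fetch(`${PAPERLESS_URL}/api/storage_paths/`, {
  method: "POST",
  headers: {
    Authorization: `Token ${TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    name: "By correspondent and year", // placeholder name
    path: "{{ correspondent }}/{{ created_year }}/{{ title }}", // template syntax is an assumption
  }),
});
const { id } = await res.json();
// That integer ID is what belongs in the document's storage_path field,
// the field the workflow never set.
```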

To retrofit Jinja template handling into the existing n8n architecture would be a week of work by itself, with substantial token cost on top. That is not a small fix. It is effectively a rebuild of the whole workflow. And because the storage path is missing, the human-in-the-loop step cannot be used either — documents flagged for review are not correctly filed, so there is nothing to review and correct in the expected place. The feedback loop back into ChromaDB cannot close. The system cannot learn.

This is why the project is not finished. The document sorting works. The metadata extraction works. The metadata filing is only partly done: tags, types, and correspondents are written correctly, but the storage path is not. And the feedback loop, while designed, stays unusable until it is.


← Lernreise 4/7: The Grand Plan: RAG, Vectors, and a 7-Cent Bargain  ·  Lernreise 6/7: What AI Actually Can (and Cannot) Do →

Lernreise 5 of 7. Follow the lernreise tag for the full series.