Every time my n8n workflow runs with an AI agent, it starts completely fresh.
If the agent makes the same mistake twice, I have to fix it manually both times.
Plus, those “wait X seconds” timer nodes are brittle: sometimes they time out,
sometimes they’re too slow.
The Solution
I built AgentMem — a single Python file that:
Reads the agent’s past mistakes before every run
The agent learns from history automatically
Stops repeating the same errors
Replaces timers with an event bridge
The next step fires the EXACT instant the agent finishes
No guessing. No timeouts. No crashes.
Sends Gmail alerts when things break
Full attempt history + what went wrong
You know immediately instead of discovering it at 2am
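For anyone curious what the alert step could look like, here is a minimal sketch, not the actual AgentMem code. The function names (`build_alert`, `send_alert`), the subject format, and the shape of the `attempts` list are all assumptions made for illustration. It sends through Gmail's SMTP endpoint, which requires a Google app password rather than the normal account password.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, task, attempts):
    """Build a plain-text alert listing every attempt and its error.

    `attempts` is assumed to be a list of dicts with an "error" key.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[AgentMem] '{task}' failed after {len(attempts)} attempt(s)"
    lines = [f"Attempt {i}: {a['error']}" for i, a in enumerate(attempts, start=1)]
    msg.set_content("\n".join(lines))
    return msg


def send_alert(msg, app_password):
    """Send the message over Gmail's SMTP-over-SSL endpoint."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(msg["From"], app_password)
        smtp.send_message(msg)
```

Building the message separately from sending it keeps the formatting easy to check without touching the network.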
How it works with n8n
Just 3 nodes:
HTTP Request → POST to agentmem
Webhook node (replaces your timer)
Your workflow continues
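On the Python side, the handoff could be sketched roughly like this. It is an illustration under assumptions, not the real AgentMem implementation: `run_and_notify`, `post_json`, and the payload fields are hypothetical names, and `callback_url` is assumed to be the URL exposed by the Webhook node.

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON payload and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


def run_and_notify(task, callback_url, agent, post=post_json):
    """Run the agent, then call the webhook as soon as it returns."""
    try:
        payload = {"task": task, "status": "ok", "result": agent(task)}
    except Exception as exc:
        # Report failures too, so the workflow is not left waiting.
        payload = {"task": task, "status": "error", "error": str(exc)}
    post(callback_url, payload)
    return payload
```

The point of the pattern is that the workflow resumes because the agent finished, not because a guessed number of seconds went by. The `post` parameter is injectable only to make the function easy to test without a live n8n instance.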
Cost
$0 extra. No database. No cloud service. One Python file.
The event bridge pattern to replace timers is the part I find most interesting - timer nodes in n8n are genuinely unreliable under load, and using a webhook trigger as a callback from the Python process is a much cleaner handoff. The mistake memory layer on top of that is clever: injecting past failure context into the agent’s system prompt before each run is a simple pattern that could significantly reduce repetitive errors without needing a separate vector DB. Did you try this with the LangChain-based AI Agent node in n8n, or are you hitting the HTTP Request node directly?
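To make the failure-context injection described above concrete, here is a minimal sketch assuming a plain JSON file as the store. None of these names (`load_failures`, `record_failure`, `build_system_prompt`) or the record fields come from AgentMem itself; they are hypothetical and only show the general shape of the pattern.

```python
import json
from pathlib import Path


def load_failures(path, task, limit=5):
    """Return up to `limit` of the most recent failures recorded for a task."""
    p = Path(path)
    if not p.exists():
        return []
    history = json.loads(p.read_text(encoding="utf-8"))
    return history.get(task, [])[-limit:]


def record_failure(path, task, error, fix_hint=""):
    """Append one failure to the JSON file, creating the file if needed."""
    p = Path(path)
    history = json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}
    history.setdefault(task, []).append({"error": error, "fix_hint": fix_hint})
    p.write_text(json.dumps(history, indent=2), encoding="utf-8")


def build_system_prompt(base_prompt, failures):
    """Append past failures to the system prompt so the agent can avoid them."""
    if not failures:
        return base_prompt
    notes = []
    for f in failures:
        hint = f" (hint: {f['fix_hint']})" if f["fix_hint"] else ""
        notes.append(f"- {f['error']}{hint}")
    joined = "\n".join(notes)
    return f"{base_prompt}\n\nPast mistakes on this task, do not repeat them:\n{joined}"
```

Capping the history with `limit` keeps the prompt short. Past a certain history size, picking the relevant failures would likely need something smarter than "most recent".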
Appreciate this a lot — and thanks for taking the time to review my work this deeply.
The event bridge part was actually the core thing I wanted to solve because timer-based workflows in n8n felt very fragile under load.
The memory layer is still early and honestly not fully benchmarked yet — right now I’m directly hitting the HTTP Request node instead of doing deep LangChain integration. I wanted to keep the architecture lightweight first before adding more abstraction layers.
Still experimenting with how far the “failure-memory + callback orchestration” approach can go before needing embeddings/vector retrieval.
Yeah, that’s actually a really interesting way to look at it. I honestly wasn’t thinking in the CLI/orchestration direction while building this.
I was mainly trying to make something simpler and more cost-effective that could reduce repeated workflow errors and make automation outputs more reliable without using fragile timer-based flows.
The project is still pretty early and not fully tested yet — I’m still experimenting with the memory/recovery side and figuring out how far lightweight approaches can go before needing heavier systems like embeddings/vector retrieval.
Really appreciate you taking the time to check it out though, your comments genuinely gave me a few new ideas to think about.
Also yeah, one reason I went in this direction is that I’m a big fan of open-source, lightweight, and simple systems instead of very heavy stacks.
At one point I actually thought about adding another AI layer that would separately analyze errors and guide the main model, but I dropped the idea because it started becoming expensive, less lightweight, and unnecessarily complex for larger workflows/tasks.
I’m trying to keep the whole thing practical, simple to run, and easy for people to plug into existing workflows without needing a huge setup or emptying their pockets just to make automation more reliable.
That makes sense - adding a separate error-analysis layer would have created exactly the problem you were trying to solve. Keeping the failure context injection simple and direct is the right call at this stage. Good luck pushing it further.