Feature Request: Add configurable memory persistence policies so AI Agents store only relevant long-term context instead of every intermediate tool call and execution message, reducing token usage and context bloat.

Feature Request: Configurable Memory Policies for AI Agent (Temporary Execution Memory vs Persistent Memory)

Hi n8n team,

While building a Data Analyst AI Agent with the native AI Agent node, I noticed a limitation in how memory is currently managed. I believe addressing this would significantly reduce token usage, improve scalability, and give users much finer control over conversational memory.


Current Behavior

Suppose I ask the agent:

“Inspect the database schema.”

The agent performs multiple reasoning steps:

User
   │
   ▼
LLM
   │
   ▼
SQL Tool (List tables)
   │
   ▼
LLM
   │
   ▼
SQL Tool (Inspect customers)
   │
   ▼
LLM
   │
   ▼
SQL Tool (Inspect orders)
   │
   ▼
...
   │
   ▼
Final Answer

This is expected and works well.

However, the Memory node stores the entire execution history, including every intermediate tool interaction.

Example of what gets stored:

System:
You are an expert PostgreSQL Data Analyst.

User:
Inspect schema.

Assistant:
Calling SQL Tool

Tool:
SELECT ...

Assistant:
Calling SQL Tool

Tool:
SELECT ...

Assistant:
Calling SQL Tool

Tool:
SELECT ...

Assistant:
Calling SQL Tool

Tool:
...

Assistant:
Final response...


The Problem

This execution history is then reused for future conversations.

As a result:

  • Every SQL tool call is stored.

  • Every tool output is stored.

  • Intermediate execution details become permanent conversation history.

  • Context size grows rapidly.

  • Token usage increases unnecessarily.

  • Long-running agents become increasingly inefficient.

In my case, a single user prompt resulted in approximately 7 LLM invocations, and the entire execution chain was written into memory.


Proposed Solution

Separate Execution Memory from Persistent Memory.

Execution Memory (Temporary)

During one agent execution:

User
      │
      ▼
Temporary Execution Memory
      │
      ▼
LLM ↔ Tool ↔ LLM ↔ Tool ↔ LLM

This memory exists only while the workflow is running.

It contains:

  • System prompt

  • User message

  • Tool calls

  • Tool outputs

  • Intermediate reasoning

  • Final answer

Once execution finishes, this temporary memory can be discarded.


Persistent Memory

Instead of automatically saving the entire execution, n8n should apply a configurable Memory Policy before writing anything permanently.

For example:

Temporary Execution Memory
            │
            ▼
     Memory Policy
            │
            ▼
Persistent Memory


Suggested Memory Policies

1. Store Everything (Current Behavior)

For users who prefer the existing behavior.


2. Store Only User + Final Assistant Messages

Store:

User:
Inspect schema.

Assistant:
The database contains four tables...

Ignore all intermediate tool interactions.

No additional LLM call required.


3. Ignore Tool Calls

Automatically remove:

  • assistant tool calls

  • tool messages

Keep only the conversational exchange.

No additional LLM call required.


4. Ignore Tool Outputs

Useful when tools return large payloads (SQL results, RAG documents, API responses).

No additional LLM call required.


5. AI Summary (Optional)

Some users may prefer long-term summarized memory.

Example:

Conversation Summary

- Database schema explored.
- Four tables identified.
- User is analyzing ecommerce sales.

This would require one additional LLM call and should be optional.


6. Custom Memory Processor Node

This would be the most flexible solution.

Example:

Temporary Execution Memory
            │
            ▼
Memory Processor Node
            │
            ▼
Persistent Memory

Users could implement:

  • JavaScript filtering

  • Python processing

  • AI summarization

  • Regex extraction

  • Custom business logic

before writing to memory.

This would make memory behavior fully customizable without changing the AI Agent itself.


Why This Matters

Many real-world agents perform dozens of tool calls:

  • SQL

  • RAG

  • MCP servers

  • REST APIs

  • File processing

  • Code execution

These intermediate execution steps are necessary to solve the current task, but they are rarely useful as long-term conversational memory.

Separating temporary execution memory from persistent conversational memory would:

  • Reduce token consumption

  • Improve performance

  • Keep memory relevant

  • Scale better for complex agents

  • Give developers full control over memory retention


Questions

  1. Is the current behavior (persisting the entire execution history) intentional?

  2. Are there any plans to support configurable memory policies?

  3. Would the maintainers consider separating temporary execution memory from persistent memory as described above?

I believe this would be a valuable enhancement for anyone building production AI agents with extensive tool usage.

Thanks for considering this feature request!