Hey everyone,
After months of experimenting with AI agents in n8n, the hardest problem I kept hitting was memory — specifically making agents feel continuous across sessions without destroying performance or bloating every prompt.
Wanted to share the pattern I finally landed on that actually works in production.
The Problem with Single-Layer Memory
Most n8n AI agent tutorials use one of three approaches:

- The built-in memory buffer window (dies when the session ends)
- A Google Sheet or Airtable as memory (slow, no semantic search, doesn't scale)
- Dumping everything into context (expensive, hits token limits fast)
None of these work well once your agent has weeks of history and needs to recall something specific from 3 conversations ago.
The Two-Layer Solution
Layer 1 — Postgres (Short-Term)
Use n8n’s native memoryPostgresChat node. Set a custom session key (I use username so each user gets their own history). Context window of 15 messages.
This loads instantly on every session. The agent always has the last 15 turns available without any extra calls. It handles “what did I just ask you?” perfectly.
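Under the hood, the memory node reads and writes a chat-history table in Postgres, so you can inspect exactly what the agent sees each turn. Roughly, it loads something like this (table and column names here assume n8n's default `n8n_chat_histories` layout — adjust if yours differs):

```sql
-- The last 15 turns for one user's session.
-- Assumes n8n's default schema: id (serial), session_id, message (jsonb).
select message
from n8n_chat_histories
where session_id = 'username'  -- the custom session key
order by id desc
limit 15;
```

The node fetches the newest rows and replays them oldest-first into the prompt, which is why "what did I just ask you?" always works.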
Layer 2 — Supabase pgvector (Long-Term)
This is where it gets interesting. You need to:

- Set up a `documents` table in Supabase with the pgvector extension enabled
- Create a `match_documents` similarity search function
- Build a scheduled n8n pipeline that runs nightly
The nightly pipeline works like this:
Schedule trigger
→ Get last_vector_id from your init table
→ Query Postgres: SELECT * FROM n8n_chat_histories WHERE id > last_vector_id
→ Aggregate all messages into one block
→ LLM summarizer (I use Claude Haiku — cheap and fast)
→ Structured output parser (extracts summary field)
→ OpenAI embeddings node
→ Insert into Supabase vector store
→ Update last_vector_id so next run only processes new messages
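The database-facing steps above boil down to three queries. This is a sketch, not the exact node config — `vector_init` and its `last_vector_id` column are placeholder names for whatever your init table uses, and `:summary` / `:embedding` stand in for values produced by the upstream summarizer and embeddings nodes:

```sql
-- Steps 2–3: fetch only the messages added since the last run.
select id, session_id, message
from n8n_chat_histories
where id > (select last_vector_id from vector_init)
order by id;

-- Step 7: after summarizing and embedding, store the result.
insert into documents (content, metadata, embedding)
values (:summary, '{"source": "nightly_rollup"}', :embedding);

-- Step 8: advance the pointer so the next run starts where this one stopped.
update vector_init
set last_vector_id = (select max(id) from n8n_chat_histories);
```

Keeping the pointer update as the last step means a failed run simply reprocesses the same batch next night instead of silently skipping messages.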
Then on the agent side, expose the Supabase vector store as a retrieval tool — not a memory loader. The agent calls it when it needs to recall something specific. This way you’re not stuffing old embeddings into every single prompt unnecessarily.
Why This Works
- Short-term (Postgres): handles in-session continuity; fast; no API calls
- Long-term (Supabase): handles cross-session recall; semantic search by meaning, not keywords
The agent can answer “what did we decide about X three weeks ago?” because it searches by meaning across all past conversations — not just keyword matching in a database.
The Supabase SQL you need:
```sql
create extension if not exists vector;

create table documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(1536)
);

create function match_documents (
  query_embedding vector(1536),
  match_count int default null,
  filter jsonb default '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    -- Qualify with the table name: the unqualified columns are ambiguous
    -- with the function's output columns and plpgsql will error otherwise.
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```
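When the agent's retrieval tool fires, Supabase ends up running something equivalent to this (`:query_embedding` stands in for the vector produced by the embeddings node — it must come from the same model family used at insert time, 1536 dimensions here):

```sql
-- The 5 stored summaries most similar to the query, best match first.
select content, similarity
from match_documents(
  query_embedding := :query_embedding,
  match_count     := 5,
  filter          := '{}'::jsonb   -- pass e.g. '{"user": "alice"}' to scope by metadata
);
```

The `<=>` operator is pgvector's cosine distance, so `1 - distance` gives a similarity score where higher means closer in meaning.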
One gotcha:
The last_vector_id tracking is critical. If your init table lookup fails and returns null, your pipeline will try to embed the entire history every single run. Always set a fallback to 0 and add error handling around the SQL query node.
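One way to make that fallback part of the query itself rather than relying on workflow-level error handling (again, `vector_init` is a placeholder for your init table's actual name):

```sql
-- coalesce guarantees a starting point of 0 on the very first run
-- or if the init row is missing, instead of a null that breaks the
-- id > last_vector_id comparison downstream.
select coalesce(
  (select last_vector_id from vector_init),
  0
) as last_vector_id;
```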
Hope this helps someone — spent a long time getting this right.
I used this pattern inside a larger workflow I built (AI agent + lead gen + outreach + Gmail + Docs). If anyone's curious about the full system, I packaged it up and it's available here: https://n8nclaw.vercel.app