Hi everyone,
I’m working on a RAG setup using n8n with PostgreSQL + PGVector and facing a few issues:
1. Performance & token usage
Even with a low Top-K (e.g. 10), responses take minutes and token usage is extremely large.
- How can I reduce token usage and ensure only relevant context is passed?
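To make the question concrete, this is roughly the kind of filtering I have in mind, as a minimal plain-Python sketch (not n8n or PGVector code). It assumes each retrieved chunk comes back with a distance score; `select_context`, `estimate_tokens`, `max_distance`, and `token_budget` are names made up for illustration, and the 4-characters-per-token estimate is only a rough assumption, not a real tokenizer.

```python
def estimate_tokens(text):
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def select_context(chunks, max_distance, token_budget):
    """Keep the closest chunks that pass a distance cutoff and fit a token budget.

    chunks: list of (text, distance) pairs, where a smaller distance means more similar.
    """
    selected = []
    used = 0
    # Walk the chunks from most to least similar.
    for text, distance in sorted(chunks, key=lambda c: c[1]):
        if distance > max_distance:
            break  # everything after this is even less similar
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            break  # stop before exceeding the budget
        selected.append(text)
        used += cost
    return selected


if __name__ == "__main__":
    chunks = [("a" * 40, 0.1), ("b" * 40, 0.2), ("c" * 40, 0.9)]
    print(len(select_context(chunks, max_distance=0.5, token_budget=100)))
```

The idea is that Top-K alone only caps the number of chunks, while a distance cutoff plus a budget would also cap how much weakly related text reaches the model.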
2. Top-K / Limit behavior
It seems like the Top-K setting just limits the number of rows (e.g. 10, 20, 30).
- If relevant data is outside this range, it won’t be found.
- It feels like the system is not ranking by similarity but just limiting results.
How exactly does Top-K work in PGVector? Should it always return the most similar results?
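For reference, this is the behavior I would expect from Top-K, written as a minimal plain-Python sketch rather than actual PGVector or n8n code: rank every row by distance to the query first, then keep the first K. As far as I understand, in pgvector this corresponds to an `ORDER BY embedding <=> query_vector LIMIT k` query, where `<=>` is the cosine distance operator. The tiny 2-dimensional vectors and the `doc_*` names are made up for illustration.

```python
import math


def cosine_distance(a, b):
    # 1 - cosine similarity; 0 means the vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Return the ids of the k rows closest to the query.

    rows: list of (id, embedding) pairs.
    """
    # Rank ALL rows by distance first, then cut to k.
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row[0] for row in ranked[:k]]


if __name__ == "__main__":
    rows = [
        ("doc_a", [1.0, 0.0]),
        ("doc_b", [0.0, 1.0]),
        ("doc_c", [0.7, 0.7]),
        ("doc_d", [-1.0, 0.0]),
    ]
    print(top_k([1.0, 0.1], rows, 2))
    print(top_k([0.0, 1.0], rows, 2))
```

If the limit were applied without the ordering step, the result would just be the first K rows in storage order, which matches what I seem to be observing.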
3. Same results for different queries
No matter what I ask, I often get the same Top-K results.
- What could cause identical results for different queries?
- Could this be a query embedding or workflow issue?
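One way to test the embedding hypothesis would be to capture the query vectors for two clearly different questions and compare them, as in the minimal sketch below. It is plain Python with made-up vectors; `looks_like_same_embedding` and the `tolerance` value are hypothetical names and settings for illustration. If the vectors come out (nearly) identical, the embedding step is probably receiving the same input every time, for example a fixed field instead of the user's question.

```python
def max_abs_difference(a, b):
    # Largest per-dimension gap between two vectors of equal length.
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return max(abs(x - y) for x, y in zip(a, b))


def looks_like_same_embedding(a, b, tolerance=1e-6):
    # True when the two vectors are equal up to a small numeric tolerance.
    return max_abs_difference(a, b) <= tolerance


if __name__ == "__main__":
    query_1 = [0.12, -0.40, 0.33]
    query_2 = [0.12, -0.40, 0.33]  # suspicious: identical to query_1
    query_3 = [-0.25, 0.10, 0.58]
    print(looks_like_same_embedding(query_1, query_2))
    print(looks_like_same_embedding(query_1, query_3))
```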
Thanks for any help.