Vector Database Optimization with n8n: Metadata, Text Splitting, & Embeddings

Nate Herk gives an excellent overview of different chunking strategies using n8n:


What would be great is a better understanding of how overlap affects recall, and which options to use for which use cases.

For example, let’s say I want to load a 300-page book so an agent can answer questions about it. What chunk size and overlap work best? Can the agent give responses that take into account the different angles presented in the book?
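To make the chunk size and overlap question concrete, here is a minimal sketch of a fixed-size character splitter with overlap. It is not how n8n's splitter nodes are implemented; `split_with_overlap` is a hypothetical helper, and the figure of roughly 1,800 characters per page is only an assumption used to approximate a 300-page book. It shows one thing that is certain: more overlap means more chunks to embed and store for the same text.

```python
def split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into character chunks of at most `chunk_size`,
    where each chunk repeats the last `overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reached the end of the text
    return chunks


# Assumption: ~1,800 characters per page, so a 300-page book is ~540,000 characters.
book = "x" * (300 * 1800)

for overlap in (0, 100, 200):
    chunks = split_with_overlap(book, chunk_size=1000, overlap=overlap)
    print(f"overlap={overlap}: {len(chunks)} chunks")
```

The trade-off this illustrates: overlap keeps a sentence that straddles a chunk boundary intact in at least one chunk, at the cost of extra embeddings and some duplicated text in retrieval results. Which setting gives the best recall for a given book is something to measure on your own questions rather than read off a formula.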

What about 20 books? Can the agent cross-reference them as if it had absorbed the knowledge of all 20?

Another example is technical specs and product info. Can we load thousands of SKUs and have the agent give accurate information on product specs? Thanks for any insight, this is really powerful stuff.
