I am trying to build a RAG knowledge base from the files in my local file system. There are multiple folders and subfolders containing different file types, e.g. PDFs, Excel workbooks, PowerPoint files, etc. I would like to recursively read all the files under a given folder path. Once n8n has access to these files, I would like to tokenize these documents for my embedding model. Is there a template that can recursively read all the files and do the file splitting?
I am running n8n locally on my laptop.
Goals:
a) Read the files recursively from subfolders
b) Split the documents for the embedding model
c) Apply metadata to each chunk
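
To make the three goals concrete, here is a minimal sketch in plain Python, outside n8n, of the logic I have in mind. It is not an n8n template. The function names (iter_files, split_text, build_chunks, read_plain_text), the extension list, the character-based chunk sizes, and the metadata fields are all my own assumptions for illustration. Real PDF, Excel, and PowerPoint files would need a dedicated text extractor passed in place of read_plain_text.

```python
from pathlib import Path
from typing import Callable, Iterator

# Assumed extension list; adjust to the file types actually present.
EXTENSIONS = {".pdf", ".xlsx", ".pptx", ".txt"}


def iter_files(root: str, extensions: set[str] = EXTENSIONS) -> Iterator[Path]:
    """Goal a): yield every matching file under root, including subfolders."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            yield path


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Goal b): split text into fixed-size character chunks with overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def read_plain_text(path: Path) -> str:
    """Placeholder extractor: only meaningful for plain-text files."""
    return path.read_text(encoding="utf-8", errors="ignore")


def build_chunks(
    root: str,
    extract_text: Callable[[Path], str],
    chunk_size: int = 1000,
    overlap: int = 200,
) -> Iterator[dict]:
    """Goal c): attach metadata to each chunk of each file."""
    for path in iter_files(root):
        text = extract_text(path)
        for index, chunk in enumerate(split_text(text, chunk_size, overlap)):
            yield {
                "text": chunk,
                "metadata": {
                    "source": str(path),
                    "file_name": path.name,
                    "file_type": path.suffix.lower(),
                    "chunk_index": index,
                },
            }
```

Calling build_chunks("/path/to/folder", read_plain_text) would yield one dictionary per chunk, ready to be passed to an embedding model. Splitting by characters is only a stand-in here; a token-based splitter matched to the embedding model would probably be a better fit.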