Describe the problem/error/question
Hi all! I’m building a RAG chatbot that answers questions about historical sales data. I’ve built RAG chatbots successfully before, and I suspect the issue here is either the size of the file being uploaded or the chunking, but I haven’t been able to pinpoint it after some troubleshooting.

The main issue is with uploading the .csv files to my Pinecone vector store. The data comes in as a .csv file from my Google Drive, which is hooked up to the vector store with the default data loader and recursive text splitter (attached). The upload does start, but after upserting for a while (usually somewhere between the 200th and 2,000th iteration on my text splitter node) it gets throttled hard: it goes from 200+ records in half a second to 5-10 seconds for a single record.

The .csv file is only about 5 MB after I trimmed it down, because my n8n instance crashed on the original ~11 MB file, so I’m stumped as to how to prevent this sudden throttling. I would consider splitting the files manually and uploading them in smaller portions, but I have about 12 .csv files to upload, each with at least 12,000 rows (as displayed in Excel) and 231 columns per row, so doing that by hand isn’t feasible.

Any tips on how to stop the throttling during the upsert? Thank you!
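To be clear about what I mean by "smaller portions": the split itself could be scripted outside n8n rather than done by hand, roughly like the sketch below. This is only a rough illustration, not something I have running. The function name `split_csv`, the 1,000-row part size, and the output file naming are all assumptions I made up for the example, and I have no evidence yet that smaller files would actually avoid the throttling.

```python
import csv
from pathlib import Path
from typing import List


def split_csv(src: Path, out_dir: Path, rows_per_part: int = 1000) -> List[Path]:
    """Split `src` into numbered part files, repeating the header row in each.

    `rows_per_part` is an arbitrary illustrative default, not a known limit.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: List[Path] = []
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return parts  # empty input file: nothing to split
        out = None
        writer = None
        try:
            for i, row in enumerate(reader):
                # Start a new part file every `rows_per_part` data rows.
                if i % rows_per_part == 0:
                    if out is not None:
                        out.close()
                    part = out_dir / f"{src.stem}_part{len(parts) + 1:03d}.csv"
                    out = part.open("w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(part)
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return parts
```

Calling something like `split_csv(Path("sales.csv"), Path("parts"))` (hypothetical file name) would turn a 12,000-row file into 12 part files of 1,000 rows each, which could then be dropped into the Google Drive folder one at a time.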
What is the error message (if any)?
N/A - The file does upsert, but hoo boy, at some (seemingly random) point it suddenly trips and each record takes forever. Also, I’ve removed sensitive data from the workflow, so the example below will probably say that I’m not using the right API key and that the vector database doesn’t exist. Those errors only apply to the example; the real workflow works fine, up to a point.
Please share your workflow
Share the output returned by the last node
N/A
Information on your n8n setup
- n8n version: 1.112.5
- Database (default: SQLite): Default
- n8n EXECUTIONS_PROCESS setting (default: own, main): Default
- Running n8n via (Docker, npm, n8n cloud, desktop app): n8n Cloud
- Operating system: Windows 11