RAG bot: Pinecone vector store throttled?

Describe the problem/error/question

Hi all! I’m building a RAG chatbot that takes historical sales data and answers questions about it. I’ve built RAG chatbots successfully in the past, and I suspect the issue is the size of the file being uploaded or the chunking, but I haven’t been able to pinpoint it after some troubleshooting.

The main issue is with the actual uploading of the .csv file(s) to my Pinecone vector store. The incoming data comes via a .csv file from my Google Drive, hooked up to the vector store with the default data loader and recursive character text splitter (attached). The upload does start, but after iterating/upserting for a while (it has happened anywhere between roughly the 200th and 2,000th iteration on my text splitter node), it gets throttled hard: it goes from 200+ records in half a second to 5–10 seconds for a single record.

My .csv file is only about 5 MB after I trimmed it down (my n8n instance crashed on the original ~11 MB file), so I’m stumped as to what triggers this sudden throttling. I would consider breaking the files down manually and uploading them in smaller portions, but I have about 12 .csv files to upload, each containing at least 12,000 rows (when viewed in Excel) with 231 columns of data per row, so doing it manually like that isn’t feasible.

Any tips on how to stop the throttling during the upsert? Thank you!

What is the error message (if any)?

N/A. The file does upsert, but hoo boy, at some (seemingly random) spot it suddenly trips and takes forever for each record. Also, I’ve removed sensitive data from the workflow, so the example below will probably complain that I’m not using the right API key and that the vector database etc. doesn’t exist; those errors are just an artifact of the example. The workflow itself works fine, up to a point.

Please share your workflow

Share the output returned by the last node

N/A

Information on your n8n setup

  • n8n version: 1.112.5
  • Database (default: SQLite): Default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): Default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): n8n Cloud
  • Operating system: Windows 11

You are almost certainly hitting Pinecone’s rate limits. I would suggest using a Split In Batches (Loop Over Items) node to process a maximum of 100 items at a time, then a Wait node to pause 2–3 seconds before the next loop.
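For context, here is the same batch-and-pause idea expressed outside of n8n as a short Python sketch. It’s a minimal illustration, not the poster’s workflow: the API key, index name, embedding dimension, and batch/delay values are placeholder assumptions, and the client calls follow the current `pinecone` Python SDK.

```python
import time
from pinecone import Pinecone  # official Pinecone Python SDK

# Placeholder values -- substitute your own key, index, and vectors.
pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("sales-data")  # hypothetical index name

# `vectors` stands in for the (id, embedding, metadata) tuples your
# text splitter + embedding step would produce upstream.
# 1536 is a placeholder embedding dimension.
vectors = [
    (f"row-{i}", [0.0] * 1536, {"source": "sales.csv"})
    for i in range(10_000)
]

BATCH_SIZE = 100   # mirrors the Split In Batches setting
PAUSE_SECS = 2.5   # mirrors the Wait node

for start in range(0, len(vectors), BATCH_SIZE):
    batch = vectors[start : start + BATCH_SIZE]
    index.upsert(vectors=batch)  # one request per batch, not per record
    time.sleep(PAUSE_SECS)       # stay under the write rate limit
```

Batching like this also cuts the request count by roughly 100x compared with one upsert call per record, which is usually what trips the rate limit in the first place.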


Hey @ColtonJ504, hope all is good.

I don’t think RAG is the way to go for this task. RAG shines when you have fuzzy, unstructured text (“what does this policy say?”). Your data, by contrast, is a set of large, structured tables where users will probably ask for filters, joins, and aggregations. An embedding search over CSV rows won’t reliably compute “top 10”, sums, or multi-file joins, and you’d waste money embedding millions of cells.

What you want is LLM + tools, where the tools perform the “tasks” users typically need (filtering, aggregating, joining).

Vectorizing 30M+ cells is just not going to give you the results you’re after. What you’re looking for is a database + LLM pair, along the lines of the sketch below.
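To make that concrete, here is a minimal, hypothetical sketch of the database + LLM-tool idea: load the CSVs into SQLite once, then expose a single read-only SQL function the LLM agent can call for filters, sums, and joins. The file paths, table name, column names, and function shape are all illustrative assumptions, not anything from the original workflow.

```python
import glob
import sqlite3
import pandas as pd

DB_PATH = "sales.db"  # hypothetical local database file

# One-time load: each CSV is appended to a single `sales` table.
# (Assumes the 12 files share the same 231-column schema.)
conn = sqlite3.connect(DB_PATH)
for path in glob.glob("csvs/*.csv"):
    pd.read_csv(path).to_sql("sales", conn, if_exists="append", index=False)

def query_sales(sql: str) -> list[tuple]:
    """The 'tool' an LLM agent would call: run a read-only SQL query.

    The agent writes SQL such as
        SELECT region, SUM(revenue) FROM sales GROUP BY region LIMIT 10
    instead of hoping an embedding search stumbles onto the right rows.
    (`region` / `revenue` are placeholder column names.)
    """
    cur = conn.execute(sql)
    return cur.fetchall()
```

In n8n terms, that maps roughly to an AI Agent node given a database query tool instead of a vector-store retriever.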


Thank you both! Yes, it’s definitely the Pinecone rate limits. Since the data largely arrives as individual files with loads of data in each one, I’m not really able to split it into batches. I had assumed that since the .csv contains a mix of hard data and unstructured text, a vector database would be the way to go, but switching to a proper structured database is a closer fit. I appreciate the help!