How to optimize Pinecone vector embedding process for 7k+ items

Hello everyone!

I’m running a workflow on my laptop using n8n in Docker Desktop to vectorize a large dataset of over 7,000 items. The workflow starts fine but becomes extremely slow and eventually gets stuck or hangs after processing around 1,000 items. I’ve tried batching but couldn’t get it working properly; perhaps batching is still the right solution in the end.

My main goal is to process all 7k+ items efficiently without the workflow stalling. I’m looking for advice on how to optimize my workflow.

There is no specific error message; the n8n node simply becomes unresponsive and the item counters stop updating.

I’ve attached a snapshot of the problematic area of the workflow below.

Information on n8n setup

  • n8n version: 1.108.1
  • Database (default: SQLite): default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): not set explicitly, so the default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker Desktop
  • Operating system: Windows 11 Pro

Hello @azratuni,

It’s better to move all the “heavy” functionality into a sub-workflow, like this:

Thanks for replying @barn4k. I really appreciate it.

However, I don’t think turning the marked section into a sub-workflow would help. I was using the Pinecone free tier and the Gemini free tier, which both have fairly strict rate limits; that turned out to be the issue.

I did solve the problem, more or less, by adding **Loop Over Items** and **Wait** nodes to stay within the free-tier rate limits. It takes quite a long time, though.
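For anyone reading along, the Loop Over Items + Wait pattern can be sketched outside n8n as well. This is a minimal, hypothetical Python illustration of the same idea: split the items into fixed-size batches, call an embedding/upsert step per batch, and pause between batches so a free-tier rate limit is never exceeded. The `handle_batch` function here is a placeholder, not a real Pinecone or Mistral API call.

```python
import time

def process_in_batches(items, batch_size, wait_seconds, handle_batch):
    """Process items in fixed-size batches, pausing between batches.

    Mirrors n8n's Loop Over Items node (batching) plus a Wait node
    (the sleep), the pattern used to respect free-tier rate limits.
    """
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        results.extend(handle_batch(batch))
        # Wait between batches, but not after the final one.
        if start + batch_size < len(items):
            time.sleep(wait_seconds)
    return results

def handle_batch(batch):
    # Hypothetical stand-in for the real work, e.g.:
    #   vectors = mistral_embed(batch)   # assumed embedding helper
    #   pinecone_index.upsert(vectors)   # assumed upsert helper
    return [f"embedded:{item}" for item in batch]

# Example run: 10 items in batches of 4, with a 1-second pause.
processed = process_in_batches(
    [f"doc-{i}" for i in range(10)],
    batch_size=4,
    wait_seconds=1,
    handle_batch=handle_batch,
)
```

With real 7k+ datasets, the batch size and wait time would need tuning against the actual requests-per-minute quota of the tier in use.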

Let me know your thoughts on this.

PS: Gemini just didn’t work in the end, hence the switch to Mistral embeddings.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.