Loop Over Items can't handle multiple incoming lists?

I really have tried, but whenever I run into a situation where I need to use a Loop Over Items node, I always fail to achieve the desired behavior. Can anyone help me understand how to solve the following problem?

I am trying to make a workflow capable of downloading files via FTP from a NAS server. Given the list of files (their metadata) to fetch, I want it to fetch the contents and append them to each file’s metadata.

All the fetching works great and all file metadata is properly cleaned. The problem arises when I try to iterate over multiple incoming lists of files. Because of how the file lists are generated, they must arrive as separate batches. I am also forced to use the Loop Over Items node because I want to process each file’s data later, and I must do it one by one.

The problem is this: when multiple lists of files arrive at the Loop Over Items node, it only loops over the first list and pipes the rest directly through the ‘done’ branch. Yes, I have tried the ‘reset’ option, but that just makes it process only the first item of each list. And yes, I have tried running the complete workflow (not just node by node). So I have resorted to the last resort I have: posting this. I am helpless.

This is an illustrative example:


As you can see, only one item is processed (the first incoming list only has one item).

Inspecting the node, we can see that only the first list is processed:

And so on with the rest of runs.

It looks as if, after the first run, the node marks itself as ‘done’ and then ignores the rest of the inputs. Is this normal behavior? How do I solve it? I just want to get each file as one individual item across all runs.


Hi @mriioos, is it possible to share your workflow in a code block? My wild guess is that you are not completing the loop by linking the last node inside the loop back to the Loop node.

For example you need to close the loop:

Have you been able to fix it?

Hi there!

Sorry for not responding earlier to this thread. With Christmas and New Year I have been AFK.

You guessed right, I haven’t closed the loop, because when I do so, it enters an infinite loop and crashes the server.

This is my entire workflow.

Hi @mriioos,

In that case you’re using the Loop node completely wrong, and your logic seems flawed. Can you explain your use case in more detail so we can help you fix the workflow? The Loop node should never loop infinitely if it is used correctly, as it takes in a list of items and can only loop so many times. If it’s looping infinitely, then there is a problem with your logic.

Not completing the loop is the reason you’re only processing the first item.

Oh, okay.

My use case is building a RAG system capable of indexing multiple data sources.
In this case, I am indexing files from a NAS server by recursively crawling its directories. This process produces batches (what I call a ‘batch’ is a list of items in n8n) containing each file’s metadata.

The problem appears when I try to read the contents of those files in order to generate vector embeddings. If I attempt to load the contents of every file in a batch at once, memory usage explodes.

That is why I tried using a loop: to process one item at a time, ensuring that only a single file’s contents are held in memory at any given moment.

I really have tried every configuration I could think of for the loop, but I can’t wrap my head around what I am doing wrong, as the logic seems pretty simple to me.
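To make it concrete, here is a plain-JavaScript sketch (illustrative only, not actual n8n code; function names are mine) of the behavior I am after, where only a single file’s contents exist in memory per iteration:

```javascript
// Illustrative sketch, not n8n code: fetch and handle files strictly one
// at a time so only a single file's contents are held in memory at once.
async function processOneByOne(files, fetchContents, handle) {
  const results = [];
  for (const file of files) {
    const contents = await fetchContents(file); // only one file in memory here
    results.push(await handle({ ...file, contents }));
    // `contents` becomes collectable before the next iteration starts
  }
  return results;
}
```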

Also, sorry if my English isn’t Englishing, it is not my main language :slight_smile:.

Ok great! In that case you were on the right track. Here is an example of a workflow I use for pushing resumes from a Google Storage bucket into a Supabase-based vector store for RAG purposes.

Essentially I get employee records from Firestore, each of which contains a document download link (the public storage bucket link). I then loop over each record and download the file via an HTTP GET request (in your case, this is where you pull one document at a time from your source). Once the file is downloaded, I build my metadata and insert the document content into my vector store; rinse and repeat. It is important that you close the loop and set your batch size to 1, otherwise the workflow will attempt to download multiple files at once, which will cause memory problems depending on the size of the documents.
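Sketched in plain JavaScript (illustrative only, not n8n internals), a closed loop with batch size 1 behaves roughly like this:

```javascript
// Rough simulation of a closed Loop Over Items with batch size 1:
// each batch goes out on the "loop" branch, is processed, and feeds back;
// once the queue is empty, all results leave via the "done" branch.
function loopOverItems(items, processOne, batchSize = 1) {
  const done = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    done.push(...batch.map(processOne)); // the closed-back "loop" branch
  }
  return done; // the "done" branch
}
```

With batch size 1, each document is downloaded and embedded on its own pass, which is what keeps memory usage flat.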

See example below:

Well, I was wrong about the infinite loop (after the many tries I have given it, I may have mixed up my memories). I am not sure if this is a UI feedback design problem, but as you can see, it only shows that one item has been processed in the loop (the first one), while 205 items have been output through the ‘done’ branch (where only the first item has its contents). This makes me think that either the Loop node is marking itself as ‘done’ after the first item and ignoring any other incoming items, or the UI isn’t giving me clear hints about what is happening, or I am missing something. Maybe the ‘Execute Workflow’ button doesn’t allow for more than one loop?

Thanks for the fast responses by the way :).

Please share the above workflow in a code block and pin the data just before the loop so I can see what is happening. What batch size did you set on the Loop node? It should be 1. I suspect you are either doing something weird in the Code node or you set a very high batch size, causing the loop to only run one batch.

Here is the pinned execution.

However, when I click the ‘pin’ option it only pins one item; I am not sure how to pin more.

Also, the batch size is set to 1 and the reset option is turned off.

Make sure you run a test where the node has the 60-odd items and then pin. First unpin the current one.

I had to disable a few nodes which I don’t have, like the NAS node etc. It loops 7 times if I have 7 items in the pinned node. I’m wondering if one of the nodes errors out on your end?

Are you running the full execution or did you click the play button on a specific node?

I am running the full execution, but I don’t see any error appearing. I have also noticed that, on each run received by the ‘No operation, do nothing’ node on the ‘done’ branch of the Loop node, the list of items accumulates the items of the previous execution. That is why 205 items appear in the ‘done’ branch when only 65 are being input. I am not sure if this helps, but I felt like I should share it.

If you can pin the data you’re testing with, it will help. I’m wondering if there is an issue in one of the nodes I disabled.

Maybe try disabling the same nodes so we can try to trace the problem. Disable the highlighted nodes below.

Okay, I have now tried executing it with the same nodes deactivated, and with all nodes deactivated, but got the same result:

Also, I have tried pinning the data, but I am not sure why on download it only pins one item. I am going to paste here a few of the items that I use as test input into the Loop node. As you will see, each item represents the metadata of a file retrieved from a source (in this case, the NAS server). This metadata is then used to determine whether the file needs to be updated in the vector database or not. The metadata may also include previously stored data about the file (from another database) if it has been vectorized before, so that new metadata can be compared with old metadata. The Loop node prevents multiple files being loaded into memory at once.

Batch 1:

```json
[
  {
    "type": "-",
    "source": "nas",
    "name": "Neighborhood War Games Simulation.pdf",
    "new_size": 182515,
    "new_updated_at": 1764892800,
    "path": "/miguel/Neighborhood War Games Simulation.pdf",
    "path_hash": "2088cd0a7d88cd0a"
  }
]
```

Batch 2:

```json
[
  {
    "type": "-",
    "source": "nas",
    "name": "CV_clean_old.mhtml",
    "new_size": 507466,
    "new_updated_at": 1758153600,
    "path": "/miguel/CV/CV_clean_old.mhtml",
    "path_hash": "cc2e541de92e541d",
    "file_id": "a683f5b2-1a3e-484a-8396-fbc201adeded",
    "size": 507466,
    "hash": "cc2e541de92e541d",
    "updated_at": 1758230000,
    "deleted_at": null,
    "synced_at": 1758230000
  },
  {
    "type": "-",
    "source": "nas",
    "name": "Miguel Ríos Marcos - English.pdf",
    "new_size": 2847152,
    "new_updated_at": 1764028800,
    "path": "/miguel/CV/Miguel Ríos Marcos - English.pdf",
    "path_hash": "e296b326b796b326"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "Miguel Ríos Marcos - English_compressed.pdf",
    "new_size": 1726519,
    "new_updated_at": 1764028800,
    "path": "/miguel/CV/Miguel Ríos Marcos - English_compressed.pdf",
    "path_hash": "5e9970cea59970ce"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "Miguel Ríos Marcos.pdf",
    "new_size": 3085467,
    "new_updated_at": 1764028800,
    "path": "/miguel/CV/Miguel Ríos Marcos.pdf",
    "path_hash": "a04db21d8d4db21d"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "vectors.ipynb",
    "new_size": 13453,
    "new_updated_at": 1767730860,
    "path": "/miguel/QComputing/vectors.ipynb",
    "path_hash": "22d4c3e0a1d4c3e0"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "MAIS1.zip",
    "new_size": 282184479,
    "new_updated_at": 1761782400,
    "path": "/miguel/Uni Archive/MAIS1.zip",
    "path_hash": "31a52e294ea52e29"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "MAIS2.zip",
    "new_size": 1430939736,
    "new_updated_at": 1761782400,
    "path": "/miguel/Uni Archive/MAIS2.zip",
    "path_hash": "63fcadbc80fcadbc"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "MAIS3.zip",
    "new_size": 2227660220,
    "new_updated_at": 1761782400,
    "path": "/miguel/Uni Archive/MAIS3.zip",
    "path_hash": "2c69e8af4969e8af"
  },
  {
    "type": "-",
    "source": "nas",
    "name": "U-Stickers.zip",
    "new_size": 1936,
    "new_updated_at": 1761782400,
    "path": "/miguel/Uni Archive/U-Stickers.zip",
    "path_hash": "5f1dbf40761dbf40"
  }
]
```
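For reference, the metadata comparison works roughly like the following plain-JavaScript sketch (a simplified assumption on my part, not the exact rule; field names are taken from the items above):

```javascript
// Simplified sketch (the real check may differ): a file needs
// (re)vectorizing when it has never been synced (no file_id) or its
// newly crawled metadata differs from the stored values.
function needsUpdate(meta) {
  if (!meta.file_id) return true; // never vectorized before
  return meta.new_size !== meta.size || meta.new_updated_at > meta.updated_at;
}
```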

Definitely something weird happening with your flow. On my side it works 100% when I use your data.

Remind me again what version of n8n you’re using?

My next suggestion is to copy the nodes into a new workflow and test again. Not sure if something is corrupted in that workflow.

After that, you can also try deleting only the Loop node and then re-adding it, to make sure the node itself is recreated.

However, with the sample data you gave, what do you mean by two batches?

Your first batch only has one item.

When I say two batches, I mean two lists of items. You know how, in n8n, items are passed as a list to the next node. When I crawl the NAS directories, each directory yields the metadata of all its files, creating a list of items. This happens recursively, producing another list for each directory (that is what I call a ‘batch of items’, because calling it a list feels confusing to me, as the items are supposedly independent).

I’m on version 2.1.2. Also, I have noticed that for 9 input items your loop executes 10 times, but when I input 65 items, it only executes 7 times. I think that is because the input was 6 distinct lists of items; I don’t know if this is useful.

This is what I mean by ‘batch’ or ‘distinct input lists of items’:

I am not sure if this helps.

Also, I have tried copy-pasting the nodes into a new workflow and got the same result. I also tried replacing the Loop node with a fresh one, and still got the same result. Maybe I should try to isolate the problem and see if I can replicate this behaviour in a simpler workflow. It works with one list, but not with many. I don’t know if this is intended design, but if so, I don’t understand how I could do this. Maybe by making a workflow that is capable of accumulating batches of items? Like the inverse of Split Into Batches (the Loop node), or something. I’ll try.
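As a sketch of that ‘inverse of split into batches’ idea (plain JavaScript, not n8n code; in n8n this might live in a Code node that gathers all upstream runs before the loop):

```javascript
// Flatten several independent lists ("batches") into one list, so a
// downstream Loop Over Items node sees a single run containing every item.
function mergeBatches(batches) {
  return batches.flat();
}
```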

@mriioos is this the workflow?

The images, yes; the JSON is some of my test items.