How to build a recursive Google Drive folder search and collect file IDs outside the loop?

Hi everyone,

I’m trying to build a workflow in n8n to prepare data for a RAG pipeline with Pinecone. The workflow needs to fetch HTML documents stored in Google Drive.

For one software package, all HTML files are in a single folder — that part works fine. But for another package, the documentation is spread across dozens of nested subfolders, and here’s where I’m stuck.

What I need:

  • Recursively go through a root folder and all its subfolders.

  • Collect the IDs of all HTML files I encounter.

  • Store these IDs in a list (or some storage outside of the loop), so that I can later pass them to a Google Drive → Download node.

The issues I’m facing:

  • I can iterate through the root folder and see the immediate subfolders, but I can’t manage to go deeper into nested subfolders.

  • Even when I try to process HTML files inside a loop, their IDs don’t get stored or aggregated outside the loop.

Has anyone built something similar? What’s the best way to configure the loops and data aggregation in n8n so that I can:

  1. Recursively search through all subfolders.

  2. Collect and output all HTML file IDs in one place for further processing.

Thanks in advance!

https://www.npmjs.com/package/n8n-nodes-google-drive-tree

Hey @glambo_jambo, hope all is good.

Below is something you could do to get your files. Sadly, there is no native way to get all files from nested folders at once, so you need to iterate over the folders.

In the Edit Fields node we set the root folder ID, and then we recurse through the subfolders and collect the files.
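
For anyone who prefers to see the idea as code, the iterate-the-folders approach can be sketched roughly like this. It is a minimal Python sketch under stated assumptions, not the workflow itself: `list_children` is a hypothetical stand-in for whatever performs the Google Drive listing, and `FAKE_DRIVE` is invented sample data. The query string and the folder MIME type follow the Drive API v3 conventions.

```python
from collections import deque

FOLDER_MIME = "application/vnd.google-apps.folder"  # Drive's MIME type for folders


def children_query(folder_id):
    """Build a Drive API v3 `q` filter matching the direct children of a folder."""
    return f"'{folder_id}' in parents and trashed = false"


def walk_folders(root_id, list_children):
    """Visit the root and every nested subfolder, one folder per loop pass.

    `list_children(folder_id)` is a hypothetical callable returning dicts
    with `id`, `name` and `mimeType` keys for one folder's direct children.
    """
    pending = deque([root_id])  # folders still waiting to be listed
    files = []                  # lives outside the loop, so nothing is lost
    while pending:
        folder_id = pending.popleft()
        for child in list_children(folder_id):
            if child["mimeType"] == FOLDER_MIME:
                pending.append(child["id"])  # list this subfolder on a later pass
            else:
                files.append(child)
    return files


# Invented sample data standing in for real Drive responses.
FAKE_DRIVE = {
    "root": [
        {"id": "f1", "name": "guide", "mimeType": FOLDER_MIME},
        {"id": "a", "name": "index.html", "mimeType": "text/html"},
    ],
    "f1": [
        {"id": "f2", "name": "api", "mimeType": FOLDER_MIME},
        {"id": "b", "name": "intro.html", "mimeType": "text/html"},
    ],
    "f2": [{"id": "c", "name": "ref.html", "mimeType": "text/html"}],
}

found = walk_folders("root", lambda fid: FAKE_DRIVE.get(fid, []))
print([f["id"] for f in found])
print(children_query("root"))
```

The queue plays the role of the loop in the workflow: every subfolder that turns up is put back in line to be listed, so the walk only stops when no unlisted folders remain.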

Holy Sh*t. Game-changer!

Yes, this is a real game changer. I'm going to fork this and add "By List" and "By URL" resource selectors for the Google Drive selection.

I'm also adding a mode selector on top:

Output Mode:
-Tree (Folder/File Structure)
-File List (Simple JSON List of Folders)

Expect a PR in the next week!

Excited :)))

I encountered the same issues last week and found this package. Kudos! Google nodes are notoriously underpowered in n8n.

I ended up forking your node and building on it over a couple of Red Bulls this week to add a bit more functionality.

Operations:
-Output by File Tree (kept original structure)
-Output by File List (Array of Objects)
-Output by File List (Individual Items)
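
As a rough illustration of how the three shapes relate, here is a small Python sketch. The field names (`id`, `name`, `children`, `path`, `files`) and the sample tree are assumptions made up for this example and are not the node's actual output schema; the only n8n convention relied on is that items carry their data under a `json` key.

```python
def flatten(node, path=""):
    """Turn a nested folder tree into a flat list of file records."""
    here = f"{path}/{node['name']}"
    if "children" not in node:      # a file: emit one record
        return [{"id": node["id"], "name": node["name"], "path": here}]
    records = []
    for child in node["children"]:  # a folder: recurse into its children
        records.extend(flatten(child, here))
    return records


# Invented tree; the field names are assumptions for illustration only.
tree = {
    "id": "root", "name": "docs", "children": [
        {"id": "a", "name": "index.html"},
        {"id": "f1", "name": "api", "children": [
            {"id": "b", "name": "ref.html"},
        ]},
    ],
}

file_list = flatten(tree)                                # flat array of objects
single_item = [{"json": {"files": file_list}}]           # whole array in one item
individual_items = [{"json": rec} for rec in file_list]  # one item per file
print(file_list)
```

The individual-items shape is usually the convenient one when the next node should run once per file, while the single array is easier to hand to one aggregating step.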

Added Resource Locators for Google Drive selection by File List or ID
Added Filters, added returning metadata properties, and

Just published 1.0.2, happy to send a PR

n8n-nodes-google-drive-recursive

The beta releases track my personal build for my own workflows, but people are welcome to try them and send feedback.

You can do this in n8n by handling the recursion in a Code node (the Function or Function Item node in older n8n versions) instead of trying to manage it with Loop nodes alone. The idea is to call the Google Drive API's `files.list` endpoint inside a function, check each item, push HTML file IDs into an array, and, for every folder found, call the same function again until there are no more subfolders.

Once the recursive function finishes, return the full array so the next node (your Download node) gets a single list of all collected IDs. This avoids the problem of data getting trapped inside loops and lets you walk the entire folder tree cleanly.
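
The recursion described above could look something like the following. This is a minimal Python sketch under stated assumptions rather than a drop-in node: `list_children` is a hypothetical callable that wraps the actual Drive listing (authentication and paging are left out), `SAMPLE` is invented test data, and `fileId` is an assumed field name for the output items.

```python
FOLDER_MIME = "application/vnd.google-apps.folder"  # Drive's MIME type for folders


def is_html(entry):
    """Treat an entry as HTML if its MIME type or file extension says so."""
    name = entry.get("name", "").lower()
    return entry.get("mimeType") == "text/html" or name.endswith((".html", ".htm"))


def collect_html_ids(folder_id, list_children, found=None):
    """Recursively gather HTML file IDs under `folder_id`.

    `list_children` is a hypothetical callable (folder ID -> list of dicts
    with `id`, `name`, `mimeType`) wrapping the real Drive listing.
    """
    if found is None:
        found = []  # one shared list for the whole traversal
    for entry in list_children(folder_id):
        if entry.get("mimeType") == FOLDER_MIME:
            collect_html_ids(entry["id"], list_children, found)  # descend
        elif is_html(entry):
            found.append(entry["id"])
    return found


# Invented sample tree standing in for real Drive responses.
SAMPLE = {
    "root": [
        {"id": "d1", "name": "docs", "mimeType": FOLDER_MIME},
        {"id": "t1", "name": "readme.txt", "mimeType": "text/plain"},
        {"id": "h1", "name": "index.html", "mimeType": "text/html"},
    ],
    "d1": [
        {"id": "d2", "name": "deep", "mimeType": FOLDER_MIME},
        {"id": "h2", "name": "setup.htm", "mimeType": "application/octet-stream"},
    ],
    "d2": [{"id": "h3", "name": "deep.html", "mimeType": "text/html"}],
}

html_ids = collect_html_ids("root", lambda fid: SAMPLE.get(fid, []))
# n8n passes data between nodes as items with a `json` key;
# `fileId` is an assumed field name for the Download node to read.
items = [{"json": {"fileId": file_id}} for file_id in html_ids]
print(items)
```

Because `found` is created once and passed down through every call, the IDs end up in a single list no matter how deep the folder tree goes. A real `list_children` would also need to follow `nextPageToken`, since `files.list` returns large folders in pages.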