I'm trying to figure out how to effectively combine the extracted pages into one single spreadsheet.
With the current setup, all the necessary pages from a site are extracted, sorted, and filtered, but the end result is 21 separate items (one per page number).
I've tried all kinds of approaches (merge, if, item lists, even wait), but this "last step" is simply out of my league.
Is there any way to put each run into a temporary cache and, once the loop is finished, combine all the outputs into one?
Obviously, I tried the "dumb" way (which works), without a loop and with everything set manually, but I know this is laughable.
Appreciate any kind of help
Welcome to the community @smog!
You probably want to change your workflow slightly: connect HTML Extract2 to the false output of the More Pages? node, meaning it will only run once all pages have been fetched. Then look at this workflow and the node called Merge Data. Take that node, put it between More Pages? and HTML Extract2, and change line 6 to:
const items = $items("Fetch Website", 0, counter).map(item => item.json);
Then you should be either totally there or at least 98% of the way.
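For anyone wondering what that line does: `$items("Fetch Website", 0, counter)` collects the items produced by every earlier run of the Fetch Website node, and `.map(item => item.json)` keeps only their JSON payloads. Outside of n8n this is roughly equivalent to flattening per-run item arrays; a minimal standalone sketch (the `runs` data is made up for illustration):

```javascript
// Sketch (plain JavaScript, not n8n code): simulate collecting items
// across all runs of an upstream node, as $items("Fetch Website", 0, counter)
// does. "runs" stands in for the per-execution outputs of Fetch Website.
const runs = [
  [{ json: { page: 1 } }, { json: { page: 2 } }],
  [{ json: { page: 3 } }],
];

// Flatten every run's items and keep only the .json payload,
// mirroring `.map(item => item.json)` from the suggested line.
const merged = runs.flat().map(item => item.json);

console.log(merged); // [ { page: 1 }, { page: 2 }, { page: 3 } ]
```

The point is that the merge node doesn't need to cache anything itself: once it sits behind the false output of More Pages?, it can reach back and read all previous runs in one go.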
Thank you for the quick help.
So it should look like this right?
The merging part looks “ok” now, but the HTML Extract2 part keeps crashing.
Also, not sure whether this is an issue, but the Merge Data node you referred to shows me an info box:
Hi @smog, I am sorry you’re having trouble. Can you narrow down the issue a bit and provide a simplified example workflow using which your current problem can be reproduced? Thanks so much!
As for the merge logic, you can try an approach using the new code node if you like: Merge multiple runs into one | n8n workflow template. This should make the hint disappear.
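The core idea behind that template is to have one node at the end of the loop that gathers the output of every iteration and emits a single combined list. As a rough standalone sketch (the `allRuns` data here is a hypothetical stand-in for what the Code node would read from earlier runs):

```javascript
// Sketch of the "merge multiple runs into one" idea: collect the
// outputs of every loop iteration and emit them as one list.
// "allRuns" is made-up sample data standing in for per-run outputs.
const allRuns = [
  [{ json: { title: 'a' } }],
  [{ json: { title: 'b' } }, { json: { title: 'c' } }],
];

// Combine all runs into one output array of n8n-style items.
const out = [];
for (const run of allRuns) {
  out.push(...run);
}

console.log(out.length); // 3
```

Because everything is returned as one array, downstream nodes (like a spreadsheet node) see a single batch instead of one batch per page.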
Hi MutedJam, thank you for your help.
Now both Merge Loop approaches run without any specific error message, but also without any output items.
I tried both jan's method and yours, but no success.
With splitinbatches-advanced I am at least able to include the last run as well, but it also spits out all the previous runs for conversion/extraction, which are obviously unnecessary. So that works, but it's suboptimal.
Here is the workflow of the current setup (which is not working)
And here is the "working" but suboptimal version, where at least the final file is good, but with many unnecessary processes.
Due to an external node, here is a screenshot of the flow too:
Of course, if anyone has a more elegant solution (with only one final file as the result), it is more than welcome.
Hi there,
Since you are already using a community node, you could also try another community node I've created, called the Iterator. It can be an annoying one to configure, but it should do what you are looking for, I guess.
That is looking incredibly promising, thanks for letting me know.