How to avoid saving duplicates in Google Sheets?

Erich_F · May 10, 2025, 2:13pm

Hello everyone,

I’m scraping data from Apify, splitting it into items, and saving them to Google Sheets.

The problem: It saves all items again and again, even if they already exist.

I’m looking for a way to check if a row already exists and only add new entries.

I already tried using IF nodes and playing around with Remove Duplicates, but I’m a total beginner and couldn’t really get it to work the way I imagined.

Would really appreciate any tips or examples on how to set this up properly.

Best Erich

hubschrauber · May 10, 2025, 2:44pm

If there is a column you could use as a key to match, like an article ID or a URL, and you wouldn’t lose information by replacing a row with a “new scrape” of the same page/content, you could use the Google Sheets node’s Update Row operation instead of Append Row. If you want the same result without replacing a row, you’d still need a key column. Use the Google Sheets node’s Get Row[s] operation with a filter on that column, then an If node, and then (if no match was found) the Google Sheets node’s Append Row operation.
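To make the difference between the two options concrete, here is a minimal sketch in plain Python. It is not n8n code and does not call the Google Sheets API: the sheet is modelled as a list of dicts, `url` is an assumed key column, and `get_rows`, `replace_on_match` and `append_if_missing` are hypothetical helper names. Appending when nothing matches in `replace_on_match` is also an assumption about the outcome you'd want.

```python
from typing import Dict, List

Row = Dict[str, str]


def get_rows(sheet: List[Row], key: str, value: str) -> List[Row]:
    """Stand-in for a filtered lookup: rows whose key column equals value."""
    return [row for row in sheet if row.get(key) == value]


def replace_on_match(sheet: List[Row], item: Row, key: str) -> None:
    """Option 1: overwrite the row with the same key (append if none matches)."""
    for i, row in enumerate(sheet):
        if row.get(key) == item[key]:
            sheet[i] = dict(item)
            return
    sheet.append(dict(item))


def append_if_missing(sheet: List[Row], item: Row, key: str) -> bool:
    """Option 2: look up first, then append only when no match was found."""
    if get_rows(sheet, key, item[key]):
        return False  # the "match found" branch: leave the sheet untouched
    sheet.append(dict(item))
    return True


# Hypothetical data: one row already saved, two freshly scraped items.
sheet = [{"url": "https://example.com/a", "title": "Old A"}]
scraped = [
    {"url": "https://example.com/a", "title": "New A"},
    {"url": "https://example.com/b", "title": "B"},
]

added = [append_if_missing(sheet, item, "url") for item in scraped]
```

With option 2 the existing row keeps its old content and only the second item is added. Calling `replace_on_match` for the first item instead would overwrite the stored title with the new scrape.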

Erich_F · May 10, 2025, 7:19pm

Could you help me set this up exactly as you described, with the Get Rows node, an If node, and appending only if no match was found? I wasn’t able to get it working. Thanks!

hubschrauber · May 10, 2025, 9:43pm

I was just trying to give you enough of an idea to start with so you could figure it out. I would recommend you go through the courses to become familiar with how things work in n8n. You won’t always get someone to figure things out for you.

Here’s one way to make it work. It requires a Merge node to carry forward the input items that were not found in the existing sheet.
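The "carry forward the items that were not found" step is essentially an anti-join on the key column. As a rough illustration only, in plain Python rather than n8n node settings: the column name `id`, the sample rows and the function name `keep_non_matches` are all made up for this sketch.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def keep_non_matches(incoming: Iterable[Row], existing: Iterable[Row], key: str) -> List[Row]:
    """Return the incoming items whose key is not present in the existing rows."""
    seen = {row[key] for row in existing if key in row}
    fresh: List[Row] = []
    for item in incoming:
        value = item[key]
        if value not in seen:
            fresh.append(item)
            seen.add(value)  # also drops repeats inside the same batch
    return fresh


# Hypothetical data: rows already in the sheet vs. items from the latest scrape.
existing_rows = [{"id": "1", "name": "alpha"}, {"id": "2", "name": "beta"}]
incoming_items = [
    {"id": "2", "name": "beta"},
    {"id": "3", "name": "gamma"},
    {"id": "3", "name": "gamma"},
]

to_append = keep_non_matches(incoming_items, existing_rows, "id")
```

Only the items left in `to_append` would then go on to the Append Row step.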

The “test” sheet looks like this to start:

Erich_F · May 11, 2025, 3:49pm

Ty for your help, my workflow is working now!
