Comparing two datasets

Hi team, I’m looking to compare two datasets and find missing emails. Node 1 fetches clients and outputs their email address, this returns 158 results. Node 2 looks up the email addresses in Notion, this returns 120 results. I want to compare these two datasets to see which clients didn’t get found in Notion by comparing the email field for both datasets. In other words, which 38 clients didn’t get found in the Notion database.

I tried multiple things such as merge node and compare datasets, but they never output 38 results as you’d expect. In the screenshot you can see the merge node outputs a total of 94 items, which also includes emails that DID get found in Notion, so it’s not accurate. The settings in the merge node seem logical to me… compare the email fields and keep the non-matches, but it doesn’t work as I’d expect.

Any help would be much appreciated!

To find the missing emails, you should use the **Compare Datasets** node instead of the Merge node. Here’s how:

- Set Input A to the dataset from Node 1 (158 emails).

- Set Input B to the dataset from Node 2 (120 emails).

- Configure the node to match the email fields.

- Use the “In A only” output branch to get the 38 emails not found in Notion.
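For anyone who wants to check the expected result outside the node, the steps above amount to an anti-join on the email field. Here is a minimal standalone Python sketch of that logic. It is not n8n's implementation; the function name `in_a_only` and the sample addresses are made up for illustration.

```python
def in_a_only(input_a, input_b, key="email"):
    """Return the items of input_a whose `key` value never appears in input_b."""
    # Collect every email present in Input B (the Notion results).
    seen_in_b = {item[key] for item in input_b if key in item}
    # Keep only the Input A items (the clients) whose email is not in that set.
    return [item for item in input_a if item.get(key) not in seen_in_b]


# Hypothetical sample data standing in for the two node outputs.
clients = [
    {"email": "ann@example.com"},
    {"email": "bob@example.com"},
    {"email": "cy@example.com"},
]
notion = [
    {"email": "ann@example.com"},
    {"email": "cy@example.com"},
]

missing = in_a_only(clients, notion)
print(missing)  # [{'email': 'bob@example.com'}]
```

With 158 clients and 120 Notion matches, this only returns 38 items if every Notion email matches a distinct client email exactly.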

For more details, see the n8n workflow template "Comparing data with the Compare Datasets node" for a practical example.

I’m quite new to n8n, and so far I’ve found it very tricky to do something as simple as comparing two datasets and finding missing data, especially if the data structure differs.
However, when I try your approach, it should work. The Compare Datasets node should also do this.

If that’s not working for you, my guess would be that there’s an issue with the data. Try cleaning the email values (for example, stripping extra whitespace) and check whether that solves the issue.
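As a rough illustration of what cleaning could mean here, the Python sketch below trims whitespace and lowercases each email before comparing. It is standalone code, not an n8n API. The helper names and sample data are hypothetical, and treating emails as case-insensitive is an assumption that may not suit every dataset.

```python
def normalize_email(value):
    """Trim surrounding whitespace and lowercase; non-string values become ''."""
    return value.strip().lower() if isinstance(value, str) else ""


def missing_after_cleaning(input_a, input_b, key="email"):
    """Return input_a items whose cleaned email is absent from input_b."""
    found = {normalize_email(item.get(key)) for item in input_b}
    found.discard("")  # blank or missing emails in B never count as a match
    # Note: input_a items with a blank or missing email are reported as missing.
    return [
        item for item in input_a
        if normalize_email(item.get(key)) not in found
    ]


# Hypothetical data: the first client differs from Notion only by case and spaces.
clients_raw = [
    {"email": " Ann@Example.com "},
    {"email": "bob@example.com"},
]
notion_raw = [{"email": "ann@example.com"}]

print(missing_after_cleaning(clients_raw, notion_raw))
# [{'email': 'bob@example.com'}]
```

Without the cleaning step, an exact comparison would also report the first client as missing, which is one way the counts can drift from what you expect.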

Great explanation, you helped me with a similar challenge :100: