How to find and keep duplicate records?

I have json of computers. There are duplicate records in my data.

What I want to do is match the duplicates and merge the data that should be merged.

How would I do that?

The remove dupes node just discards to dupes.

Something like this:

[find all record that match ] —> [process them] —> [update the record to be kept] —> [delete the other records] —> [loop if there are more]

At this point, I’m not sure what node to use to find the dupes.

  • n8n version 1.7.1
  • Default database
  • n8n running in Docker on Digital Ocean.
  • Running on Ubuntu 22.04.3 LTS

Hi @russellkg :wave:

Have you checked out the Item Lists node? There’s an option to reove duplicates in it :+1:

Yes. But I need to inspect the records first so I can see if there is relevant data that needs to be transferred to the record to be kept.

Theses are maintenance records that may contain important information for our clients. I can’t just find all the dupes and eliminate all but one, not knowing what information will be gone forever.

-Russ

Hi @russellkg - How do your records currently look like? Which bits of data would need to be kept for each record?

I’d need some example data to help you out here :bowing_man:

This is a real world example. Six records, two records each from three computers. In all cases, assets.id will be unique. So if assets.primary_serial matches, and assets.id don’t, it’s a dupe.

Various details may or may not match. I’m only counting on the serial number matching.

The fields we’re interested in maintaining and merging when a duplicate come about are “Operating System Notes” and “Notes”. These are freeform text fields.

My idea is to match the serial numbers, then check these two fields along with created_at, updated_at, and if cards from other platforms exist. Finally migrate relevant data to the correct record.

I don’t know which record(s) the remove duplicates node will delete.

[
  {
    "assets": {
      "id": 2523,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "pmhw-mac-fvfjk7hc1wg7-a05cd399dcfe",
      "name": "LisaR MBA M1",
      "primary_serial": "FVFJK7HC1WG7",
      "primary_mail": "",
      "primary_model": "MacBook Air (M1, 2020)",
      "primary_manufacturer": "Apple, Inc",
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/pmhw-mac-fvfjk7hc1wg7-a05cd399dcfe",
      "created_at": "2022-11-04T21:00:14.577Z",
      "updated_at": "2023-10-02T20:13:49.559Z",
      "fields": [
        {
          "id": 20255,
          "value": "lisa.rodriguez",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 20256,
          "value": "FVFJK7HC1WG7",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 20257,
          "value": "MacBook Air (M1, 2020)",
          "label": "Model",
          "position": 3
        },
        {
          "id": 20258,
          "value": "macOS 13.5.1",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 20259,
          "value": "",
          "label": "Last Report",
          "position": 5
        },
        {
          "id": 20275,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 20265,
          "value": "",
          "label": "Asset Tag",
          "position": 8
        },
        {
          "id": 20261,
          "value": "192.168.86.39",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 27292,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 27293,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 27294,
          "value": "199.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 20260,
          "value": "",
          "label": "Hostname",
          "position": 13
        },
        {
          "id": 20262,
          "value": "c0:95:6d:2d:8e:b5",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 20278,
          "value": "[]",
          "label": "Location Name",
          "position": 15
        },
        {
          "id": 20270,
          "value": null,
          "label": "Warranty Expires At",
          "position": 16
        },
        {
          "id": 20263,
          "value": "",
          "label": "Default Gateway",
          "position": 17
        },
        {
          "id": 20272,
          "value": null,
          "label": "Purchase Date",
          "position": 18
        },
        {
          "id": 20267,
          "value": "",
          "label": "Purchased By",
          "position": 19
        },
        {
          "id": 20271,
          "value": null,
          "label": "Installation Date",
          "position": 20
        },
        {
          "id": 20280,
          "value": "",
          "label": "Contact Name",
          "position": 21
        },
        {
          "id": 20266,
          "value": "",
          "label": "Installed By",
          "position": 22
        },
        {
          "id": 20264,
          "value": "",
          "label": "Serial Number2",
          "position": 23
        },
        {
          "id": 20269,
          "value": "",
          "label": "Operating System Notes",
          "position": 24
        },
        {
          "id": 20268,
          "value": "",
          "label": "Notes",
          "position": 25
        },
        {
          "id": 20254,
          "value": "LisaR MBA M1",
          "label": "Name",
          "position": 26
        }
      ]
    }
  },
  {
    "assets": {
      "id": 4648,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "pmhw-macbookair-fvfjk7hc1wg7-91dfc0e3494a",
      "name": "LisaR MBA M1",
      "primary_serial": "FVFJK7HC1WG7",
      "primary_mail": "",
      "primary_model": null,
      "primary_manufacturer": null,
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/pmhw-macbookair-fvfjk7hc1wg7-91dfc0e3494a",
      "created_at": "2023-07-14T01:00:15.680Z",
      "updated_at": "2023-10-03T01:00:06.085Z",
      "fields": [
        {
          "id": 27298,
          "value": "lisa.rodriguez",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 27295,
          "value": "FVFJK7HC1WG7",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 27296,
          "value": "MacBook Air",
          "label": "Model",
          "position": 3
        },
        {
          "id": 27304,
          "value": "macOS 13.5.1",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 27297,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 27299,
          "value": "192.168.86.39",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 27302,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 27303,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 27305,
          "value": "198.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 27301,
          "value": "c0:95:6d:2d:8e:b5",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 27300,
          "value": "LisaR MBA M1",
          "label": "Name",
          "position": 26
        }
      ]
    }
  },
  {
    "assets": {
      "id": 2495,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "pmhw-mac-h4tjk09m125f-0e3d13163b25",
      "name": "Unassigned ADE PMHW iMac H4TJK09M125F",
      "primary_serial": "H4TJK09M125F",
      "primary_mail": "",
      "primary_model": "iMac (24-inch, M1, 2021)",
      "primary_manufacturer": "Apple, Inc",
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/pmhw-mac-h4tjk09m125f-0e3d13163b25",
      "created_at": "2022-11-03T04:30:11.476Z",
      "updated_at": "2023-10-02T20:13:50.285Z",
      "fields": [
        {
          "id": 19769,
          "value": "pmhadmin",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 19770,
          "value": "H4TJK09M125F",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 19771,
          "value": "iMac (24-inch, M1, 2021)",
          "label": "Model",
          "position": 3
        },
        {
          "id": 19772,
          "value": "macOS 13.5.2",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 19773,
          "value": "",
          "label": "Last Report",
          "position": 5
        },
        {
          "id": 19789,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 19779,
          "value": "",
          "label": "Asset Tag",
          "position": 8
        },
        {
          "id": 19775,
          "value": "10.8.20.191",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 26954,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 26955,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 26956,
          "value": "222.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 19774,
          "value": "",
          "label": "Hostname",
          "position": 13
        },
        {
          "id": 19776,
          "value": "9c:58:3c:ac:12:f0",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 19792,
          "value": "[]",
          "label": "Location Name",
          "position": 15
        },
        {
          "id": 19784,
          "value": null,
          "label": "Warranty Expires At",
          "position": 16
        },
        {
          "id": 19777,
          "value": "",
          "label": "Default Gateway",
          "position": 17
        },
        {
          "id": 19786,
          "value": null,
          "label": "Purchase Date",
          "position": 18
        },
        {
          "id": 19781,
          "value": "",
          "label": "Purchased By",
          "position": 19
        },
        {
          "id": 19785,
          "value": null,
          "label": "Installation Date",
          "position": 20
        },
        {
          "id": 19794,
          "value": "",
          "label": "Contact Name",
          "position": 21
        },
        {
          "id": 19780,
          "value": "",
          "label": "Installed By",
          "position": 22
        },
        {
          "id": 19778,
          "value": "",
          "label": "Serial Number2",
          "position": 23
        },
        {
          "id": 19783,
          "value": "",
          "label": "Operating System Notes",
          "position": 24
        },
        {
          "id": 19782,
          "value": "",
          "label": "Notes",
          "position": 25
        },
        {
          "id": 19768,
          "value": "Unassigned ADE PMHW iMac H4TJK09M125F",
          "label": "Name",
          "position": 26
        }
      ]
    }
  },
  {
    "assets": {
      "id": 4879,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "unassigned-ade-pmhw-imac-h4tjk09m125f-1a551b23038b",
      "name": "Unassigned ADE PMHW iMac H4TJK09M125F",
      "primary_serial": "H4TJK09M125F",
      "primary_mail": "",
      "primary_model": null,
      "primary_manufacturer": null,
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/unassigned-ade-pmhw-imac-h4tjk09m125f-1a551b23038b",
      "created_at": "2023-09-11T20:30:06.376Z",
      "updated_at": "2023-10-03T01:00:48.363Z",
      "fields": [
        {
          "id": 27473,
          "value": "pmhadmin",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 27470,
          "value": "H4TJK09M125F",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 27471,
          "value": "iMac",
          "label": "Model",
          "position": 3
        },
        {
          "id": 27479,
          "value": "macOS 13.5.2",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 27472,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 27474,
          "value": "10.8.20.191",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 27477,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 27478,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 27480,
          "value": "222.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 27476,
          "value": "9c:58:3c:ac:12:f0",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 27475,
          "value": "Unassigned ADE PMHW iMac H4TJK09M125F",
          "label": "Name",
          "position": 26
        }
      ]
    }
  },
  {
    "assets": {
      "id": 4665,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "anyas-imac-m1-972b07ee1ea6",
      "name": "AnyaS iMac M1",
      "primary_serial": "FVFJK7GC1WG7",
      "primary_mail": "",
      "primary_model": null,
      "primary_manufacturer": null,
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/anyas-imac-m1-972b07ee1ea6",
      "created_at": "2023-08-03T02:30:15.254Z",
      "updated_at": "2023-10-03T01:00:33.852Z",
      "fields": [
        {
          "id": 26960,
          "value": "anya.stepanova",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 26957,
          "value": "FVFJK7GC1WG7",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 26958,
          "value": "MacBook Air (M1, 2020)",
          "label": "Model",
          "position": 3
        },
        {
          "id": 26966,
          "value": "macOS 13.5.2",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 26959,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 26961,
          "value": "10.8.20.150",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 26964,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 26965,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 26967,
          "value": "204.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 26963,
          "value": "c0:95:6d:2b:fb:28",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 26962,
          "value": "AnyaS iMac M1",
          "label": "Name",
          "position": 26
        }
      ]
    }
  },
  {
    "assets": {
      "id": 3568,
      "company_id": 30,
      "asset_layout_id": 30,
      "slug": "sumanm-mba-m1-165ea0d15c89",
      "name": "AnyaS iMac M1",
      "primary_serial": "FVFJK7GC1WG7",
      "primary_mail": "",
      "primary_model": "MacBook Air (M1, 2020)",
      "primary_manufacturer": "Apple, Inc",
      "company_name": "Portland Mental Health",
      "object_type": "Asset",
      "asset_type": "Computers",
      "archived": false,
      "url": "https://kginger.huducloud.com/a/sumanm-mba-m1-165ea0d15c89",
      "created_at": "2023-02-15T21:42:20.207Z",
      "updated_at": "2023-10-02T20:13:48.634Z",
      "fields": [
        {
          "id": 27417,
          "value": "anya.stepanova",
          "label": "Last User",
          "position": 1
        },
        {
          "id": 27414,
          "value": "FVFJK7GC1WG7",
          "label": "Serial Number",
          "position": 2
        },
        {
          "id": 27415,
          "value": "MacBook Air (M1, 2020)",
          "label": "Model",
          "position": 3
        },
        {
          "id": 27423,
          "value": "macOS 13.5.2",
          "label": "Operating System",
          "position": 4
        },
        {
          "id": 27416,
          "value": "Apple, Inc.",
          "label": "Manufacturer Name",
          "position": 7
        },
        {
          "id": 27418,
          "value": "10.0.0.46",
          "label": "Primary IP",
          "position": 9
        },
        {
          "id": 27421,
          "value": "16",
          "label": "Installed RAM",
          "position": 10
        },
        {
          "id": 27422,
          "value": "245.0000000000",
          "label": "Boot Volume Capacity",
          "position": 11
        },
        {
          "id": 27424,
          "value": "205.0000000000",
          "label": "Boot Volume Usage",
          "position": 12
        },
        {
          "id": 27420,
          "value": "c0:95:6d:2b:fb:28",
          "label": "MAC Address",
          "position": 14
        },
        {
          "id": 27419,
          "value": "AnyaS iMac M1",
          "label": "Name",
          "position": 26
        }
      ]
    }
  }
]

hi @russellkg

I suppose for that complex matching you will need to use Code node and do some js magic :slight_smile:

As Item List node will keep only the first entry and discard the dupes

1 Like

Hi @russellkg, I am not 100% sure how your logic should work based on the description, but a possible approach for keeping the unique serial numbers and all data for each primary_serial could be something this:

How should your intended logic work from there? Do you want to keep only the datasets with the latest updated_at values for each primary_serial, with the exception of your Notes and Operating System Notes (of which all values would be kept instead)?

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.