RSS Feed -> Check if New -> Google Sheet (not working 😢)

Heya,

I want to build a scraper for sourcing startups. As a first baby step, I’m using the RSS feed from Product Hunt. Every day I want to call the RSS feed, check which startups are new, and then put them in a Google Sheet.

Currently, I’ve got the workflow reading the RSS feed, setting the titles and posting to the Google Sheet, but my code for checking whether an item is new isn’t working (I copied it from another workflow). Every time I run the workflow, it just copies everything again.

I have very basic coding knowledge, so any help with how to code a check for duplicates and then pass along only the new items would be awesome!

{
  "nodes": [
    {
      "parameters": {
        "url": "=https://www.producthunt.com/feed?category=fintech"
      },
      "name": "ProductHunt RSS Fintech",
      "type": "n8n-nodes-base.rssFeedRead",
      "position": [
        70,
        600
      ],
      "typeVersion": 1
    },
    {
      "parameters": {
        "keepOnlySet": true,
        "values": {
          "string": [
            {
              "name": "Startup Name",
              "value": "={{$node[\"ProductHunt RSS Fintech\"].json[\"title\"]}}"
            },
            {
              "name": "Founder",
              "value": "={{$node[\"ProductHunt RSS Fintech\"].json[\"author\"]}}"
            },
            {
              "name": "URL",
              "value": "={{$json[\"link\"]}}"
            },
            {
              "name": "Publication Date",
              "value": "={{$json[\"pubDate\"]}}"
            },
            {
              "name": "Startup Pitch",
              "value": "={{$json[\"content\"]}}"
            }
          ]
        },
        "options": {}
      },
      "name": "Filter RSS Data1",
      "type": "n8n-nodes-base.set",
      "position": [
        280,
        600
      ],
      "typeVersion": 1
    },
    {
      "parameters": {
        "functionCode": "const staticData = getWorkflowStaticData('global');\nconst newRSSIds = items.map(item => item.json[\"Date\"]);\nconst oldRSSIds = staticData.oldRSSIds; \n\nif (!oldRSSIds) {\n  staticData.oldRSSIds = newRSSIds;\n  return items;\n}\n\n\nconst actualNewRSSIds = newRSSIds.filter((id) => !oldRSSIds.includes(id));\nconst actualNewRSS = items.filter((data) => actualNewRSSIds.includes(data.json['Date']));\nstaticData.oldRSSIds = [...actualNewRSSIds, ...oldRSSIds];\n\nreturn actualNewRSS;\n"
      },
      "name": "Only new",
      "type": "n8n-nodes-base.function",
      "typeVersion": 1,
      "position": [
        480,
        600
      ]
    },
    {
      "parameters": {
        "authentication": "oAuth2",
        "operation": "append",
        "sheetId": "1TJoruaSz2vRF-0UrsRzucK8kwmvKjDNVD3ZK0xgllhc",
        "options": {}
      },
      "name": "Google Sheets1",
      "type": "n8n-nodes-base.googleSheets",
      "typeVersion": 1,
      "position": [
        680,
        600
      ],
      "credentials": {
        "googleSheetsOAuth2Api": {
          "id": "41",
          "name": "Edward Kandel Google Sheets"
        }
      }
    },
    {
      "parameters": {
        "triggerTimes": {
          "item": [
            {
              "hour": 20
            }
          ]
        }
      },
      "name": "Cron2",
      "type": "n8n-nodes-base.cron",
      "typeVersion": 1,
      "position": [
        -90,
        600
      ]
    }
  ],
  "connections": {
    "ProductHunt RSS Fintech": {
      "main": [
        [
          {
            "node": "Filter RSS Data1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Filter RSS Data1": {
      "main": [
        [
          {
            "node": "Only new",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Only new": {
      "main": [
        [
          {
            "node": "Google Sheets1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Cron2": {
      "main": [
        [
          {
            "node": "ProductHunt RSS Fintech",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Running on n8n.cloud

Hey @EdKandel,

The code in the Function node uses the getWorkflowStaticData method. Static data is not saved when you execute the workflow manually, so on every manual run oldRSSIds is empty and the node returns all the items. The method works as expected only when your workflow is active and running in production, which here means being started by the Cron node. You can read more about this method here: Function | Docs
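
To make the duplicate check easier to follow, here is a minimal sketch of the same idea written as a plain function, so it can be tried outside n8n. The name filterNewItems is made up for illustration and is not part of n8n. It also assumes the items are keyed by the URL field that the Set node writes: with keepOnlySet enabled, the items reaching the Function node no longer have a field called Date, so the posted code would be comparing undefined values.

```javascript
// Sketch of the "Only new" logic as a plain function.
// filterNewItems and the default key "URL" are illustrative assumptions, not n8n APIs.
function filterNewItems(items, staticData, key = 'URL') {
  // IDs remembered from earlier runs (empty on the very first run)
  const seen = new Set(staticData.oldRSSIds || []);
  const fresh = [];

  for (const item of items) {
    const id = item.json[key];
    // Skip items that lack the key, and items we have already seen
    if (id === undefined || seen.has(id)) continue;
    seen.add(id);
    fresh.push(item);
  }

  // Remember everything seen so far for the next run
  staticData.oldRSSIds = [...seen];
  return fresh;
}

// Inside the n8n Function node this would be used roughly as:
//   const staticData = getWorkflowStaticData('global');
//   return filterNewItems(items, staticData, 'URL');

// Simulate two production runs that share the same static data object
const staticData = {};
const run1 = filterNewItems(
  [{ json: { URL: 'https://example.com/a' } }, { json: { URL: 'https://example.com/b' } }],
  staticData
);
const run2 = filterNewItems(
  [{ json: { URL: 'https://example.com/b' } }, { json: { URL: 'https://example.com/c' } }],
  staticData
);
console.log(run1.length, run2.map((item) => item.json.URL));
```

In the simulation, the first run passes both items through and the second run passes only the item that was not seen before. A manual execution behaves like calling the function with a fresh, empty staticData object each time, which is why everything gets copied on every manual run.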

I’ve also written a blog post that is based on a similar topic. You can read it here: Creating triggers for n8n workflows using polling ⏲

Ah awesome, makes sense!