Monitoring website content changes with n8n

I wanted a system that monitors website content changes and notifies me, so I built one with n8n.

I was especially interested in my competitors' blogs: I wanted to know how often they post new articles. (For that I used their sitemap.xml file, so my actual workflow differs a bit from the one below.)

In the example below, I used Hacker News.

Explanation:

  • The first HTTP Request node fetches the webpage and grabs its source code
  • Then the Wait node pauses for x minutes
  • The second HTTP Request node fetches the webpage again
  • The IF node checks whether both results are equal. If anything has changed, it goes to the false branch and notifies me on Telegram.
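
For anyone who wants to see the same logic outside n8n, here is a rough Python sketch of the steps above. It is not the workflow itself: `fetch_source`, `fingerprint`, and `page_changed` are names I made up for illustration, and comparing SHA-256 hashes instead of the full source is my own shortcut.

```python
import hashlib
import time
from typing import Callable
from urllib.request import urlopen


def fetch_source(url: str) -> str:
    """Download the raw page source (roughly what the HTTP Request node does)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fingerprint(source: str) -> str:
    """Reduce the page source to a short hash so two snapshots are cheap to compare."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def page_changed(
    url: str,
    wait_seconds: float,
    fetch: Callable[[str], str] = fetch_source,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Fetch, wait, fetch again, and report whether the source differs."""
    before = fingerprint(fetch(url))
    sleep(wait_seconds)  # stands in for the Wait node
    after = fingerprint(fetch(url))
    return before != after  # True corresponds to the IF node's false branch
```

Calling `page_changed("https://news.ycombinator.com", 600)` would check for a change over ten minutes. Sending the notification is left out on purpose.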

Here is the Telegram response:

[image: Telegram notification]

Workflow:

This use case can be extended with a library such as diff to show what changed, not just that something changed.
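
As a rough idea of what a diff step could produce, here is a small sketch using Python's standard `difflib` module (my own substitute for whichever diff library you would use inside n8n). `summarize_changes` is a hypothetical helper name.

```python
import difflib
from typing import List


def summarize_changes(old_source: str, new_source: str) -> List[str]:
    """Return only the removed ('-') and added ('+') lines between two snapshots."""
    diff = list(
        difflib.unified_diff(
            old_source.splitlines(),
            new_source.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # The first two entries are the '--- before' / '+++ after' headers; skip them,
    # then keep the '+' and '-' lines and drop the '@@' hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

The returned lines could then be joined and sent as the notification text instead of a plain "something changed" message.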

If you don’t want to use the Wait node, you can store the first HTTP Request node's result somewhere and compare it later using another workflow.
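
A minimal sketch of that store-and-compare-later idea, assuming the "somewhere" is just a local file holding a hash of the last snapshot. `changed_since_last_run` and the state file are hypothetical; in n8n you would use whatever storage you have at hand.

```python
import hashlib
from pathlib import Path


def changed_since_last_run(source: str, state_file: Path) -> bool:
    """Compare the current page source with the hash saved by the previous run."""
    current = hashlib.sha256(source.encode("utf-8")).hexdigest()
    previous = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(current)  # remember this snapshot for the next run
    # The very first run has nothing to compare against, so it reports no change.
    return previous is not None and previous != current
```

Each scheduled run fetches the page once and calls this function, so no workflow has to sit waiting.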

Hope this helps somebody.

This won’t work well for Jamstack sites that render their content in the browser, since the DOM is built only after the page has fully loaded and the HTTP Request node never sees it.

Enjoy :tada:

Great, thanks a lot for sharing!

Thanks, I like how you use Telegram to get updates.
I have been using Telegram as a simple way to get n8n to send me push notifications on my phone for all sorts of n8n flows :slight_smile:

Telegram is a really useful app. :100:

n8n + Telegram is a godsend. :heart:

I don’t use WhatsApp, and Telegram is only for a few friends and work, so no distractions.

Hey @mcnaveen,

Nice work! Have you submitted it on n8n.io/workflows as well?

Yes bro. Already shared :100::sparkles:

Any thoughts on how we could modify this to take a PDF snapshot of the website when it changes?

Also, are there any known restrictions on the HTTP Request node? For example, the request below just times out:

Yes. Just use the Puppeteer node to get a screenshot of the page that changed.

This needs some changes to the workflow: when the website changes, trigger a Puppeteer node and pass its screenshot on to the Telegram node. That way you can see exactly what changed.

Another option is to use the website's sitemap. The changes to track are then in the XML file, which you can use to find the specific pages that were added or removed, not just changes in page content.
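
To illustrate the sitemap idea, here is a small Python sketch that compares two sitemap snapshots. It assumes a standard sitemap using the sitemaps.org namespace; `sitemap_urls` and `sitemap_changes` are hypothetical helper names, not anything built into n8n.

```python
import xml.etree.ElementTree as ET
from typing import Set, Tuple

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> Set[str]:
    """Collect every <loc> URL listed in a sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    }


def sitemap_changes(old_xml: str, new_xml: str) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) URLs between two sitemap snapshots."""
    old, new = sitemap_urls(old_xml), sitemap_urls(new_xml)
    return new - old, old - new
```

The `added` set is what you would send to Telegram to learn about newly published pages.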

Hope this helps

Use Puppeteer or Browserless.