HTTP Request - not refreshing data

I have a simple workflow that triggers every 5 minutes. It uses the HTTP Request node to retrieve a page of data, then the data is parsed with HTML Extract, matching a couple of id attributes. When it finds matching data in those elements, it sends me an email.

The problem is that the HTTP Request is not updating the data with the current information from the website and continues to use the data that was downloaded on the initial request each time the trigger is fired. My guess is that it is using cached data.

How do I force the HTTP Request to pull fresh data from the site on each request?

n8n 1.53.2, self-hosted in Docker on a Synology NAS.

Any help would be appreciated.

@robin.ellins, if caching is indeed the issue, you can try the tricks described in https://www.baeldung.com/linux/curl-without-cache.

It depends on how the web server is configured. Typically, a dedicated header in the HTTP request can instruct the server to respond with data that was not cached. Another trick is to use a different URL (namely, a different query string) that still points to the same page but fools the server into serving fresh data.

To summarize, try one of the following:

  • Add a Cache-Control header with the value no-cache, no-store
  • Add a random query string at the end of the URL, something like ?{{ $now.toMillis() }}, assuming the server will ignore that query string
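
For anyone who wants to try both tricks outside n8n first, here is a minimal Python sketch that builds a request with the Cache-Control header and a timestamp query parameter. The function name build_fresh_request and the parameter name "_" are made up for illustration, and whether the server or any cache in between honours either trick is an assumption.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request


def build_fresh_request(url: str, param: str = "_") -> Request:
    """Return a Request that asks for uncached data in two ways.

    `param` is a hypothetical throwaway query parameter; the server is
    assumed to ignore it.
    """
    parts = urlsplit(url)
    # Keep any existing query parameters and append a millisecond timestamp
    # so each call produces a URL that has not been requested before.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(int(time.time() * 1000))))
    fresh_url = urlunsplit(parts._replace(query=urlencode(query)))
    # Ask the server and intermediaries not to answer from a cache.
    return Request(fresh_url, headers={"Cache-Control": "no-cache, no-store"})


if __name__ == "__main__":
    req = build_fresh_request("https://example.com/status?page=2")
    print(req.full_url)
    print(req.header_items())
```

Passing the returned object to urllib.request.urlopen would send the request. If the response is still stale with both tricks applied, caching is probably not the cause.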

Thank you for the excellent and succinct answer.

After trying both methods suggested and still not getting accurate data from the page source, I tried using curl from the command line and it returns the same incorrect results.

When I access the site with a browser, it shows the data and image status indicator correctly, and using the browser inspector I can verify the underlying data is indeed in the “on” status.

When I use curl, it shows the data as off with the indicator as off.

There must be some JavaScript or other code running in the browser that changes the state of the data when the page is viewed there.

I will have to dig further. Thank you for your help.

Having dug further, it turns out that the web page uses JavaScript to call another URL to get further data and update the status.

Using the inspector, I was able to get this URL and add a second request to my workflow, then merge the data before testing for conditions and sending an email.
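
In case it helps someone with the same problem, the merge step can be sketched roughly like this in Python. This is not my actual workflow: the element ids, the field names, and the idea that the second URL returns JSON are all assumptions for illustration, and the real site may use a different format.

```python
import json
from html.parser import HTMLParser


class IdTextExtractor(HTMLParser):
    """Collect the text of elements whose id is in `wanted`.

    Simplified: assumes the wanted elements contain plain text only,
    with no nested tags.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted:
            self._current = element_id
            self.found[element_id] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.found[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def merge_page_and_status(page_html, status_json, ids):
    """Merge fields from the static page with the second response.

    Values from the second response win, because that is the data the
    page's JavaScript would have applied in a browser.
    """
    parser = IdTextExtractor(ids)
    parser.feed(page_html)
    merged = {key: value.strip() for key, value in parser.found.items()}
    merged.update(json.loads(status_json))  # assumed to be a JSON object
    return merged


if __name__ == "__main__":
    # Hypothetical sample data standing in for the two HTTP responses.
    html = '<div><span id="name">Pump 1</span><span id="status">off</span></div>'
    print(merge_page_and_status(html, '{"status": "on"}', ["name", "status"]))
```

The static HTML on its own reports the stale value, which matches what curl showed. Only the merged result reflects what the browser displays.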

Thank you again for your help. It was using curl at the command line that pointed me in the right direction.

