I have a simple workflow that triggers every 5 minutes. It uses the HTTP Request node to retrieve a page of data, then parses the data with HTML Extract, matching a couple of id tags. When it finds matching data in the tags, it sends me an email.
The problem is that the HTTP Request is not updating the data with the current information from the website and continues to use the data that was downloaded on the initial request each time the trigger is fired. My guess is that it is using cached data.
How do I force the HTTP Request to pull fresh data from the site on each request?
n8n 1.53.2, self-hosted in Docker on a Synology NAS.
It depends on how the web server is configured. Typically, a dedicated header in the HTTP request can instruct the server to respond with data that is not cached. Another trick is to use a different URL (namely, a different query string) that still points to the same page but fools the server into serving fresh data.
To summarize, try one of the following:
Add a Cache-Control header with the value no-cache, no-store
Add a random query string to the end of the URL, something like ?{{ $now.toMillis() }}, assuming the server ignores that query string
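Outside of n8n, the same two tricks can be sketched in a few lines of Python, which is handy for checking whether they make any difference against a given server. This is only a rough illustration: the helper names add_cache_buster and build_request and the "_" parameter name are made up for the example, and whether the server honors the header or ignores the extra query parameter depends entirely on how it is configured.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit
from urllib.request import Request

# Header suggested above; the server is free to ignore it.
NO_CACHE_HEADERS = {"Cache-Control": "no-cache, no-store"}


def add_cache_buster(url, param="_", now_ms=None):
    """Return url with an extra query parameter that changes on every call.

    The parameter name "_" is an arbitrary choice for this sketch.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # same idea as $now.toMillis()
    parts = urlsplit(url)
    # Keep any existing query parameters and append the cache buster.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(now_ms)))
    return urlunsplit(parts._replace(query=urlencode(query)))


def build_request(url):
    """Build a request that combines both tricks (header + unique URL)."""
    return Request(add_cache_buster(url), headers=NO_CACHE_HEADERS)
```

Passing the result of build_request(...) to urllib.request.urlopen performs the actual fetch. If the response is still stale with both tricks applied, caching is probably not the cause.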
After trying both methods suggested and still not getting accurate data from the page source, I tried using curl from the command line and it returns the same incorrect results.
When I access the site with a browser, it shows the data and image status indicator correctly, and using the browser inspector I can verify the underlying data is indeed in the "on" status.
When I use curl, it shows the data as off with the indicator as off.
There must be some JavaScript or something similar running in the browser that changes the state of the data when the page is viewed there.
I will have to dig further. Thank you for your help.
Having dug further, it turns out that the webpage uses JavaScript to call another URL to get further data and update the status.
Using the inspector, I was able to find this URL and add a second request to my workflow, then merge the data before testing for conditions and sending an email.
Thank you again for your help. It was using curl at the command line that pointed me in the right direction.
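For anyone landing here with the same problem, the shape of the fix (fetch the secondary URL the page's JavaScript calls, merge it with what was extracted from the page, then test the condition) looks roughly like this in Python. It is a sketch under assumptions: STATUS_URL is a placeholder rather than the real endpoint, the secondary response is assumed to be JSON with a "status" field, and page_fields stands in for whatever HTML Extract produced.

```python
import json
from urllib.request import urlopen

# Placeholder: the real URL is whatever the browser inspector shows
# the page's JavaScript requesting.
STATUS_URL = "https://example.com/api/status"


def fetch_text(url, timeout=10):
    """Download a URL and return the body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def merge_results(page_fields, status_payload):
    """Combine fields extracted from the page with the secondary response.

    Values from the secondary request win, because that is the data the
    browser uses to update the page after it loads.
    """
    merged = dict(page_fields)
    merged.update(status_payload)
    return merged


def should_notify(merged, key="status", wanted="on"):
    """Return True when the merged data is in the state worth emailing about."""
    return str(merged.get(key, "")).lower() == wanted


def check_once(page_fields):
    """One polling cycle: second request, merge, then test the condition."""
    status_payload = json.loads(fetch_text(STATUS_URL))  # assumes a JSON object
    merged = merge_results(page_fields, status_payload)
    return merged, should_notify(merged)
```

In the n8n workflow, the equivalent is a second HTTP Request node feeding a Merge node ahead of the condition check.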