My n8n instance has often been down: the access page is inaccessible, just loading forever. When I try to save some workflows I get an error; eventually I manage to save, but when I check, the workflow ends up keeping both the old configuration and the new one. For example, if I set a trigger at 08:35 and then changed it to 08:40, the workflow fired both times: 08:35 and 08:40.
I thought it was due to the number of accumulated executions and the out-of-date version, so I cleaned up the old executions (from 570 thousand it dropped to 12 thousand)
I applied the 26 version updates that were pending and now I'm on the most recent one: 1.33.1
I also upgraded my droplet on DigitalOcean; I currently have 8 GB memory / 4 Intel vCPUs / 160 GB disk / NYC1 (about 2 to 4x larger than the initial size)
When there is an error, the CPU in DigitalOcean reaches 30% usage, so I don't believe the problem is there
Anyway, I did all the updates and upgrades, but sometimes, out of the blue, even without many workflows running, the site simply goes offline: it stops receiving data and I can't access it, forcing me to turn the droplet off and on. What can it be?
I woke up and it was offline. I went to the logs inside the DigitalOcean console and searched for "n8n" together with "error" or "disconnected", and these logs appeared
Is it possible to find out the reason from the image?
While the logs were loading, I noticed a lot of the event below, repeating up to the current date. Once loading finished, it no longer appeared when I searched for the keywords, but I managed to copy a piece while it was still loading. The date is older, but I noticed the same thing happened with yesterday's and today's dates, if that has any relevance:
Well… the error is not very specific, but it seems there are a lot of executions in the queue.
Can you check the memory state? n8n mostly uses memory rather than CPU/disk. But 8 GB should usually be fine.
The only thing I can advise for now is to collect the logs if n8n goes offline again and provide them here; maybe there will be something else in them.
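A quick way to take that memory snapshot from the droplet's web console (a minimal sketch; the container name `root_n8n_1` is taken from the `docker logs` command below and may differ on your setup):

```shell
# Overall memory on the droplet; n8n is memory-bound,
# so look at the "available" column rather than "free"
free -h

# Per-container usage, if Docker is running (uncomment and
# adjust the container name to whatever `docker ps` shows):
# docker stats --no-stream root_n8n_1
```

If "available" drops near zero around the time n8n goes offline, that points at memory exhaustion rather than CPU.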
You can export logs to the /tmp/n8n.log file with the command:
docker logs --since 1h root_n8n_1 > /tmp/n8n.log
And you can replace the 1h in the --since key with the actual time when n8n was down (e.g. 1h, 2h, 3h ago or more), because the container will restart and the logs you need may otherwise be hidden.
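If you know roughly when the outage started, `--since` also accepts an absolute timestamp instead of a relative one. A sketch, assuming the outage began around 06:00 UTC on 2024-03-26 and the same container name as above (adjust both):

```shell
# Pull everything the container logged since a fixed point in time;
# 2>&1 also captures stderr, where n8n writes much of its output.
# The timestamp and container name are example values.
# "|| true" keeps a script going even if the name doesn't match.
docker logs --since 2024-03-26T06:00:00 root_n8n_1 > /tmp/n8n.log 2>&1 || true
```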
Thousands of the same message that I had sent earlier appeared, which you were unable to identify, but a different message also appeared:
Removed triggers and pollers for workflow "22"
2024-03-26T12:26:09.204Z [Rudder] info: Your message must be < 32kb. This is currently surfaced as a warning. Please update your code {
  userId: '9b8c11ec1d7722d48b6973dbf70400faa913ae64e46fef1f40a9b1d654824ea2#3af181f5-9769-43d2-aa32-64ab62fd7fe7',
  event: 'User saved workflow',
  properties: {
    user_id: '3af181f5-9769-43d2-aa32-64ab62fd7fe7',
    workflow_id: '22',
Earlier I imagined the problem could be this flow, but even after I deactivated it the system went offline again. It is scheduled to run every hour, and the problem always came after it started running. …
Before the update there was no problem. What could it be?
I noticed that some of my flows had changes in the naming of the webhook trigger: instead of being named "Webhook", some were "Webhook1" or "Webhook2", which meant the data did not continue through the flow, since the origin was unknown. So I changed the names back on them.
Was there any change in the version update that impacted something in this workflow 22? How do I share it with you so you can check whether this really is the problematic workflow, and how to solve it?
Can you share workflow 22? I'm wondering why it has so many nodes.
You can copy the workflow with ctrl+a > ctrl+c and then paste it after pressing the button </>
But there is a limit on message length, so you may need to copy it to a file and share the file from any storage.
I used a Cron node to trigger every hour, at 10 minutes past the hour, between 7am and 11pm
The flow consists of sending some information, waiting about 50 seconds, and then checking custom fields in "BotConversa" via HTTP request
If the custom field has a value different from the stipulated one, it means there is an error in "BotConversa"; it usually means my WhatsApp number has been banned or disconnected. In that case the flow makes an HTTP request that places a call to my cell phone via Zenvia to inform me of this
Each line refers to a WhatsApp number, totaling around 30 lines with the same function
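For reference, the schedule described above would look something like the following as a standard cron expression (a sketch, assuming the trigger fires at minute 10 of each hour from 07:00 through 23:00; n8n's Cron node accepts custom expressions in this minute/hour/day-of-month/month/day-of-week format):

```
10 7-23 * * *
```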
Well… that definitely may be an issue. Can you show how the Schedule node is configured?
In total you have about 30 minutes of wait time plus some response time from the HTTP nodes, so the flow may run for 40 minutes or even longer.
I'm wondering if overlaps happen (the first execution is still running when the next one starts), which may lead to some performance issues.
But generally it's better to redesign the workflow to eliminate the duplicates and common patterns (you basically have one line with the logic and 30+ lines with specific parameters for that logic). I can help with optimizing that flow, but it will require some amount of time.
@Jon or @bartv when we update n8n to a newer version, will the nodes automatically be updated to their latest version, or do they keep the old version?
Before the newest update (it had 26 accumulated) they worked normally; I don't know if anything changed because of the update. Normally the flow takes around 30 minutes, and since it only runs every hour, I have never had an overlap. Also, since I did the update (yesterday) I have never been able to complete the flow, so we can rule out the overlapping issue, as the system crashed right from the start
I duplicated the workflow, removed most of the nodes leaving 10 lines of nodes, activated the workflow manually, and it worked perfectly
So I copied another 10 lines from the other workflow to test, and ran it manually
In other words, with 20 lines it worked perfectly
But when I tried to add the remaining lines, it gave an error again, before the workflow even started; right after clicking start, it crashed…
Before the update it worked perfectly, all day, every day
I don't understand why the error occurs right at the beginning of execution; shouldn't the rest of the data in the flow be loaded over time?
I'm a layman, I don't know how to check the database; in fact I barely know what that means
I use n8n via docker compose on DigitalOcean and access the droplet via the web console. How do I check which database it is for you?
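One way to answer that from the droplet's web console (a sketch, assuming the same `root_n8n_1` container name used in the logs command earlier in the thread): n8n reads its database settings from `DB_`-prefixed environment variables and falls back to its bundled SQLite database when none are set.

```shell
# List any DB_* settings inside the running n8n container.
# If nothing prints, n8n is using its default SQLite database.
docker exec root_n8n_1 printenv 2>/dev/null | grep -i "^DB_" \
  || echo "no DB_ variables set (SQLite default)"

# Alternatively, check the compose file from the directory holding it:
# grep -in "DB_" docker-compose.yml
```

Paste whatever this prints (minus any passwords) into the thread.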
To activate the workflow, n8n must validate it first (it won't let you activate a workflow that has errors), so there can be an issue there, but it seems more like a bug.
Also, if you are using the default database, that can cause issues too; but as you had tons of executions, it seems that's not it.
I duplicated the flow and divided it into 2 parts, with 16 lines and 15 lines respectively, one starting at 10 minutes past each hour and the second at 30 minutes past each hour
In the first hour both worked correctly, without the n8n going offline
I believe the problem is more about the number of lines being processed than any specific part of the workflow