Recurring Data Loss and Execution Gaps in n8n Cloud

Describe the problem/error/question

We have had three incidents, each about a month apart, in which roughly 5 to 14 days of data went missing from both the execution logs and the saved versions of our workflows. We currently run backups to Dropbox every 2 hours, but that still isn’t a fix for the underlying issue. Does anyone else have experience with this? We avoid Code nodes and are on a paid plan. We also have timeouts set on our workflows, but after our most recent crash we found executions that had been running for 1800+ minutes when the timeout was set to 5 minutes.

Support has told me they are working on it, but they have never given me anything to try besides backing things up, which we were already doing. I’ve never had this happen on competitors, so I’m really concerned that n8n may not be the place for us. Any tips or tricks would be highly appreciated.
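
For anyone who wants to check for the same timeout overruns described above, here is a minimal sketch in Python. It assumes you can export execution start and stop times by whatever means you have available. The `Execution` record and its field names are hypothetical and are not n8n's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


@dataclass
class Execution:
    # Hypothetical record shape; these field names are assumptions,
    # not n8n's real execution schema.
    workflow_name: str
    started_at: datetime
    stopped_at: Optional[datetime]  # None means the execution is still running


def find_overruns(
    executions: List[Execution],
    timeout: timedelta,
    now: datetime,
) -> List[Tuple[Execution, timedelta]]:
    """Return executions whose elapsed time exceeds the configured timeout."""
    overruns = []
    for ex in executions:
        # For a still-running execution, measure elapsed time up to `now`.
        end = ex.stopped_at if ex.stopped_at is not None else now
        elapsed = end - ex.started_at
        if elapsed > timeout:
            overruns.append((ex, elapsed))
    return overruns
```

Calling `find_overruns(executions, timedelta(minutes=5), datetime.now())` on a regular schedule would surface stuck executions long before they reach 1800 minutes. Note that naive and timezone-aware datetimes cannot be mixed in the same call.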

What is the error message (if any)?

None given. All of a sudden, saved changes to workflows (sometimes the entire workflow) are just gone. Then, when we look at the executions, there is a gap of days between today and some point in the past. We run over 50k executions a month, so our workflows never naturally go days without running.
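
Since there is no error message, the gap itself is the only signal. As a rough sketch, assuming you have a list of execution start timestamps exported from your instance, something like the following could flag suspicious gaps. The function name and the threshold are illustrative assumptions, not anything n8n provides.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def find_gaps(
    timestamps: Iterable[datetime],
    max_gap: timedelta,
) -> List[Tuple[datetime, datetime]]:
    """Return (last_seen, next_seen) pairs where consecutive executions
    are further apart than max_gap."""
    ordered = sorted(timestamps)
    gaps = []
    # Compare each timestamp with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > max_gap:
            gaps.append((earlier, later))
    return gaps
```

At 50k+ executions a month, a threshold such as `timedelta(hours=6)` would be very conservative. Any pair returned marks a window where executions exist on both sides but not in between, which is different from old executions rolling off the start of the log.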

Information on your n8n setup

  • n8n version: 1.34.2 at time of error, now updated to 1.36.2
  • Database (default: SQLite): Default n8ncloud
  • Running n8n via (Docker, npm, n8n cloud, desktop app): n8n cloud

I don’t think this is something I or other community members will be able to help with.

I will escalate this to the support team on here, but the company is currently at an in-person event, so they warned that their responses may be a little slower than usual.

The good news is that n8n moves pretty fast and might have fixed the issue already, but this is the first time I’ve heard of it and I don’t remember anything about it in the patch notes. Is the issue still ongoing, or has it not happened since updating?

That is definitely concerning, though. Always make sure you’re backing up if it’s important to retain the runs.

It only happens about once a month, so it’s hard to say if it will happen again. They have emailed me back a few times and took a look this time (they had never looked before), but they told me that I should upgrade to their enterprise plan, which would cost me $18k more than what I am paying currently, to get Postgres as the backend database, as it’s more reliable.

They also said workflows weren’t missing. I sent back a few that were, and they haven’t gotten back to me about that yet.

Overall, having this happen 3 times in 3 months across versions is not very encouraging, especially since no one has been able to give me an answer as to why it’s happening. Mostly I’ve been told that they don’t see it happening, apart from in the execution logs.

Thanks for forwarding this @liam and I love the recommendation to ensure proper backups.

Hey @Josh_Sorenson, I’m so sorry you’re experiencing some frustration with our cloud product. We love how much you’re trying to build with it and hate that you’ve run into some difficulties.

I can see from your email to our team on Friday night that you shared a couple of workflows you say were missing, and I appreciate you providing that additional context. We are available during Berlin business hours (CET) M-F and will be able to follow up with you during that window.

The issue regarding missing execution logs was mentioned as a known limitation of instances using SQLite as the underpinning DB. Our recommendation for you was to consider our enterprise plan as that by default uses Postgres and you’d also have access to many other really cool features that may interest you as you scale your business. But that’s a choice you’ll have to make. You also can look into self-hosting where Postgres is also an option.

But rest assured, our team has absolutely not forgotten about you and we will do everything we can to get you back on track. Thank you and have a great rest of your weekend!

Hey Ludwig, thanks for the reply. The known issue you linked is not what we are experiencing. What the link describes is old logs rolling off, a graceful removal of older executions, which I am totally fine with and understand. What we are seeing is a gap inside the execution logs: entries are missing completely from March 29th to April 11th, while entries from before March 29th and after April 11th still exist.

Hey @Josh_Sorenson! I wanted to check in to see how things have been going since we switched to SQLite connection pooling. Has that solved your issues?

So far so good, but it usually was about a month between incidents :crossed_fingers:
