Entire instance and all agents stop working and respond with 502 / n8n Cloud

Hello, we are paying for n8n Cloud and currently run mainly 3 or 4 chat agents with webhooks, plus a couple of sub-workflows.

For the first few weeks the whole project worked perfectly, but over the last few days we have been facing an issue where the entire instance stops working and all agents fail at the same time. They just die, responding with a 502 or 500.

To fix this, we have to ask the admin of our organization (I am an admin as well, but somehow I don’t have this permission) to “restart” the whole instance. Of course this is not acceptable: we have agents in prod that should be replying to real customers, and we don’t even get a notification when they fail.

As I said, we are paying for n8n Cloud and would like to know what is going on and how it can be fixed.

Thank you all very much.