Periodic "Redis unavailable" messages

Describe the problem/error/question

I have n8n set up in queue mode with three deployments in Kubernetes: main, workers, and webhooks. Everything seems to be running fine, but in the logs I can see a ‘Redis unavailable’ error message showing up about every hour.

Overnight, it occurred at exact 60-minute intervals. Things are still working, but I am trying to understand this error. The Redis instance (KeyDB) in question does not have any such periodic downtime.

The connection itself clearly works: the data is visible in Redis, workflows can be triggered, and the env vars are the same across all containers, so there is no issue from that angle.

What is the error message (if any)?

I enabled debug-level logging, but it doesn’t seem to give much more info.

2023-09-13T12:02:59.052Z | warn     | Redis unavailable - trying to reconnect... "{ file: 'RedisServiceBaseClasses.js' }"

It’s this exact message, which comes up about every hour.
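To confirm that the interval really is 60 minutes rather than "roughly hourly", the gaps between the warnings can be measured from the log timestamps. The following is a minimal sketch, assuming each log line starts with the ISO timestamp shown above; the function name `reconnect_gaps` and the first two sample lines are made up for illustration (only the 12:02:59 line comes from the actual logs).

```python
import re
from datetime import datetime

# Matches the ISO timestamp at the start of a log line, e.g. 2023-09-13T12:02:59.052Z
TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z")


def reconnect_gaps(lines, needle="Redis unavailable"):
    """Return the time gaps between consecutive log lines containing `needle`."""
    stamps = []
    for line in lines:
        match = TS.match(line)
        if match and needle in line:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f"))
    # Pair each timestamp with the next one and subtract.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical sample: only the last line is taken from the real logs.
sample = [
    "2023-09-13T11:02:59.041Z | warn | Redis unavailable - trying to reconnect...",
    "2023-09-13T11:30:00.000Z | debug | some unrelated message",
    "2023-09-13T12:02:59.052Z | warn | Redis unavailable - trying to reconnect...",
]

gaps = reconnect_gaps(sample)
print([round(gap.total_seconds() / 60, 1) for gap in gaps])
```

Gaps that stay within a few milliseconds of 3600 seconds would point to a fixed timer somewhere in the path rather than random network trouble.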

Information on your n8n setup

  • n8n version: 1.6.1
  • Database (default: SQLite): PostgreSQL 15.4
  • n8n EXECUTIONS_PROCESS setting (default: own, main): main (queue mode)
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker via Kubernetes

Hm, n8n does have some recovery logic to deal with Redis timeouts and connection problems, but it’s not quite clear to me why this would happen in 60 minute intervals exactly. Perhaps @krynble has an idea?

Can you check whether Redis is configured with a client timeout by running CONFIG GET timeout in redis-cli?
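If running redis-cli inside the cluster is inconvenient, the same check can be scripted. This is a rough sketch, assuming the third-party `redis` (redis-py) package is installed and that KeyDB answers CONFIG GET the same way Redis does; the host and port defaults and both function names are placeholders, not anything n8n provides.

```python
def describe_client_timeout(config):
    """Interpret the dict returned by redis-py's config_get("timeout")."""
    seconds = int(config.get("timeout", 0))
    if seconds == 0:
        # A value of 0 disables the server-side idle timeout.
        return "idle clients are never disconnected by the server"
    return f"idle clients are disconnected after {seconds} s"


def fetch_timeout(host="localhost", port=6379):
    """Ask the server for its `timeout` setting (host and port are assumed values)."""
    import redis  # third-party package, imported lazily so the helper above works without it

    client = redis.Redis(host=host, port=port, decode_responses=True)
    return client.config_get("timeout")


# Offline example using the shape of the reply redis-py returns:
print(describe_client_timeout({"timeout": "0"}))
```

Against a live server, the call would be `describe_client_timeout(fetch_timeout("my-keydb-host"))`, where the hostname is again a placeholder.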

The timeout is 0.

Do you think this could be something to do with my KeyDB (master-master) setup? Perhaps the connection switches to the other node, and some session data is not there.

Either way, from what I understand this shouldn’t be a deal-breaker, as n8n can handle this momentary loss and not miss any data.

Interesting, I have never seen this issue in my tests, so maybe this is in fact related to the setup.

If you see this log message only once, it means n8n identified a connectivity loss and was able to re-establish the connection.

If the error persists, n8n keeps retrying for 10 seconds and then exits, so if n8n has displayed this message in the logs only once, it should have successfully re-established the connection.
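The behaviour described above (retry for a bounded time, then give up) can be sketched as follows. This is not n8n's actual implementation, only an illustration of the pattern; the function name, the 0.5-second delay, and the injectable `clock` and `sleep` parameters are assumptions, and only the 10-second limit comes from the description.

```python
import time


def reconnect_with_deadline(connect, max_seconds=10.0, delay=0.5,
                            clock=time.monotonic, sleep=time.sleep):
    """Retry `connect` until it succeeds or `max_seconds` have elapsed.

    Returns True on success and False once the deadline has passed;
    the caller would then exit the process.
    """
    deadline = clock() + max_seconds
    while True:
        try:
            connect()
            return True
        except ConnectionError:
            if clock() >= deadline:
                return False
            sleep(delay)  # wait briefly before the next attempt


# Example: a connection that fails twice and then recovers.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("Redis unavailable")


recovered = reconnect_with_deadline(flaky_connect, sleep=lambda seconds: None)
print(recovered, attempts["count"])
```

A single warning followed by normal operation corresponds to the early `return True` path: the first retry succeeded well within the time limit.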

I’d say everything should be working fine, given what you just said. Let us know if you find anything abnormal.
