Version 0.236.1 upgrade fails on Docker Compose

Hi n8n Team,

I've been upgrading this way for a long time, and it has always worked perfectly.
I was on version 0.234.1 before this upgrade to 0.236.1, but now I cannot access the system;
it always shows "n8n is starting up. Please wait".

This is my docker-compose.yml file; I don't know if I changed something by mistake.
Please help! Thanks!

docker-compose.yml

```yaml
version: "3"

services:
  traefik:
    image: "traefik"
    restart: always
    command:
      - "--api=true"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.mytlschallenge.acme.tlschallenge=true"
      - "--certificatesresolvers.mytlschallenge.acme.email=${SSL_EMAIL}"
      - "--certificatesresolvers.mytlschallenge.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ${DATA_FOLDER}/letsencrypt:/letsencrypt
      - /var/run/docker.sock:/var/run/docker.sock:ro

  n8n:
    image: n8nio/n8n:0.236.1
    restart: always
    ports:
      - "127.0.0.1:5678:5678"
    labels:
      - traefik.enable=true
      - traefik.http.routers.n8n.rule=Host(`${SUBDOMAIN}.${DOMAIN_NAME}`)
      - traefik.http.routers.n8n.tls=true
      - traefik.http.routers.n8n.entrypoints=web,websecure
      - traefik.http.routers.n8n.tls.certresolver=mytlschallenge
      - traefik.http.middlewares.n8n.headers.SSLRedirect=true
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=true
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=true
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=true
      - traefik.http.middlewares.n8n.headers.SSLHost=${DOMAIN_NAME}
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=true
      - traefik.http.middlewares.n8n.headers.STSPreload=true
      - traefik.http.routers.n8n.middlewares=n8n@docker
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER
      - N8N_BASIC_AUTH_PASSWORD
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - N8N_EMAIL_MODE=smtp
      - N8N_SMTP_HOST=smtp.gmail.com
      - N8N_SMTP_PORT=465
      - N8N_SMTP_USER=[email protected]
      - N8N_SMTP_PASS=REDACTED
      - N8N_SMTP_SENDER=[email protected]
      - N8N_SMTP_SSL=true
    volumes:
      - ${DATA_FOLDER}/.n8n:/home/node/.n8n
```

Hi @pcctw168 :wave: Sorry you’re running into this!

Thanks so much for sharing your Docker compose file - would you happen to have any logs you could share, as well? That might help us troubleshoot with you!

Thanks for your quick feedback.

I tried using `docker-compose logs` to get the logs, but the output is too long to read in full. It shows a lot of messages like this:

```
query is slow: INSERT INTO "TMP_execution_entity" SELECT * FROM "execution_entity" LIMIT 431710, 10
n8n_1 | execution time: 38131
```

I tried capturing a screenshot instead; is that useful? Or how can I give you more information?

Many Thanks!

Hi @pcctw168 - thanks for that!

I don’t think anything in particular is wrong here - the version you’re upgrading to has a mandatory database migration, and if you have a particularly large database, it’s going to take some time :see_no_evil: Taking a look at your screenshot, there’s some progress, for example:

```
...LIMIT 381170, 10
...LIMIT 381500, 10
```

The syntax in SQLite here is `SELECT columns FROM table LIMIT offset, count;`. You can see this in the screenshot: n8n is fetching 10 rows starting at row 381170 to migrate, and the next (slow) query fetches 10 rows starting at 381500, so there is progress.
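To make the batching concrete, here is a small self-contained sketch of copying rows in `LIMIT offset, count` batches. The two-column table layout is hypothetical (the real `execution_entity` has many more columns), and an in-memory database stands in for n8n's `database.sqlite`:

```python
import sqlite3

# In-memory stand-in for database.sqlite with a simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution_entity (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany("INSERT INTO execution_entity (data) VALUES (?)",
                 [(f"run-{i}",) for i in range(25)])
conn.execute("CREATE TABLE tmp_execution_entity (id INTEGER PRIMARY KEY, data TEXT)")

# Copy rows over in batches, mirroring the slow queries seen in the logs:
# SQLite's "LIMIT offset, count" form skips `offset` rows, then copies `count`.
batch, offset = 10, 0
while True:
    cur = conn.execute(
        "INSERT INTO tmp_execution_entity SELECT * FROM execution_entity LIMIT ?, ?",
        (offset, batch))
    if cur.rowcount == 0:  # no rows left to copy
        break
    offset += batch

copied = conn.execute("SELECT COUNT(*) FROM tmp_execution_entity").fetchone()[0]
print(copied)  # 25
```

With a few hundred thousand executions, each of these offset scans gets slower and slower, which is why the migration can run for hours.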

You’ll likely need to run this for quite a bit, but it’s working as intended!


I’m having a similar issue here. My logs have been stuck for the past hour, and I see almost constant 99% CPU usage on the host machine. Any help or update here would be super appreciated.
[Screenshot of the stalled docker logs, 2023-07-21]

I once again will call on @krynble to the rescue here :sweat_smile:

What @EmeraldHerald said is right: n8n is copying 10 executions at a time in order to optimize the execution table.

I wouldn’t recommend stopping the migration if it’s still running; it’s progressing, although slowly.

In case you have already stopped it, your data might be corrupted. If not, you can simply choose to empty the execution_entity table; that would greatly reduce the amount of data to be transferred and make your migration almost instant.
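A minimal sketch of that "empty the execution_entity table" step, shown against an in-memory stand-in with a simplified, assumed schema rather than a live `database.sqlite`. For real use: stop n8n first, back up the file, then point the connection at the actual database:

```python
import sqlite3

# Stand-in for the real file; for real use: sqlite3.connect("database.sqlite")
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution_entity (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany("INSERT INTO execution_entity (data) VALUES (?)", [("x",)] * 1000)

conn.execute("DELETE FROM execution_entity")  # drop all stored executions
conn.commit()
conn.execute("VACUUM")  # reclaim freed pages so the file actually shrinks on disk

remaining = conn.execute("SELECT COUNT(*) FROM execution_entity").fetchone()[0]
print(remaining)  # 0
```

Without the `VACUUM`, SQLite keeps the freed pages inside the file, so the database would still look just as large.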

I hope this helps!


I have exactly the same problem. Unfortunately I do not know which version I had before, but my database is 2.5 GB, and I get the following log:

```
root@Ubuntu-Server:~# docker-compose logs
Attaching to root_n8n_1, root_traefik_1
n8n_1     | License manager not initialized
n8n_1     | Last session crashed
n8n_1     | n8n ready on 0.0.0.0, port 5678
n8n_1     | Migrations in progress, please do NOT stop the process.
n8n_1     | Pruning was requested, but was not enabled
n8n_1     |
n8n_1     | Stopping n8n...
n8n_1     | License manager not initialized
n8n_1     | Last session crashed
n8n_1     | n8n ready on 0.0.0.0, port 5678
n8n_1     | Migrations in progress, please do NOT stop the process.
n8n_1     | Pruning was requested, but was not enabled
traefik_1 | time="2023-07-21T16:11:23Z" level=info msg="Configuration loaded from flags."
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="accept tcp [::]:443: use of closed network connection" entryPointName=websecure
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="accept tcp [::]:8080: use of closed network connection" entryPointName=traefik
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="close tcp [::]:8080: use of closed network connection" entryPointName=traefik
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="close tcp [::]:443: use of closed network connection" entryPointName=websecure
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="accept tcp [::]:80: use of closed network connection" entryPointName=web
traefik_1 | time="2023-07-21T16:20:19Z" level=error msg="close tcp [::]:80: use of closed network connection" entryPointName=web
traefik_1 | time="2023-07-21T16:27:09Z" level=info msg="Configuration loaded from flags."
```

I have stopped and restarted Docker several times, and my CPU usage is also at 100%. I find this very unusual, because it has never taken this long before. Should I just delete all the data in the execution_entity table? There are 322172 rows.

Because I have been unable to upgrade smoothly before, I re-ran the upgrade steps. The commands are as follows:

```bash
# Stop current setup
sudo docker-compose stop
# Delete it (will only delete the docker containers; data is stored separately)
sudo docker-compose rm
# Then start it again
sudo docker-compose up -d
```

It has been five days since I first posted this inquiry; should I give up?
If I am willing to abandon my previous workflows, is there any way to quickly restart the service?

@Markus08 If you can afford to delete the data in execution_entity, that would certainly make the migration run instantly. It's worth trying.

Lastly, if you don't mind losing executions, another possible approach is:

1. Stop n8n
2. Export workflows and credentials using the CLI commands as shown here
3. Rename your database.sqlite file to something else to serve as a backup
4. Run the import commands using the instructions in the link above; n8n will create a new database.sqlite file and import those workflows and credentials there
5. Start n8n

This is a way of continuing to use n8n without losing workflows and credentials.

Those steps should also apply to you @pcctw168

I downloaded the .sqlite file from the server and opened it in SQLite Studio on Windows. Then I used a query to delete all executions except the last 100. I had ChatGPT create the query, because I don't know how to write it properly myself. Then I ran `VACUUM;` to shrink the database, because despite the deleted execution entries it was still 2.5 GB. After shrinking, the database was only 5 MB.
After replacing the .sqlite file on the server with this one, I restarted Docker. But the server load was still at 100% and nothing happened. Then I deleted the last row of the migration table in the database, and the update worked immediately; my n8n installation was accessible through the web browser again, and all workflows and settings were still there.
I worked on this for almost 6 hours until I found the solution. I was very relieved that I had not lost my data.
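For anyone wanting to reproduce that cleanup, here is a sketch of a "keep only the newest 100 executions" query plus `VACUUM`, run against an in-memory stand-in with an assumed simplified schema rather than a real `database.sqlite` (always work on a copy, with n8n stopped). It assumes the rows with the highest `id` values are the newest, which holds for an autoincrementing primary key:

```python
import sqlite3

# Stand-in for database.sqlite; the real execution_entity has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution_entity (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany("INSERT INTO execution_entity (data) VALUES (?)",
                 [(f"run-{i}",) for i in range(500)])

# Delete everything except the 100 rows with the highest ids (the newest runs).
conn.execute("""
    DELETE FROM execution_entity
    WHERE id NOT IN (SELECT id FROM execution_entity ORDER BY id DESC LIMIT 100)
""")
conn.commit()
conn.execute("VACUUM")  # shrink the file after the mass delete

kept = conn.execute("SELECT COUNT(*) FROM execution_entity").fetchone()[0]
print(kept)  # 100
```

Note the `VACUUM` must run outside a transaction, hence the `commit()` first.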


Is it actually possible to configure n8n to keep only a maximum of 100 executions per workflow, so that everything older than the 100th entry is automatically deleted and the database is continuously shrunk?
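For reference, n8n does ship pruning controls as environment variables, though they prune by age rather than by a strict per-workflow count. A sketch of how the n8n service's `environment` section in the compose file above could be extended, assuming the documented `EXECUTIONS_DATA_PRUNE` and `EXECUTIONS_DATA_MAX_AGE` settings (values are examples):

```yaml
    environment:
      - EXECUTIONS_DATA_PRUNE=true   # enable automatic pruning of old executions
      - EXECUTIONS_DATA_MAX_AGE=168  # delete executions older than 168 hours (7 days)
      # Newer versions also offer a max-count style option; check the environment
      # variables documentation for the exact name in the version you run.
```

Pruning only keeps the table from growing; to shrink an already-large file you still need a `VACUUM` on the SQLite database.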

Around n8n updates and backups, there is still a lot of potential for improvement.
My suggestion would be to create another web environment next to the regular n8n installation, maybe in a separate Docker container. This environment would be responsible only for updates and backups and could be controlled entirely from the browser. It would monitor the update and constantly check whether the n8n installation is running correctly. Especially for critical updates, it would be advantageous if this watchdog application first created a backup and then performed the update. If something goes wrong, you would be informed in the browser and could cancel or restart the whole thing.
Not everyone is a programmer who understands how n8n and Linux work.
Another improvement would be to first perform an n8n update in a newly created Docker container, in parallel to the running installation. As soon as it completes successfully, the new container takes over from the old one, and the executions that occurred in the meantime are transferred to the new database, so that there is no gap in the execution history.
An n8n update that takes several hours is not acceptable; our n8n workflows must be available 24/7, and an update during which n8n is unavailable for that long is a very big problem.
What do you think about my suggestions?

To anyone this might help: I just let this run over the weekend per the advice of the n8n team, and it took over 48 hours, but the migration did finish and n8n was able to start fine. My Docker logs would go 5-6 hours between migration updates, so it really does seem like nothing is happening.


In the steps you outlined, the first step is to stop n8n. But once I stop n8n and then try to export the workflows and credentials, I get this error: Error response from daemon: Container b4092a1a275fd6577c0b46a7da5ce41f198bc18881f4790d6ea9bdce20cf0e8b is not running

Can you elaborate further on the right way of stopping n8n?

I’m a big noob on this - so apologies in advance if this sounds trivial :blush:

@Markus08 this is definitely a good idea; the auto-update mechanism certainly could be improved. For now, we strongly suggest backing up your database before migrating, as that is usually the sensitive part. We try our best to make upgrading as easy as possible by testing the upgrade process multiple times, but sometimes we need to make big, fundamental changes, and in those situations upgrading may be hard.

@kevintee What you are facing is actually expected: once you stop n8n, you won't be able to run commands against it. Have you mapped n8n's folder to your local machine? If so, you should be able to see the .n8n folder inside your user's home folder.

If so, you don't need Docker to run the commands; you can simply run `npx n8n@0.236.1 export:workflow`, and then the similar command to export credentials.

I hope the above steps help.


@kevintee also if you haven’t mapped the folder to your local filesystem, you can copy the file from the stopped container as explained here.

Basically: docker cp <container name>:/home/node/.n8n/database.sqlite ~/database.sqlite - this should work, copying the file to your home folder.


This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.