I broke my live install. Can someone offer advice on getting it running again?

Hi all,

I am in a dire situation.

I was trying to run my VPS a bit more efficiently and add a Hugo website to the same Docker Compose / Traefik setup, so both could run on the same server.

Got it working, but then I edited something wrong and, boom, n8n went down.

Long story short, I ran a docker-compose prune -a and, like an idiot, pressed ENTER.

I re-pulled the latest image (as I always have) and it failed, repeatedly giving an error that it could not extract the TAR file, so I pinned a specific version instead. I picked image: n8nio/n8n:1.81.0, more or less at random.

I then went and re-pulled images, and built, and got this kind of error:

docker-compose up --build
WARNING: Found orphan containers (n8n_hugo_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Removing n8n_traefik_1 ... done
Removing n8n_n8n_1     ... done
Network traefik is external, skipping
WARNING: Found orphan containers (n8n_hugo_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Creating n8n_traefik_1 ... done
Creating n8n_n8n_1     ... done
Attaching to n8n_n8n_1, n8n_traefik_1
n8n_1      | Invalid timestamp value for N8N_RELEASE_DATE: $(date -u +"%Y-%m-%dT%H:%M:%SZ")
n8n_1      | Permissions 0644 for n8n settings file /home/node/.n8n/config are too wide. This is ignored for now, but in the future n8n will attempt to change the permissions automatically. To automatically enforce correct permissions now set N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true (recommended), or turn this check off set N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=false.
n8n_1      | User settings loaded from: /home/node/.n8n/config
n8n_1      | Initializing n8n process
n8n_1      | n8n ready on 0.0.0.0, port 5678
n8n_1      | 
n8n_1      | There is a deprecation related to your environment variables. Please take the recommended actions to update your configuration:
n8n_1      |  - N8N_RUNNERS_ENABLED -> Running n8n without task runners is deprecated. Task runners will be turned on by default in a future version. Please set `N8N_RUNNERS_ENABLED=true` to enable task runners now and avoid potential issues in the future. Learn more: https://docs.n8n.io/hosting/configuration/task-runners/
n8n_1      | 
n8n_1      | Version: 1.81.0
n8n_1      |  ================================
n8n_1      |    Start Active Workflows:
n8n_1      |  ================================
n8n_1      | SQLITE_ERROR: no such column: SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations__SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations_user.role
n8n_1      | SQLITE_ERROR: no such column: SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations__SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations_user.role
n8n_1      |      => ERROR: Workflow "Workflow Name Redacted for Privacy" (ID: 1) could not be activated on first try, keep on trying if not an auth issue
n8n_1      |                SQLITE_ERROR: no such column: SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations__SharedCredentials__SharedCredentials_project__SharedCredentials__SharedCredentials_project_projectRelations_user.role
n8n_1      | Issue on initial workflow activation try of "Workflow Name Redacted for Privacy" (ID: 1) (startup)

and on and on it goes.

I try to log in and it says

Problem logging in SQLITE_ERROR: no such column: User.role

I had an OMG moment. I deleted my database!

So I had a hunt through the system, and found these files:

└── .n8n
    ├── binaryData
    │   ├── meta
    │   └── persistMeta
    ├── config
    ├── database.sqlite
    ├── git
    │   └── .git
    │       ├── branches
    │       ├── config
    │       ├── description
    │       ├── HEAD
    │       ├── hooks
    │       │   ├── applypatch-msg.sample
    │       │   ├── commit-msg.sample
    │       │   ├── post-update.sample
    │       │   ├── pre-applypatch.sample
    │       │   ├── pre-commit.sample
    │       │   ├── pre-merge-commit.sample
    │       │   ├── prepare-commit-msg.sample
    │       │   ├── pre-push.sample
    │       │   ├── pre-rebase.sample
    │       │   ├── pre-receive.sample
    │       │   ├── push-to-checkout.sample
    │       │   └── update.sample
    │       ├── info
    │       │   └── exclude
    │       ├── objects
    │       │   ├── info
    │       │   └── pack
    │       └── refs
    │           ├── heads
    │           └── tags
    ├── n8nEventLog-1.log
    ├── n8nEventLog-2.log
    ├── n8nEventLog-3.log
    ├── n8nEventLog.log
    ├── nodes
    │   └── package.json
    └── ssh

The SQLite database is about 100 MB, which is a good sign.

The files are all located at /home/username-redacted/n8n-local-files/.n8n

So, I figure, I need to point n8n at that location, to load up the data.

I change my docker-compose file to this:

version: "3"

services:
  traefik:
    image: "traefik"
    restart: always
    command:
      - "--api=true"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.web.http.redirections.entryPoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.mytlschallenge.acme.tlschallenge=true"
      - "--certificatesresolvers.mytlschallenge.acme.email=${SSL_EMAIL}"
      - "--certificatesresolvers.mytlschallenge.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ${DATA_FOLDER}/letsencrypt:/letsencrypt
      - /var/run/docker.sock:/var/run/docker.sock:ro

  n8n:
    image: n8nio/n8n:1.81.0
    restart: always
    ports:
      - "127.0.0.1:5678:5678"
    labels:
      - traefik.enable=true
      - traefik.http.routers.n8n.rule=Host(`${SUBDOMAIN}.${DOMAIN_NAME}`)
      - traefik.http.routers.n8n.tls=true
      - traefik.http.routers.n8n.entrypoints=web,websecure
      - traefik.http.routers.n8n.tls.certresolver=mytlschallenge
      - traefik.http.middlewares.n8n.headers.SSLRedirect=true
      - traefik.http.middlewares.n8n.headers.STSSeconds=315360000
      - traefik.http.middlewares.n8n.headers.browserXSSFilter=true
      - traefik.http.middlewares.n8n.headers.contentTypeNosniff=true
      - traefik.http.middlewares.n8n.headers.forceSTSHeader=true
      - traefik.http.middlewares.n8n.headers.SSLHost=${DOMAIN_NAME}
      - traefik.http.middlewares.n8n.headers.STSIncludeSubdomains=true
      - traefik.http.middlewares.n8n.headers.STSPreload=true
      - traefik.http.routers.n8n.middlewares=n8n@docker
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=redacted
      - N8N_BASIC_AUTH_PASSWORD=redacted
      - N8N_HOST=${SUBDOMAIN}.${DOMAIN_NAME}
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://${SUBDOMAIN}.${DOMAIN_NAME}/
      - GENERIC_TIMEZONE=${GENERIC_TIMEZONE}
      - DB_MIGRATIONS_RUN_ON_STARTUP=true  
    volumes:
      - /home/username-redacted/n8n-local-files/.n8n:/home/node/.n8n
      - /home/username-redacted/n8n-local-files:/files

networks:
  default:
    external:
      name: traefik

You can see I have added - DB_MIGRATIONS_RUN_ON_STARTUP=true, as I read that might help. However, no luck.

Now what should I do?

  • I DO have my workflows backed up as .json files.
  • I don’t have any of my authentications backed up.
  • I don’t know what version my n8n was, so not sure if those json backups will even work.

I’m stumped, and, I’m an idiot for doing that.

What would you all recommend?

Thanks so much, any feedback greatly appreciated.

(running debian, docker-compose)

Thank you so SO much for your answer. Be still my heart!

OK I have the following output:

root@server:~# grep "Version:" /home/redacted-user/n8n-local-files/.n8n/n8nEventLog.log | tail -5

root@server:~# grep "Version:" /home/redacted-user/n8n-local-files/.n8n/n8nEventLog.log

root@server:~# cd /home/redacted-user/n8n-local-files/.n8n/

root@server:/home/redacted-user/n8n-local-files/.n8n# ls

binaryData  database.sqlite  n8nEventLog-1.log	n8nEventLog-3.log  nodes

config	    git		     n8nEventLog-2.log	n8nEventLog.log    ssh

root@server:/home/redacted-user/n8n-local-files/.n8n# cat n8nEventLog.log 

root@server:/home/redacted-user/n8n-local-files/.n8n# cat n8nEventLog-1.log 

root@server:/home/redacted-user/n8n-local-files/.n8n# cat n8nEventLog-2.log 

root@server:/home/redacted-user/n8n-local-files/.n8n# cat n8nEventLog-3.log 

So as you can see, there is nothing in the log files to figure out what version I was on.

I noted your post is gone. Maybe you’re adjusting?

Further, I tried the latest update, and I’m sort of back to where I started:

# docker-compose down
WARNING: Found orphan containers (n8n_hugo_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Removing n8n_n8n_1     ... done
Removing n8n_traefik_1 ... done
Network traefik is external, skipping
root@server:~/n8n# docker-compose up --build
WARNING: Found orphan containers (n8n_hugo_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Pulling n8n (n8nio/n8n:latest)...
latest: Pulling from n8nio/n8n
22b37da5853a: Extracting [==================================================>]  1.882kB/1.882kB
cb0e99e1c627: Download complete
d1520cba0de0: Download complete
81d49b7a1dbb: Download complete
eb7d9231e3ec: Download complete
4f4fb700ef54: Download complete
c309291cc0d9: Download complete
3be0550b41f6: Download complete
ERROR: failed to register layer: Error processing tar file(exit status 1): archive/tar: invalid tar header

I don’t know what’s up with that tar header issue, but that’s where I went on a long mission last time (I should have asked here).

Further update: I updated Docker, and the tar issue is resolved.

n8n loaded! Amazing! I could log in, but the terminal shows all my workflows failing to activate:

n8n_1      | Try to activate workflow "Redacted Workflow webhook" (9)
n8n_1      | Unrecognized node type: n8n-nodes-base.start
n8n_1      | Activation of workflow "Redacted Workflow webhook" (9) did fail with error: "Unrecognized node type: n8n-nodes-base.start" | retry in 512 seconds
n8n_1      | Try to activate workflow "Another redacted Workflow Name" (10)
n8n_1      | Unrecognized node type: n8n-nodes-base.start
n8n_1      | Activation of workflow "Another redacted Workflow Name" (10) did fail with error: "Unrecognized node type: n8n-nodes-base.start" | retry in 512 seconds

etc
etc

Hi @privateuserguy,

My suggestions:
First thing: figure out what version you were actually on. Run this from your server, or just cat the log files. n8n logs the version at startup, so the answer is in those event logs sitting right there in your .n8n directory.

grep -i "version" ~/<your-n8n-directory>/n8nEventLog.log

Once you have the version number, pull that exact image:

docker pull n8nio/n8n:<your-version>

Update your compose file to match that version and docker-compose up -d.

If you hit the same TAR extraction error again, try restarting the Docker daemon first, then pull again. Since you already pruned everything earlier, disk space and cached layers shouldn't be the issue.

Let me know :crossed_fingers:

Thank you so much.

Long story short I have it all working.

I had to upgrade all my old workflows too, so they use new, non-deprecated nodes, and add
```
- N8N_RUNNERS_ENABLED=true
- N8N_RUNNERS_MODE=external
- N8N_NATIVE_PYTHON_RUNNER=true
```

etc etc to my config. But, got there in the end! Thanks again, and I will close this now. What a monster day!
