Setting up task runners in external mode with AWS ECS

Describe the problem/error/question

I’m trying to set up task runners in external mode in AWS ECS.

If I set N8N_RUNNERS_MODE=external, the Code nodes time out.

If I set N8N_RUNNERS_MODE=internal, the Code nodes execute successfully, but I want external mode.

Here is my ECS task definition:

{
  "family": "n8n-worker-dev",
  "networkMode": "awsvpc",
  "taskRoleArn": "...",
  "executionRoleArn": "...",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "2048",
  "memory": "4096",
  "containerDefinitions": [
    {
      "name": "n8n-worker",
      "image": "docker.n8n.io/n8nio/n8n:1.122.2",
      "command": ["worker"],
      "portMappings": [
        {
          "containerPort": 5678,
          "hostPort": 5678,
          "protocol": "tcp"
        }
      ],
      "environment": [
        { "name": "N8N_ENCRYPTION_KEY", "value": "..." },
        { "name": "N8N_CONCURRENCY_PRODUCTION_LIMIT", "value": "200" },
        { "name": "N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS", "value": "true" },

        { "name": "N8N_RUNNERS_ENABLED", "value": "true" },
        { "name": "N8N_RUNNERS_MODE", "value": "external" },
        { "name": "N8N_RUNNERS_AUTH_TOKEN", "value": "..." },
        { "name": "N8N_RUNNERS_BROKER_LISTEN_ADDRESS", "value": "0.0.0.0" },
        { "name": "N8N_NATIVE_PYTHON_RUNNER", "value": "false" },

        { "name": "OFFLOAD_MANUAL_EXECUTIONS_TO_WORKERS", "value": "true" },
        
        { "name": "DB_TYPE", "value": "postgresdb" },
        { "name": "DB_POSTGRESDB_DATABASE", "value": "n8n" },
        { "name": "DB_POSTGRESDB_HOST", "value": "..." },
        { "name": "DB_POSTGRESDB_PORT", "value": "5432" },
        { "name": "DB_POSTGRESDB_USER", "value": "..." },
        { "name": "DB_POSTGRESDB_PASSWORD", "value": "..." },
        { "name": "DB_POSTGRESDB_SCHEMA", "value": "public" },
        { "name": "DB_POSTGRESDB_SSL_ENABLED", "value": "true" },
        { "name": "DB_POSTGRESDB_SSL_REJECT_UNAUTHORIZED", "value": "false" },

        { "name": "QUEUE_BULL_REDIS_HOST", "value": "..." },
        { "name": "QUEUE_BULL_REDIS_PORT", "value": "..." },
        { "name": "QUEUE_BULL_REDIS_TLS", "value": "..." },

        { "name": "GENERIC_TIMEZONE", "value": "America/Los_Angeles" },

        { "name": "N8N_LOG_LEVEL", "value": "debug" }
      ],
      "healthCheck": {
        "command": [ "CMD-SHELL", "wget -qO- http://localhost:5678/healthz || exit 1" ],
        "interval": 5,
        "timeout": 3,
        "retries": 2
      },
      "essential": true,
      "user": "root",
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-create-group": "true",
          "awslogs-group": "/ecs/n8n-workers",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
          "awslogs-multiline-pattern": "^\\S.*"
        }
      }
    },
    {
      "name": "n8n-code-runners",
      "image": "192984850590.dkr.ecr.us-east-1.amazonaws.com/n8n-runners-custom:latest-dev",
      "essential": true,
      "dependsOn": [
        {
          "containerName": "n8n-worker",
          "condition": "HEALTHY"
        }
      ],
      "environment": [
        { "name": "N8N_RUNNERS_TASK_BROKER_URI", "value": "http://n8n-worker:5679" },
        { "name": "N8N_RUNNERS_AUTH_TOKEN", "value": "..." },
        { "name": "GENERIC_TIMEZONE", "value": "America/Los_Angeles" },

        { "name": "N8N_LOG_LEVEL", "value": "debug" }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-create-group": "true",
          "awslogs-group": "/ecs/n8n-code-runners",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs",
          "awslogs-multiline-pattern": "^\\S.*"
        }
      }
    }
  ]
}

What is the error message (if any)?

In the UI I get this message:

Task request timed out after 60 seconds
Your Code node task was not matched to a runner within the timeout period. This indicates that the task runner is currently down, or not ready, or at capacity, so it cannot service your task. If you are repeatedly executing Code nodes with long-running tasks across your instance, please space them apart to give the runner time to catch up. If this does not describe your use case, please open a GitHub issue or reach out to support. If needed, you can increase the timeout using the N8N_RUNNERS_TASK_REQUEST_TIMEOUT environment variable.

In the runner logs, I have only this:

Starting launcher's health check server at port 5680
[launcher:py] Starting launcher goroutine...
[launcher:py] Waiting for task broker to be ready...
[launcher:js] Starting launcher goroutine...
[launcher:js] Waiting for task broker to be ready...

In the worker node I have this:

n8n Task Broker ready on 0.0.0.0, port 5679
n8n worker is now ready 
n8n worker server listening on port 5678
Worker started execution 3 (job 3) 
Worker started execution 3 (job 3) {"scopes":["scaling"],"executionId":"3","workflowId":"bKHtVsTBU2PWyOvu","jobId":"3","file":"job-processor.js","function":"processJob"}
Execution ID 3 is a partial execution. {"executionId":"3","file":"manual-execution.service.js","function":"runManually"}
Workflow execution started {"workflowId":"bKHtVsTBU2PWyOvu","file":"logger-proxy.js","function":"exports.debug"}
Executing hook (hookFunctionsPush) {"executionId":"3","pushRef":"difxefr5ig","workflowId":"bKHtVsTBU2PWyOvu","file":"execution-lifecycle-hooks.js"}
Start executing node "Code in JavaScript" {"node":"Code in JavaScript","workflowId":"bKHtVsTBU2PWyOvu","file":"logger-proxy.js","function":"exports.debug"}
Executing hook on node "Code in JavaScript" (hookFunctionsPush) {"executionId":"3","pushRef":"difxefr5ig","workflowId":"bKHtVsTBU2PWyOvu","file":"execution-lifecycle-hooks.js"}
Running node "Code in JavaScript" started {"node":"Code in JavaScript","workflowId":"bKHtVsTBU2PWyOvu","file":"logger-proxy.js","function":"exports.debug"}
Published pubsub msg: relay-execution-lifecycle-event (executionStarted) {"scopes":["scaling","pubsub"],"msg":"relay-execution-lifecycle-event","channel":"n8n.commands","type":"executionStarted","executionId":"3","file":"publisher.service.js","function":"publishCommand"}
Published pubsub msg: relay-execution-lifecycle-event (nodeExecuteBefore) {"scopes":["scaling","pubsub"],"msg":"relay-execution-lifecycle-event","channel":"n8n.commands","type":"nodeExecuteBefore","executionId":"3","file":"publisher.service.js","function":"publishCommand"}
Task request timed out
Error: Task request timed out
    at ErrorReporter.wrap (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/errors/error-reporter.ts:242:37)
    at ErrorReporter.error (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/n8n-core@file+packages+core_@[email protected]_@[email protected]_08b575bec2313d5d8a4cc75358971443/node_modules/n8n-core/src/errors/error-reporter.ts:228:25)
    at LocalTaskRequester.requestExpired (/usr/local/lib/node_modules/n8n/src/task-runners/task-managers/task-requester.ts:296:22)
    at LocalTaskRequester.onMessage (/usr/local/lib/node_modules/n8n/src/task-runners/task-managers/task-requester.ts:259:10)
    at TaskBroker.handleRequestTimeout (/usr/local/lib/node_modules/n8n/src/task-runners/task-broker/task-broker.service.ts:115:50)
    at Timeout.<anonymous> (/usr/local/lib/node_modules/n8n/src/task-runners/task-broker/task-broker.service.ts:102:9)
    at listOnTimeout (node:internal/timers:588:17)
    at processTimers (node:internal/timers:523:7)
 
{
    "file": "error-reporter.js",
    "function": "defaultReport"
}

Information on your n8n setup

  • n8n version: 1.122.2
  • Database (default: SQLite): postgresdb
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): AWS ECS
  • Operating system:

Hello @Corneliu_Lungociu, welcome!

Does this image:

"image": "192984850590.dkr.ecr.us-east-1.amazonaws.com/n8n-runners-custom:latest-dev",

match the n8n version of this one?

"image": "docker.n8n.io/n8nio/n8n:1.122.2",

One more thing: please try changing the URI to:

{ "name": "N8N_RUNNERS_TASK_BROKER_URI", "value": "http://n8n:5679" },

Thank you @mohamed3nan,

I tried changing the URI as you suggested, and I get the same behaviour.

Yes, the n8n-runners-custom image is built on n8nio/runners:1.122.2.

I also tried running n8nio/runners:1.122.2 directly, without any customisations, and I still get the timeout.

Here is the Dockerfile if it helps:

FROM n8nio/runners:1.122.2
USER root
RUN cd /opt/runners/task-runner-python && uv pip install numpy thefuzz us requests
COPY n8n-task-runners.json /etc/n8n-task-runners.json
USER runner

and the n8n-task-runners.json is copied from docker/images/runners/n8n-task-runners.json on the master branch of the n8n-io/n8n GitHub repository,

with this change only on the Python runner:

"N8N_RUNNERS_STDLIB_ALLOW": "json",
"N8N_RUNNERS_EXTERNAL_ALLOW": "numpy,thefuzz,us,requests",

Thanks @Corneliu_Lungociu

Can you try localhost:

{ "name": "N8N_RUNNERS_TASK_BROKER_URI", "value": "http://localhost:5679" },

and use the n8nio/runners:1.122.2 image for now; once the connection works, switch back to the custom one.
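
The runner log stops at "Waiting for task broker to be ready...", which suggests the launcher cannot reach the address in N8N_RUNNERS_TASK_BROKER_URI. One way to confirm that from inside the runner container is a plain TCP check. This is only a sketch: the helper name broker_reachable is made up for illustration, it assumes a Python interpreter is available in the container, and it only tests that a connection opens, not that the broker accepts the auth token.

```python
import socket


def broker_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of n8n.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers DNS resolution failures, refused connections and timeouts.
        return False


# Example usage inside the runner container (5679 is the broker port from the worker log):
#   broker_reachable("n8n-worker", 5679)
#   broker_reachable("localhost", 5679)
```

If the first call returns False and the second returns True, the hostname in the URI is the problem rather than the runner image.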


Hey @Corneliu_Lungociu!

I would try ‘localhost:5679’, as @mohamed3nan already said, or ‘127.0.0.1:5679’ as in the docs, for the value of N8N_RUNNERS_TASK_BROKER_URI.

Is n8n-worker:5679 a valid URL?

Cheers!


Thank you @mohamed3nan and @Parintele_Damaskin, using localhost fixed the problem.

The custom runner image now also works fine with the extra packages.
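
For anyone hitting the same timeout: as far as I understand it, containers in the same ECS task with networkMode awsvpc share one network namespace, so a sidecar reaches the broker over localhost, while the container name (n8n-worker here) is not resolvable as a hostname. A rough Python sketch that flags this in a task definition before deploying is shown here. The function name and the loopback list are assumptions made for illustration, and the check only makes sense when the runner is a sidecar in the same task as the broker.

```python
from urllib.parse import urlparse

# Hosts treated as "same task" addresses; an assumption for this sketch.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def find_non_loopback_broker_uris(task_definition: dict) -> list[tuple[str, str]]:
    """Return (container name, URI) pairs whose broker URI does not point at loopback.

    Hypothetical helper; only awsvpc task definitions are inspected.
    """
    if task_definition.get("networkMode") != "awsvpc":
        return []

    findings = []
    for container in task_definition.get("containerDefinitions", []):
        for env in container.get("environment", []):
            if env.get("name") != "N8N_RUNNERS_TASK_BROKER_URI":
                continue
            value = env.get("value", "")
            # urlparse(...).hostname is None when the value has no scheme or host.
            if urlparse(value).hostname not in LOOPBACK_HOSTS:
                findings.append((container.get("name", ""), value))
    return findings
```

Run against the task definition from the first post (loaded with json.load), this would report the n8n-code-runners container with http://n8n-worker:5679, and report nothing once the value is http://localhost:5679.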
