Scaling and performance

Our system was developed to automate the evaluation of handwritten exams submitted by users through a custom-built dashboard.

  1. Upload and Text Extraction Stage

The user accesses the dashboard and uploads 9 handwritten PDF pages.

This submission triggers a webhook in n8n, which starts the main workflow.

The main workflow then launches 9 parallel subworkflows, each responsible for processing one PDF page.

Each subworkflow performs handwritten text extraction using AI models (Anthropic and Gemini).
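To see why this fan-out matters for scaling, the step above behaves like launching one async task per page, all at once. This is only an illustrative sketch in plain JavaScript, not n8n's actual subworkflow API; `processSubmission` and `extractPage` are hypothetical names:

```javascript
// Illustrative sketch: the main workflow fans out one subworkflow per page.
// With 500 simultaneous users this is 500 * 9 = 4,500 concurrent extraction
// jobs, before the analysis stage even starts.
async function processSubmission(pages, extractPage) {
  // one subworkflow per page, all started at the same time
  return Promise.all(pages.map(page => extractPage(page)));
}

// Example: a 9-page submission with a stubbed extractor.
const pages = Array.from({ length: 9 }, (_, i) => `page-${i + 1}`);
processSubmission(pages, async p => `text of ${p}`)
  .then(results => console.log(results.length)); // 9
```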

  2. User Validation Stage


After extraction, the transcribed text is displayed on the dashboard for user review.

Once the user approves the transcriptions, a new webhook is triggered in n8n.

  3. Automated Analysis Stage

This second webhook activates 5 analysis subworkflows, where 6 AI models process the extracted data stored in a database.

One of these subworkflows performs a more complex analysis, with an average runtime of 8–20 minutes (20 minutes being rare).

The other four workflows have an average runtime between 3 and 7 minutes.

  4. Current Performance

The system operates normally with up to 5 simultaneous users.

However, during stress testing with 500 simultaneous executions, the environment freezes and stops responding.

  5. Usage Projection

We estimate 3,000 to 4,000 total users per event (48-hour window).

The full automation is only active for 48 hours every 4 months.

We are currently on the Pro plan. I think we need to move to the Enterprise plan, but I'm not sure it will solve this. Can someone help?

Hi @Alqemika_Automacao

Your system is freezing because 500 simultaneous executions each spawn multiple long-running subworkflows at the same time (9 for extraction, then 5 for analysis), which overloads your Pro plan’s resources.

Will the Enterprise Plan Fix It?

Yes, it’s designed for this. The Enterprise plan gives you dedicated, auto-scaling resources that can handle these massive spikes in usage.

However, for best results, you should also improve your workflow’s architecture.

The Best Solution: Implement a Queue

Instead of triggering all your AI subworkflows at once, use a queue to process them in a controlled way.

  1. Webhook → Writes to a Queue: Your webhook should not run the big workflows. Its only job should be to instantly add a “job” to a database table (or even Airtable) with a “pending” status. This is extremely fast and can absorb thousands of requests without timing out.

  2. Scheduled Workflow → Processes the Queue: Create a second workflow that runs every minute. It will:

  • Fetch a small, manageable batch of “pending” jobs from your table (e.g., 10 at a time).

  • Run your AI processing for only that small batch.

  • Update their status to “completed.”
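The two workflows above can be sketched in plain JavaScript (the language n8n Code nodes use). This is a minimal in-memory sketch of the pattern, not production code: `enqueue`, `processBatch`, and the `queue` array are illustrative names, and in practice the queue would live in your database or Airtable rather than in memory:

```javascript
// In-memory stand-in for the "jobs" table described above.
const queue = [];

// Webhook handler: only records the job and returns immediately.
function enqueue(userId, pages) {
  const job = { id: queue.length + 1, userId, pages, status: "pending" };
  queue.push(job);
  return job.id; // respond to the webhook right away
}

// Scheduled workflow (e.g. every minute): take a small batch and process it.
function processBatch(batchSize = 10) {
  const batch = queue.filter(j => j.status === "pending").slice(0, batchSize);
  for (const job of batch) {
    // ...run the AI extraction/analysis subworkflows for this job here...
    job.status = "completed";
  }
  return batch.length; // how many jobs this tick handled
}

// Simulate 25 webhook hits followed by three scheduled ticks.
for (let i = 0; i < 25; i++) enqueue(`user-${i}`, 9);
console.log(processBatch()); // 10
console.log(processBatch()); // 10
console.log(processBatch()); // 5
```

The key design point is that the webhook path does no heavy work at all, so it stays fast under load; the scheduled worker decides the pace.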

Why This Works:

  • No More Freezing: You control the load. The system processes a steady stream of jobs instead of a massive, crashing wave.

  • Highly Scalable: This architecture can handle thousands of users because the queue simply gets longer, but your server remains stable.
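A quick back-of-envelope check, using the numbers from this thread (4,000 users over a 48-hour event) and treating one user submission as one queued job. These are assumed inputs, and real traffic will be burstier than a flat average, but bursts are exactly what the queue absorbs:

```javascript
// Does a 10-jobs-per-minute worker keep up with 4,000 users in 48 hours?
const usersPerEvent = 4000;               // upper estimate from the thread
const eventHours = 48;
const arrivalsPerHour = usersPerEvent / eventHours; // ≈ 83.3 submissions/hour on average

const batchSize = 10;                     // jobs per scheduled tick
const ticksPerHour = 60;                  // the worker runs every minute
const throughputPerHour = batchSize * ticksPerHour; // 600 jobs/hour

console.log(throughputPerHour > arrivalsPerHour); // true: the queue drains faster than it fills
```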

Action Plan:

  1. Contact n8n Sales: Discuss moving to the Enterprise plan for the necessary resources.

  2. Implement a Queue: Before your next event, rebuild your workflow to use this queueing model.

Combining the Enterprise plan with a queue is the standard, most reliable way to handle this level of traffic.

If you found this helpful, please mark it as the solution and give it a like.

