Help Needed: Building an Automation Workflow with Complex File Handling

Hi everyone :waving_hand:,

I’m working on a complex automation use case using n8n, and I’ve hit a point where I need help from more experienced builders to take it further, especially in handling files, binary data, and integrating with AI.

:white_check_mark: What I’m Trying to Build:

An AI-powered framework that receives files of different types (PDF, DOCX, XLS, PNG, JPEG, …) through the built-in n8n form and automatically prepares professional case reports based on the uploaded documents and assignments. The workflow follows multiple structured phases:

  1. Receive files
  2. Run data extraction tasks

:outbox_tray: Current Inputs:

Through an n8n form or webhook, I receive JSON like this:

I also push metadata about the files for the AI agent — like role (plaintiff/defendant), extraction type (OCR vs summarisation), etc.

:brick: What I Have Working:

  • :white_check_mark: Form-based data collection and file intake
  • :white_check_mark: JSON cleanup and metadata normalisation using Code node
  • :white_check_mark: AI-ready file metadata for extraction tasks (e.g., OCR)

:red_circle: Where I Need Help:

  • :red_question_mark: How to properly manage and reuse binary data across nodes (esp. splitting and merging)
  • :red_question_mark: What’s the best pattern for processing multiple files per input row using binary data?
  • :red_question_mark: Has anyone implemented file-type-dependent flows (e.g., PDF → Text AI, JPEG → OCR) reliably?
  • :red_question_mark: Ideas for gracefully handling large or many files in a single submission
  • :red_question_mark: Any AI integration best practices from the n8n side (OpenAI vision, Claude, etc.)
  • :red_question_mark: Should I offload heavy file processing (AI or storage) outside n8n?

:brain: Notes:

I’m using:

  • Function & Code nodes to handle logic
  • An AI agent via GPT/OpenAI (planned)

:speech_balloon: Any advice, example flows, or help would be appreciated.

If someone’s open to collaborating or building together — I’m open to that too :folded_hands:!

Thanks in advance!

Have you considered using a file storage system like Amazon S3, then using n8n to extract the contents of each file and send only those contents through the workflow?
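A minimal sketch of that idea, assuming a plain dictionary as a stand-in for the storage bucket (a real setup would call S3 or similar instead). The function name `store_and_reference` and the `uploads/...` key layout are made up for illustration. The point is that the workflow item carries only a small reference, never the bytes themselves.

```python
import hashlib


def store_and_reference(store: dict, filename: str, data: bytes) -> dict:
    """Save the file bytes in `store` and return a lightweight reference.

    `store` is a stand-in for external storage (assumption: a dict here,
    an object store such as S3 in practice).
    """
    digest = hashlib.sha256(data).hexdigest()
    # Content hash in the key keeps same-named files from overwriting each other.
    key = f"uploads/{digest}/{filename}"
    store[key] = data
    # Only metadata travels through the workflow; the binary stays in storage.
    return {"key": key, "filename": filename, "size": len(data), "sha256": digest}
```

Downstream steps can then fetch the bytes by `key` only when they actually need them, for example right before an OCR or text-extraction call.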

Consider using file storage for these files, as the form might become an issue at scale.

I have built a relatively big OCR flow that handles 3000+ images a week. I’ve gotten quite good with Google Vertex.

Let me know if you’d want to discuss that too.

@RBPTG Thank you,
I am now going to explore it.

@Lwanda_Mhlongo Thank you for your suggestion. I have no experience with Google Vertex. Does it require me to shift from using n8n?

No, you can do it through n8n. You feed it a JPEG and it returns the text, or whatever else it read in the image.
