Is this use case possible with n8n?

Describe the problem/error/question

This is my Day 1 on n8n. I want to check with the community whether the use case I would like to solve with n8n is even feasible, before I spend too much time on it.

So, I have a requirement to perform a sanity check on data files that get dropped into various storage locations, like Azure Blob or local storage accessible by UNC path. I would like to check those files for dupes, nulls, bad characters, etc. before they enter the actual data processing pipelines (where bad data would cause failures). The file structure info is stored in a SQL table, which needs to be read in order to validate each file.

So, I’m looking for

  1. ability to monitor storage for new files
  2. read the file and identify bad data
  3. move the file to some failedfiles folder

Could this be very basic?
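For the local-storage case, steps 2 and 3 above can be sketched in plain Python. This is a minimal sketch, not a finished solution: the check rules (duplicate rows, empty fields, control/replacement characters) and the `failedfiles` folder name are assumptions based on the description above.

```python
import csv
import shutil
from pathlib import Path


def find_bad_rows(path, delimiter=","):
    """Step 2: return (line_no, reason) pairs for rows failing basic checks."""
    problems = []
    seen = set()
    with open(path, newline="", encoding="utf-8", errors="replace") as f:
        for line_no, row in enumerate(csv.reader(f, delimiter=delimiter), start=1):
            key = tuple(row)
            if key in seen:
                problems.append((line_no, "duplicate row"))
            seen.add(key)
            if any(cell.strip() == "" for cell in row):
                problems.append((line_no, "empty/null field"))
            # "bad characters" is assumed here to mean NUL bytes or
            # Unicode replacement characters left by a bad encoding
            if any("\x00" in cell or "\ufffd" in cell for cell in row):
                problems.append((line_no, "bad character"))
    return problems


def quarantine(path, failed_dir):
    """Step 3: move a bad file into the failed-files folder."""
    failed_dir = Path(failed_dir)
    failed_dir.mkdir(parents=True, exist_ok=True)
    return shutil.move(str(path), str(failed_dir / Path(path).name))
```

Step 1 (monitoring) would sit outside this code, e.g. an n8n schedule trigger that lists the folder and passes new paths in.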


1+3. possible, but depending on which storage you are planning to monitor, it can be easy or not
2. very much depends on the file sizes, the definition of bad data, and how computationally expensive this is going to be.


Thank you @jabbson and @Bringasher. I'm on it then. :slight_smile:

File sizes are around 5 GB maximum, typically comma/tab/pipe separated. Depending on the landing zone, a file may have a different column structure and constraints (e.g. a column cannot be null), so I will need to read the definition from a SQL Server table based on the file's landing zone.
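Validating a 5 GB delimited file without loading it all into memory could look roughly like this with pandas' chunked reader. This is a sketch under assumptions: the `RULES` dict stands in for the per-zone definitions that would actually come back from SQL Server (the `dbo.FileDefinitions` query in the comment is illustrative, not a real table name), and the duplicate check here is per-chunk only.

```python
import pandas as pd

# Illustrative per-landing-zone rules, shaped like the rows a SQL Server
# lookup might return, e.g. (assumed schema):
#   SELECT column_name, nullable FROM dbo.FileDefinitions WHERE zone = ?
RULES = {
    "zone_a": {
        "delimiter": "|",
        "columns": ["id", "name", "amount"],
        "not_null": ["id", "amount"],
    },
}


def validate_file(path, zone, chunksize=100_000):
    """Stream a large delimited file in chunks and collect rule violations."""
    rules = RULES[zone]
    errors = []
    for chunk in pd.read_csv(path, sep=rules["delimiter"],
                             chunksize=chunksize, dtype=str):
        missing = set(rules["columns"]) - set(chunk.columns)
        if missing:
            errors.append(f"missing columns: {sorted(missing)}")
            break
        for col in rules["not_null"]:
            nulls = chunk[col].isna().sum()
            if nulls:
                errors.append(f"{col}: {nulls} null value(s)")
        # NOTE: catches duplicates within a chunk only; cross-chunk dupes
        # would need a running hash set of row keys
        dupes = chunk.duplicated().sum()
        if dupes:
            errors.append(f"{dupes} duplicate row(s) in chunk")
    return errors
```

An empty list back means the file can be handed on to the real pipeline; a non-empty one means move it to the failed-files folder.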

Processing files of this size directly in n8n is almost certainly going to be a problem. n8n is not a data engine; it is more of an orchestration pipeline. For heavy computation or files this large, you would probably want to explore other options.


Thank you for the valuable insight. We do have some Python code that does the same thing, but we are trying to get onto the Agentic AI bandwagon. Maybe I can invoke that code from n8n?

Sure, you can totally invoke code from n8n. For that you will need to either host the code somewhere in a service that allows executing Python code as a function (AWS Lambda, GCP Cloud Functions), or create a simple API wrapper around your function and self-host it anywhere you want.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.