I am working with the attached workflow, which should extract data from a CSV file received via Gmail and structure it into the format I need. The incoming files vary and use different column naming conventions, so the AI model should map their columns to the ones used in the database (Airtable) and then import the data into Airtable by adding it to the already existing rows.
One challenge I am facing now is that the data I am importing has almost 5,000 rows, and some files may have even more. The model takes ages to process the file, so I don't think this is the best way to proceed. Do you have any advice?
You'll want to store the 5,000 rows in a database or spreadsheet and add a 'status' field. Then, based on that status field, you can process smaller batches and change their status from 'new' to 'done' to avoid processing them twice (a rough sketch of that loop is below). For data-heavy workflows, we also recommend using sub-workflows to do the heavy lifting on your data, as they release their memory when done.
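To make the idea concrete, here is a minimal sketch of the status-field batching loop in plain TypeScript. The in-memory `table`, the row shape, the batch size, and the `processBatch` placeholder are all assumptions standing in for your own storage and your AI column-mapping step, not a real Airtable integration:

```typescript
// Minimal sketch of status-based batch processing.
// The in-memory `table` stands in for your database/spreadsheet rows;
// swap the filter/update steps for your real storage calls.

type Status = "new" | "done";

interface Row {
  id: number;
  status: Status;
  data: Record<string, string>; // the raw CSV columns
}

// Placeholder data source – in practice this comes from your database.
const table: Row[] = Array.from({ length: 5000 }, (_, i) => ({
  id: i,
  status: "new",
  data: { someColumn: `value ${i}` },
}));

const BATCH_SIZE = 200; // tune to whatever the model handles comfortably

// Placeholder for the expensive step (AI column mapping + Airtable import).
async function processBatch(rows: Row[]): Promise<void> {
  // ... map columns and write the batch to Airtable here ...
}

async function run(): Promise<void> {
  while (true) {
    // Only ever pick up rows that haven't been processed yet.
    const batch = table.filter((r) => r.status === "new").slice(0, BATCH_SIZE);
    if (batch.length === 0) break;

    await processBatch(batch);

    // Flip the status so a re-run never processes these rows twice.
    for (const row of batch) row.status = "done";
  }
}

run().then(() => console.log("all rows processed"));
```

In a workflow tool the same pattern maps onto a loop of "get rows where status = new" → process the batch in a sub-workflow → "update status to done", so the parent workflow only ever holds one batch in memory at a time.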