Best Practices for Handling Large Excel Files on OneDrive with a Self-Hosted Instance?
Hello everyone,
I'm looking for some guidance on the best practices for handling a common but challenging scenario: running an incremental update workflow that involves a large Excel file hosted on OneDrive for Business.
My Goal:
I have a workflow that needs to run several times a day. The logic is:
- Determine the timestamp of the last successful run.
- Query a Microsoft SQL Server database to get only the new records created since that timestamp.
- Process these new records (grouping, filtering, etc.).
- Add these new processed records to a master Excel file on OneDrive, preserving all the historical data.
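For step 1, if the "last run" timestamp is derived from the data itself rather than stored externally, it can be computed from the rows extracted from the master file. A minimal sketch in a Code node, assuming each row carries a CreatedAt column (the column name is illustrative):

```javascript
// Hypothetical rows as extracted from the master Excel file;
// CreatedAt is assumed to be an ISO 8601 string.
function lastRunTimestamp(rows) {
  // Reduce to the latest CreatedAt value (ms since epoch);
  // returns null for an empty sheet.
  return rows.reduce((max, row) => {
    const ts = new Date(row.CreatedAt).getTime();
    return max === null || ts > max ? ts : max;
  }, null);
}

const rows = [
  { Id: 1, CreatedAt: '2024-05-01T08:00:00Z' },
  { Id: 2, CreatedAt: '2024-05-02T09:30:00Z' },
];
const since = lastRunTimestamp(rows); // latest row's timestamp
```

Storing the timestamp in workflow static data instead would avoid reading the file at all for this step, but the sketch above works with the download-and-extract pattern.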
The Problem:
My master Excel file is already large (~7-10 MB) and will continue to grow. I'm constantly running into Microsoft Graph API limitations, and I'm struggling to find a robust pattern that doesn't fail.
I have tried using the Microsoft Excel node, but any operation that needs to read the file's content (like Get Rows or even trying to get a list of the Tables in the sheet) fails with various errors, most commonly:
- 400 - ResponsePayloadSizeLimitExceeded
- 404 - Not Found when trying to select a Table, likely due to a timeout while parsing the large file.
- Conversion failed when converting date/time... when passing date strings that the API doesn't like.
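For the date-conversion error specifically, one way to sidestep it is to pass SQL Server an ISO 8601 literal (YYYY-MM-DDTHH:mm:ss), which it parses regardless of the server's language/DATEFORMAT settings. A minimal sketch (the helper name and query column are illustrative, not from my workflow):

```javascript
// Format a JS Date as an ISO 8601 literal that SQL Server parses
// unambiguously; trims the milliseconds and trailing 'Z'.
function toSqlServerIso(date) {
  return date.toISOString().slice(0, 19);
}

const since = toSqlServerIso(new Date('2024-05-02T09:30:00Z'));
// Used as: WHERE CreatedAt > '2024-05-02T09:30:00'
```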
I understand that the recommended approach for large files is to avoid the Microsoft Excel node and instead use the OneDrive node (to Download/Upload) in combination with the Extract from File / Convert to File nodes.
My Question:
Given that I am on a self-hosted n8n instance and will always be working with large datasets, what is the definitive, community-approved "best practice" for this incremental update loop?
Specifically:
- For Reading/Getting the Timestamp: Should I use the OneDrive (Get) operation to fetch the file's lastModifiedDateTime property? Or is it better to Download the entire file and use Extract from File to read a specific column and find the max date?
- For Writing/Appending: Since I can't use the Microsoft Excel (Append Rows) node due to the file size, is the correct pattern to always:
  - Download the full original file.
  - Read it with Extract from File.
  - Get new data from SQL.
  - Merge the old and new data together inside the workflow.
  - Use Convert to File to create a new, complete Excel file.
  - Use OneDrive (Upload) to overwrite the old file.
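The merge step in the middle of that sequence can be a small Code node. A sketch of what I have in mind, assuming each record has a unique Id column so that re-running the workflow doesn't duplicate rows (the field names are illustrative):

```javascript
// Merge the rows read from the old file with the new rows from SQL,
// deduplicating by Id; an incoming row replaces an existing one
// with the same Id.
function mergeRecords(existing, incoming) {
  const byId = new Map(existing.map(r => [r.Id, r]));
  for (const r of incoming) byId.set(r.Id, r);
  return [...byId.values()];
}

const merged = mergeRecords(
  [{ Id: 1, Name: 'a' }, { Id: 2, Name: 'b' }],  // from Extract from File
  [{ Id: 2, Name: 'b2' }, { Id: 3, Name: 'c' }], // from SQL Server
);
// merged has three rows; Id 2 carries the incoming values
```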
This "read-all, merge, write-all" pattern seems like the most robust way to avoid the API errors, but I want to confirm if this is the standard approach or if there's a more efficient way to handle this on a self-hosted instance.
My Setup:
- n8n Version: 1.110.1
- Hosting: Self-Hosted
- Database: Microsoft SQL Server
- File Storage: OneDrive for Business (SharePoint)
Thanks so much for sharing your expertise and any example workflows you might have!