Analyse Videos from GDrive/Airtable, Then Write Captions

Hey everyone,

I’m looking to build a simple workflow that generates captions/copy from images and videos for our social media posts and ads.

From what I’ve seen, Gemini seems to perform well when it comes to analysing images and videos.

How would I get the files from my Google Drive and send them to Gemini?

Open to any suggestions or ideas, thanks guys!