Node: Gemini multimodal GenAI (Vertex AI)

It would help if there was a node for:

Google Gemini multimodal (Vertex AI)

My use case:

TL;DR: Cheaper, seemingly faster, and possibly better than OpenAI’s AnalyzeImage node.

n8n already has a node for the OpenAI GPT-4 Vision API, called “OpenAI - Analyze Image”. It was released recently, possibly following a request in Please add support of the new OpenAI features [done] - #26 by tomtom

I ran a few comparisons between OpenAI and Google for the same multimodal use case: image + prompt. Gemini didn’t do badly at all. It seemed faster (I compared the Google console with the n8n node, so not really a fair comparison) and gave better creative results (my impression only, so not a fair comparison either).
The biggest difference is the pricing: for an image + prompt, Gemini is 4x cheaper (based on an image of roughly 600x600). Google’s pricing is flat per image, while OpenAI’s pricing is proportional to the image size.

I therefore think that the Google-based node could be more popular than the OpenAI-based node. The UI and parameters for the Google node (prompt + image URL) could be the same as for the OpenAI node.

Any resources to support this?

Vertex AI has a sandbox in the Google cloud console
API docs are at Google Cloud console
Pricing at Pricing | Generative AI on Vertex AI | Google Cloud

As I understand it, Vertex AI is the name of the GenAI multimodal API. PaLM only handles text for inputs and outputs. The model I could test inside Vertex AI is called “gemini-1.0-pro-vision-001”.
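
To make the request more concrete, here is a rough sketch of what such a node might send for an image + prompt call. It only builds the URL and JSON body for the Vertex AI generateContent REST method and does not send anything. The project ID, region, bucket path, and the function and interface names are placeholders I made up for illustration. I am also assuming the image sits in Cloud Storage (gs://); a plain web URL would probably have to be downloaded and passed as base64 inline data instead.

```typescript
// Sketch only: builds (but does not send) a Vertex AI generateContent request.
// All names below (GeminiVisionParams, buildGeminiVisionRequest) are hypothetical,
// and the project, region, and bucket values are placeholders.

interface GeminiVisionParams {
  projectId: string; // Google Cloud project ID (placeholder)
  location: string;  // Vertex AI region, e.g. "us-central1"
  model: string;     // model name, e.g. "gemini-1.0-pro-vision-001"
  prompt: string;    // text prompt that accompanies the image
  imageUri: string;  // assumed to be a Cloud Storage URI (gs://...)
  mimeType: string;  // e.g. "image/jpeg"
}

type Part =
  | { text: string }
  | { fileData: { mimeType: string; fileUri: string } };

interface GeminiVisionRequest {
  url: string;
  body: { contents: Array<{ role: "user"; parts: Part[] }> };
}

function buildGeminiVisionRequest(p: GeminiVisionParams): GeminiVisionRequest {
  // Regional endpoint + publisher model path + ":generateContent".
  const url =
    `https://${p.location}-aiplatform.googleapis.com/v1` +
    `/projects/${p.projectId}/locations/${p.location}` +
    `/publishers/google/models/${p.model}:generateContent`;

  // One user turn with two parts: the prompt text and a reference to the image.
  const parts: Part[] = [
    { text: p.prompt },
    { fileData: { mimeType: p.mimeType, fileUri: p.imageUri } },
  ];

  return { url, body: { contents: [{ role: "user", parts }] } };
}

// Example with placeholder values.
const example = buildGeminiVisionRequest({
  projectId: "my-project",
  location: "us-central1",
  model: "gemini-1.0-pro-vision-001",
  prompt: "Describe this image in one sentence.",
  imageUri: "gs://my-bucket/photo.jpg",
  mimeType: "image/jpeg",
});
console.log(example.url);
console.log(JSON.stringify(example.body, null, 2));
```

The real call would be a POST of that body with an OAuth access token in the Authorization header, which is presumably where n8n's existing Google credentials would come in. From the user's side the inputs are still just a prompt and an image reference, same as the OpenAI node.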

I decline any responsibility for the fact that Google could, at any time, rename their models and products in the most confusing way possible :slight_smile:

Are you willing to work on this?

I can make a fork in my workflow and help test the requested node against the currently available “OpenAI - Analyze Image” node.

Hello, has anyone looked at this yet?

I didn’t hear back after that request. I guess it needs enough upvotes to be considered?

I found out that n8n must be aware of it, since there’s a landing page optimized for Gemini and Vertex AI keywords, but seemingly nothing concrete behind it: Google Vertex AI integrations | Workflow automation with n8n