Best way to handle Image Generation with Gemini 1.5 / GPT-4o in n8n?

I’m looking for the cleanest way to integrate Gemini 1.5 (Google AI Studio) image generation into my n8n workflow.

My Goals:

  • Trigger image generation via API from text prompts.

  • Keep the logic scalable (handling binary vs. Base64).

  • Use the HTTP Request node (unless a specialized node is significantly better).

Specific Questions:

  1. Node Choice: Is there a reliable community node (like n8n-nodes-nano-banana) you’d recommend, or is the standard HTTP Request node still the gold standard for flexibility?

  2. Handling Output: What’s your preferred way to convert the API’s Base64 response back into a usable n8n binary file?

  3. Any advice on auth (API Key vs OAuth) or managing rate limits when scaling?

If anyone has a workflow snippet or a JSON body example for the request, I’d love to see it!

@latoqumu A friend shared a workflow with me that handles the exact setup you’re looking for. It’s a very clean HTTP Request + Binary Conversion pattern.

What it does:

API Call: Uses an HTTP Request node to POST to /v1/images/generations using the gpt-image-1.5 model.
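For reference, a minimal request body for that call might look like this (a sketch based on the OpenAI-style `/v1/images/generations` endpoint; the model name is copied from the workflow described above, so double-check it against the current model list before relying on it):

```json
{
  "model": "gpt-image-1.5",
  "prompt": "A watercolor fox in a misty forest",
  "n": 1,
  "size": "1024x1024"
}
```

Set the node's authentication to a Header Auth credential carrying your API key rather than hardcoding it in the body.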

Binary Handling: It automates the “annoying part”: converting the b64_json string from the API response into a proper n8n binary file.
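The conversion step is only a few lines in a Code node. Here's a sketch of the idea (the `data[0].b64_json` path assumes the OpenAI-style response shape; adjust it if your provider nests the image differently):

```javascript
// Sketch: wrap a b64_json string as an n8n binary property.
// n8n stores binary payloads as base64 strings plus metadata,
// so no decoding is needed — just attach the right metadata.
function b64ToBinary(b64, fileName = 'generated.png') {
  return {
    data: b64,              // the raw base64 string from the API
    mimeType: 'image/png',  // image models typically return PNG; adjust if needed
    fileName,
  };
}

// Inside an n8n Code node you would use it roughly like this
// (assuming the OpenAI-style response shape):
//
// const b64 = $input.first().json.data[0].b64_json;
// return [{ json: {}, binary: { data: b64ToBinary(b64) } }];
```

If you'd rather avoid custom code, n8n's built-in Convert to File node (the "Move Base64 String to File" operation) handles the same step.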

You can download the workflow JSON here:

