Workflow for NEW gemini-2.5-pro-preview-tts

Alex_Blanc · May 26, 2025, 7:28am

Hello,
I’ve only been using n8n for 2 months.
I want to use the new gemini-2.5-pro-preview-tts. It’s available in the Gemini Chat Model.
But I can’t work out how to use it. I searched the Internet and didn’t find anything, just this Gemini doc (Speech generation | Gemini API | Google AI for Developers).
I asked Gemini, but none of the answers it gave worked.
Could someone help me build this workflow?
I want to convert the text output of an AI Agent node into an audio file and send it to a Google Drive folder.
Thanks

Gallo_AIA · May 26, 2025, 8:16am

Hello @Alex_Blanc! Welcome!

Currently, I don’t believe n8n directly supports audio generation via the gemini-2.5-pro-preview-tts model. Although the model is available in the Gemini Chat Model list in n8n, you cannot configure the responseModalities parameter needed to get an audio response.

To use the `gemini-2.5-pro-preview-tts` model for speech synthesis, you need to make a direct HTTP request to the Gemini API, correctly setting the required parameters as per the documentation you shared.

Set up an HTTP Request node in n8n

Method: POST
URL:

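https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-preview-tts:generateContent?key=YOUR_API_KEY

Headers:

Content-Type: application/json

Body (Raw JSON):

{
  "contents": [
    { "parts": [{ "text": "{{ $json.text }}" }] }
  ],
  "generationConfig": {
    "responseModalities": ["AUDIO"],
    "speechConfig": {
      "voiceConfig": {
        "prebuiltVoiceConfig": { "voiceName": "Kore" }
      }
    }
  }
}

Replace {{ $json.text }} with the actual text output from your Agent node.

Decode and save the audio

The response contains the audio as a base64 string in candidates[0].content.parts[0].inlineData.data. It is raw PCM, not MP3. Add a Function (Code) node like this:

// n8n keeps binary data as a base64 string, so the payload passes through unchanged.
const part = $input.first().json.candidates[0].content.parts[0];
return [{
  json: {},
  binary: {
    data: {
      data: part.inlineData.data,
      mimeType: 'audio/L16',
      fileName: 'output.pcm'
    }
  }
}];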
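
If you want the decode step to fail loudly instead of writing an empty file, here is a minimal sketch. It assumes the response has the generateContent shape (candidates → content → parts → inlineData) and that the mimeType looks something like "audio/L16;codec=pcm;rate=24000". The helper name extractAudio and the 24000 fallback are my own assumptions, not something from n8n or the thread.

```javascript
// Hypothetical helper: pulls the audio payload out of a generateContent-style response.
function extractAudio(response) {
  const part = response?.candidates?.[0]?.content?.parts?.[0];
  const inline = part?.inlineData;
  if (!inline || typeof inline.data !== 'string') {
    throw new Error('No inline audio data found in response');
  }
  // Assumed mimeType format: "audio/L16;codec=pcm;rate=24000".
  const match = /rate=(\d+)/.exec(inline.mimeType || '');
  return {
    pcm: Buffer.from(inline.data, 'base64'),      // raw PCM bytes
    sampleRate: match ? Number(match[1]) : 24000, // assumed fallback
  };
}
```

In a Code node you would call it with the HTTP Request output, for example extractAudio($input.first().json).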

Upload to Google Drive

Lmk! Cheers

John_Song · May 29, 2025, 6:08am

I got it to work by doing this:
(One thing to note is that the default output audio is a .pcm file, so you have to convert it to WAV or MP3 to use it. If you’re self-hosting n8n, you can install ffmpeg in the Docker container to do so, but you might need to use an external API service if you’re on n8n Cloud.)

tytom2003 · June 10, 2025, 8:59am

I used your workflow and I can use Google Gemini TTS. Does Google Gemini need ffmpeg to convert the output to a WAV file?

follow-prince · June 13, 2025, 7:42am

thank you so much

John_Song · June 13, 2025, 12:23pm

Yes, I think Gemini only returns audio files in .pcm format, so you have to somehow convert to .wav or .mp3 to use it. If you’re self-hosting n8n, I found this method to be the easiest
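
If installing ffmpeg is not an option, a WAV file is just the raw PCM with a 44-byte header in front, so it can also be built by hand. This is a rough sketch, not the method from the workflow above. It assumes the audio is 16-bit little-endian, mono, 24 kHz (check the mimeType in your own response), and the function name pcmToWav is made up for illustration.

```javascript
// Wraps raw PCM samples in a 44-byte RIFF/WAVE header.
// Assumed defaults: 24 kHz, mono, 16-bit little-endian samples.
function pcmToWav(pcm, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4); // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);             // size of the fmt chunk
  header.writeUInt16LE(1, 20);              // format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);     // size of the sample data
  return Buffer.concat([header, pcm]);
}
```

The result can be base64-encoded with .toString('base64') and returned as the binary data of a Code node with mimeType 'audio/wav'. MP3 would still need a real encoder such as ffmpeg.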

Sam_Smith · July 13, 2025, 10:48am

This is awesome! Just an FYI: your API key is exposed in the HTTP request
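
One way to keep the key out of the URL is to send it in the x-goog-api-key header instead of the ?key= query parameter. Below is a small sketch of what that request looks like. The helper name buildTtsRequest is hypothetical, and the generateContent endpoint is assumed from the Gemini docs linked at the top of the thread.

```javascript
// Hypothetical helper: builds fetch-style request options with the API key
// in a header, so the key never appears in the URL.
function buildTtsRequest(apiKey, body, model = 'gemini-2.5-pro-preview-tts') {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-goog-api-key': apiKey, // key travels in a header, not the query string
      },
      body: JSON.stringify(body),
    },
  };
}
```

In n8n itself the equivalent is to store the key in a Header Auth credential on the HTTP Request node, which also keeps it out of the workflow JSON when you share it.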

Futurebillionaire · August 27, 2025, 9:20am

Do you know the request body for a single speaker?

Abraham_Parada · September 29, 2025, 2:01pm

Hi friends, I recently found this node. I tested it and it works very well, but is it secure? I don’t know, but it’s very easy to use. It’s plug and play… What’s your opinion? https://www.npmjs.com/package/n8n-nodes-gemini-ai

Abdiel_Garcia · September 30, 2025, 4:54pm

Thanks for sharing, I’ve already downloaded it, it seems pretty easy, I have the same question. And, in fact, I have a project, so I need to answer this question quickly before moving forward, hehe. How can we be sure it’s safe?

Abraham_Parada · September 30, 2025, 7:46pm

I tested it but I’m not sure… Perplexity and GPT say “moderate safety”, not high safety, because it’s from the community.