Workflow for NEW gemini-2.5-pro-preview-tts

Hello,
I’ve been using n8n for only two months.
I want to use the new gemini-2.5-pro-preview-tts model. It’s available in the Gemini Chat Model node.
But I can’t figure out how to use it. I searched the Internet and found nothing except this Gemini doc (Speech generation | Gemini API | Google AI for Developers).
I asked Gemini, but none of the answers it gave worked.
Could someone help me build this workflow?
I want to convert the text output of an AI Agent node into an audio file and send it to a Google Drive folder.
Thanks

Hello @Alex_Blanc! Welcome!

Currently, n8n does not directly support audio generation via the gemini-2.5-pro-preview-tts model. Although the model appears in the Gemini Chat Model list in n8n, you cannot configure the responseModalities parameter needed to get an audio response.

To use the gemini-2.5-pro-preview-tts model for speech synthesis, you need to make a direct HTTP request to the Gemini API, correctly setting the required parameters as per the documentation you shared.

Set up an HTTP Request node in n8n

  • Method: POST
  • URL:
https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-preview-tts:generateContent?key=YOUR_API_KEY
  • Headers:
Content-Type: application/json
  • Body (Raw JSON):
{
  "contents": [
    { "parts": [ { "text": "{{ $json.text }}" } ] }
  ],
  "generationConfig": {
    "responseModalities": ["AUDIO"],
    "speechConfig": {
      "voiceConfig": {
        "prebuiltVoiceConfig": { "voiceName": "Kore" }
      }
    }
  }
}

Replace {{ $json.text }} with the actual text output from your Agent node.

Decode and save the audio

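The response will contain base64-encoded raw PCM audio (not an MP3) in candidates[0].content.parts[0].inlineData.data. Add a Code node (formerly the Function node) like this:

// Read the base64 audio string from the Gemini response.
const audioBase64 = $input.first().json.candidates[0].content.parts[0].inlineData.data;

// n8n binary items carry their data as a base64 string, so no Buffer is needed here.
return [{
  json: {},
  binary: {
    data: {
      data: audioBase64,
      mimeType: 'audio/L16',
      fileName: 'output.pcm'
    }
  }
}];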

Upload to Google Drive


Lmk! Cheers

I got it to work by doing this:
(One thing to note: the default output audio is a .pcm file, so you have to convert it to WAV or MP3 to use it. If you’re self-hosting n8n, you can install ffmpeg in the Docker container to do that, but you might need an external API service if you’re on n8n Cloud.)

I used your workflow and can now use Google Gemini TTS. Does Gemini need ffmpeg to convert the output to a WAV file?

Thank you so much

Yes, I think Gemini only returns audio in .pcm format, so you have to convert it to .wav or .mp3 to use it. If you’re self-hosting n8n, I found this method to be the easiest
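
For the WAV case specifically, ffmpeg may not be needed: a WAV file is just the raw PCM samples behind a 44-byte header. Below is a minimal sketch of that wrapping in JavaScript. The function name pcmToWav is hypothetical, and the defaults (24 kHz, 16-bit, mono, little-endian) are an assumption about what the TTS model returns, so check them against the mimeType in your own response. MP3 output would still need an encoder such as ffmpeg.

```javascript
// Wrap raw PCM samples in a 44-byte RIFF/WAVE header.
// Assumed input format (not confirmed in this thread): 24 kHz, 16-bit, mono, little-endian.
function pcmToWav(pcm, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);

  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);  // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);              // size of the fmt chunk
  header.writeUInt16LE(1, 20);               // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);      // size of the sample data

  return Buffer.concat([header, pcm]);
}
```

In a Code node this could be used as pcmToWav(Buffer.from(base64Audio, 'base64')).toString('base64'), where base64Audio is a hypothetical variable holding the audio string from the Gemini response; the result can then be placed in the binary data field with mimeType 'audio/wav'.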

This is awesome! Just an FYI: your API key is exposed in the HTTP request
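
If you share a workflow export, one way to avoid this is to scrub the key from the JSON first. The helper below is only a sketch: redactApiKeys is a hypothetical name, and it only catches keys passed as a key= query parameter, not keys stored in headers or elsewhere. Storing the key in an n8n credential instead of the URL avoids the problem altogether.

```javascript
// Replace the value of any "key=" query parameter with a placeholder.
// Only handles keys in URLs; keys in headers or body fields are not touched.
function redactApiKeys(workflowJson) {
  return workflowJson.replace(/([?&]key=)[^&"'\s\\]+/g, '$1YOUR_API_KEY');
}

// Example with a made-up key:
const exported = '{"url":"https://example.com/v1beta/models/m:generateContent?key=abc123&alt=json"}';
const safeToShare = redactApiKeys(exported);
```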

Do you know the body for one speaker?
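
As far as I can tell from the Gemini speech generation doc linked in the first post, the single-speaker body uses voiceConfig where the multi-speaker one uses multiSpeakerVoiceConfig. Here is a rough sketch of both in JavaScript. The function names buildSpeechConfig and buildTtsBody are hypothetical helpers, the field names are my reading of that doc, and "Kore" is just one example voice name, so verify everything against the current API reference.

```javascript
// Build the speechConfig part of the request for one or several speakers.
// speakers: [{ voiceName }] for one voice, or [{ speaker, voiceName }, ...] for several.
function buildSpeechConfig(speakers) {
  const voice = (voiceName) => ({ prebuiltVoiceConfig: { voiceName } });

  if (speakers.length === 1) {
    // Single speaker: one voiceConfig, no speaker labels needed.
    return { voiceConfig: voice(speakers[0].voiceName) };
  }

  // Multiple speakers: one entry per speaker name used in the prompt text.
  return {
    multiSpeakerVoiceConfig: {
      speakerVoiceConfigs: speakers.map((s) => ({
        speaker: s.speaker,
        voiceConfig: voice(s.voiceName),
      })),
    },
  };
}

// Build the full JSON body for the generateContent call.
function buildTtsBody(text, speakers) {
  return {
    contents: [{ parts: [{ text }] }],
    generationConfig: {
      responseModalities: ['AUDIO'],
      speechConfig: buildSpeechConfig(speakers),
    },
  };
}

// One-speaker example:
const singleSpeakerBody = buildTtsBody('Have a wonderful day!', [{ voiceName: 'Kore' }]);
```

JSON.stringify(singleSpeakerBody, null, 2) gives the raw JSON to paste into the HTTP Request node body.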

Hi friends, I recently found this node. I tested it and it works very well, but is it secure? I don’t know. It’s very easy to use, plug and play… What’s your opinion? https://www.npmjs.com/package/n8n-nodes-gemini-ai

Thanks for sharing, I’ve already downloaded it, it seems pretty easy, I have the same question. And, in fact, I have a project, so I need to answer this question quickly before moving forward, hehe. How can we be sure it’s safe?

I tested it, but I’m not sure… Perplexity and GPT say “moderate safety”, not high safety, because it’s from the community.