Hello,
I’ve been using n8n for just two months.
I want to use the new gemini-2.5-pro-preview-tts model. It’s available in the Gemini Chat Model node.
But I can’t figure out how to use it. I searched the Internet and found nothing except this Gemini doc (Speech generation | Gemini API | Google AI for Developers).
I asked Gemini, but none of the answers it gave work.
Could someone help me build this workflow?
I want to convert the text output of an AI Agent node into an audio file and send it to a Google Drive folder.
Thanks
Hello @Alex_Blanc! Welcome!
Currently, I believe n8n does not directly support audio generation via the gemini-2.5-pro-preview-tts model. Although the model is available in the Gemini Chat Model list in n8n, you cannot configure the responseModalities parameter needed to get an audio response.
To use the gemini-2.5-pro-preview-tts model for speech synthesis, you need to make a direct HTTP request to the Gemini API, setting the required parameters as described in the documentation you shared.
Set up an HTTP Request node in n8n
- Method: POST
- URL: https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-preview-tts:generateContent?key=YOUR_API_KEY
- Headers: Content-Type: application/json
- Body (Raw JSON):
{
  "contents": [
    { "parts": [ { "text": "{{ $json.text }}" } ] }
  ],
  "generationConfig": {
    "responseModalities": ["AUDIO"],
    "speechConfig": {
      "voiceConfig": {
        "prebuiltVoiceConfig": {
          "voiceName": "Kore"
        }
      }
    }
  }
}
Replace {{ $json.text }} with the actual text output from your Agent node.
Decode and save the audio
The response contains the audio as base64-encoded data in candidates[0].content.parts[0].inlineData.data. Add a Function node like this:
return [{
  json: {},
  binary: {
    data: {
      // n8n stores binary data as a base64 string, so pass it through unchanged
      data: $json.candidates[0].content.parts[0].inlineData.data,
      // raw 16-bit PCM, not MP3; it still needs converting before playback
      mimeType: 'audio/L16',
      fileName: 'output.pcm'
    }
  }
}];
Upload to Google Drive
Lmk! Cheers
I got it to work by doing this:
(One thing to note: the default output audio is a .pcm file, so you have to convert it to WAV or MP3 to use it. If you’re self-hosting n8n, you can install ffmpeg in the Docker container to do so, but on n8n Cloud you might need an external API service.)
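If installing ffmpeg isn't an option, converting PCM to WAV can also be done in plain JavaScript, since a WAV file is just the raw samples with a 44-byte header in front. Here is a minimal sketch, assuming the audio is 16-bit little-endian mono at 24 kHz (that is what the Gemini speech-generation doc suggests, but check your own response) and that Buffer is available in your Code node. The names pcmToWav and base64PcmToBase64Wav are made up for this example.

```javascript
// Wrap raw PCM samples in a 44-byte RIFF/WAVE header.
// Defaults are an assumption: 16-bit little-endian mono at 24 kHz.
function pcmToWav(pcm, sampleRate = 24000, channels = 1, bitsPerSample = 16) {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);

  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + pcm.length, 4); // file size minus the first 8 bytes
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);             // size of the fmt chunk
  header.writeUInt16LE(1, 20);              // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(pcm.length, 40);     // size of the sample data

  return Buffer.concat([header, pcm]);
}

// Hypothetical helper: base64 PCM in, base64 WAV out.
// n8n keeps binary data as base64 strings, so this shape is convenient there.
function base64PcmToBase64Wav(base64Pcm) {
  return pcmToWav(Buffer.from(base64Pcm, "base64")).toString("base64");
}
```

In a Code node you would pass the base64 string from the API response through base64PcmToBase64Wav and return the result as the binary data, with mimeType 'audio/wav' and a .wav file name. MP3 would still need ffmpeg or an external service.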
I used your workflow, and Google Gemini TTS works for me. Does Google Gemini need ffmpeg to convert the output to a WAV file?
Thank you so much.
Yes, I think Gemini only returns audio in .pcm format, so you have to convert it to .wav or .mp3 somehow to use it. If you’re self-hosting n8n, I found this method to be the easiest
This is awesome! Just an FYI: your API key is exposed in the HTTP request
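One way to avoid that is to send the key in the x-goog-api-key header instead of putting it in the URL. A rough sketch of the request shape, assuming the generateContent endpoint from the Gemini API docs (buildTtsRequest is just a name for this example):

```javascript
// Build request options with the API key in a header rather than the URL,
// so the key does not show up when the URL is shared or logged.
function buildTtsRequest(apiKey, body, model = "gemini-2.5-pro-preview-tts") {
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": apiKey, // header-based auth instead of ?key=...
    },
    body: JSON.stringify(body),
  };
}
```

In the n8n HTTP Request node, the equivalent should be a Header Auth credential named x-goog-api-key, which keeps the key out of the exported workflow JSON as well.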
Do you know the body for one speaker?
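Going by the Gemini speech-generation doc linked in the first post, a single-speaker body seems to need just one voiceConfig under speechConfig. A minimal sketch (buildSingleSpeakerBody is a made-up helper name, and "Kore" is one of the prebuilt voice names listed in that doc; swap in whichever voice you want):

```javascript
// Build a single-speaker TTS request body for the generateContent endpoint.
function buildSingleSpeakerBody(text, voiceName = "Kore") {
  return {
    contents: [{ parts: [{ text }] }],
    generationConfig: {
      responseModalities: ["AUDIO"], // ask for audio instead of text
      speechConfig: {
        voiceConfig: {
          prebuiltVoiceConfig: { voiceName },
        },
      },
    },
  };
}

// Example: serialize the body the way an HTTP Request node would send it.
const exampleBody = JSON.stringify(buildSingleSpeakerBody("Have a wonderful day!"));
```

Building the body as an object and serializing it with JSON.stringify also avoids broken JSON when the agent's text contains quotes or line breaks.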
Hi friends, I recently found this node. I tested it and it works very well, but is it secure? I don’t know, but it is very easy to use. It’s plug and play… What is your opinion? https://www.npmjs.com/package/n8n-nodes-gemini-ai
Thanks for sharing, I’ve already downloaded it, it seems pretty easy, I have the same question. And, in fact, I have a project, so I need to answer this question quickly before moving forward, hehe. How can we be sure it’s safe?
I tested it, but I’m not sure… Perplexity and GPT say “moderate safety”, not high safety, because it’s from the community.