I’m running into an issue with the Google Text-to-Speech API when using the synthesizeLongAudio endpoint and would appreciate some help debugging it.
The Goal:
I have a workflow that generates several chapter summaries using an AI node. Each summary, formatted with markup tags (like [pause long]), is then sent to the Google TTS API to create a long audio file.
The Problem:
The process starts, but when my “Poll_TTS_Status6” node checks the job status, it returns an error for every item:
code: 3, message: This voice does not support empty input.
This happens even though the input node (TTS_Start_Job7) seems to be receiving the generated text correctly for each item.
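Since the error complains about empty input, one thing that can be ruled out quickly is whether any item actually reaches the request with a blank or missing text value. Below is a minimal sketch of that check, assuming the items are available as a list of dictionaries. The `summary` key and the `find_empty_items` helper are hypothetical names used for illustration and are not part of the workflow or the API.

```python
def find_empty_items(items, key="summary"):
    """Return the indexes of items whose text is missing or only whitespace."""
    empty = []
    for index, item in enumerate(items):
        value = item.get(key)  # `key` is a hypothetical field name
        if not isinstance(value, str) or not value.strip():
            empty.append(index)
    return empty


items = [
    {"summary": "Chapter One [pause long] The story begins."},
    {"summary": "   "},                        # whitespace only
    {"text": "stored under a different key"},  # expected key is missing
]
print(find_empty_items(items))  # [1, 2]
```

If this returns an empty list for the real items, the text is probably being lost later, for example where the request body is assembled.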
Any help would be greatly appreciated. Thank you!
I’ve pretty much solved the original issue, and the voice-overs are now being generated. The only remaining issue is the [pause long] tags.
In my script, I’ve added a special instruction to ‘pause for 3 seconds’ right after the chapter title.
Everything in my workflow runs perfectly, and I get an audio file at the end.
The Problem: The voice in the recording doesn’t pause. It just reads the title and then immediately starts reading the summary all in one go, ignoring my “pause” instruction.
“You can insert pauses into AI-generated speech by embedding special tags directly into your text using the markup input field. Note that pause tags will only work in the markup field, and not in the text field.”
The available pause tags are [pause short], [pause long], and [pause]. Alternative ways to create pauses without markup tags include punctuation:
commas, ellipses, hyphens… It’s worth trying these in combination, since there isn’t a pause tag with a specific duration.
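Both approaches can be sketched in a few lines. This is only an illustration under stated assumptions: the `markup` and `text` field names come from the documentation quoted above, while the helper names and the tag-to-punctuation mapping are hypothetical and would need tuning by ear, since how long a voice pauses on punctuation is not something the tags or the punctuation guarantee.

```python
import re

# Matches the three pause tags quoted above, plus surrounding whitespace.
PAUSE_TAG = re.compile(r"\s*\[pause(?: (short|long))?\]\s*")

# Hypothetical mapping from tag variant to punctuation; adjust after listening.
FALLBACK = {None: "... ", "short": ", ", "long": "... ... "}


def build_synthesis_input(script: str) -> dict:
    """Put tagged scripts in the `markup` field and untagged ones in `text`."""
    field = "markup" if PAUSE_TAG.search(script) else "text"
    return {field: script}


def tags_to_punctuation(script: str) -> str:
    """Swap pause tags for punctuation so the script can go in `text`."""
    replaced = PAUSE_TAG.sub(lambda m: FALLBACK[m.group(1)], script)
    return replaced.strip()


script = "Chapter One [pause long] The story begins."
print(build_synthesis_input(script))
print(tags_to_punctuation(script))
```

The first helper keeps the tags and only decides which input field to use. The second drops the tags entirely, which may help if the chosen voice or endpoint ignores markup.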