I’m building a workflow in n8n to automatically extract the audio from a video recording (e.g. an online meeting), so that I can generate a transcription and summary using OpenAI or similar AI nodes.
I already know how to handle transcription and summarization from an audio file.
My goal is to:
Receive a video file (either from a form submission or a local folder)
Extract the audio track (ideally as .mp3 or .wav)
Split the audio into chunks of at most 15 minutes (to fit the AI input limits)
Send each audio chunk to an OpenAI node for transcription and summarization
The part I need help with is:
How to best extract audio from a video within n8n
How to split the audio into smaller chunks automatically
I’m open to using the Execute Command node with FFmpeg or other Docker-compatible solutions in a self-hosted environment.
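For context, this is roughly what I imagine running from the Execute Command node. It is only an untested sketch: it assumes ffmpeg is installed inside the n8n container, and the function names and file paths are placeholders I made up.

```shell
#!/bin/sh
# Sketch only: function names and paths are hypothetical placeholders.
# Assumes the ffmpeg binary is available where the Execute Command node runs.

# Extract the audio track from a video as MP3, dropping the video stream (-vn).
extract_audio() {
  # $1 = input video, $2 = output audio file
  ffmpeg -y -i "$1" -vn -c:a libmp3lame -q:a 4 "$2"
}

# Split an audio file into chunks of roughly $3 seconds each (900 s = 15 min)
# using FFmpeg's segment muxer; -c copy avoids re-encoding the audio.
split_audio() {
  # $1 = input audio, $2 = output pattern such as chunk_%03d.mp3, $3 = seconds
  ffmpeg -y -i "$1" -f segment -segment_time "$3" -c copy "$2"
}

# Example calls (hypothetical paths):
# extract_audio /data/meeting.mp4 /data/meeting.mp3
# split_audio /data/meeting.mp3 /data/chunk_%03d.mp3 900
```

Because the split uses stream copy, chunk boundaries fall on audio frame boundaries, so each chunk is about 15 minutes long, not exactly 15 minutes.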
What is the error message (if any)?
No error yet — I’m looking for advice on the best approach.
Please share your workflow
(I haven’t added the audio extraction nodes yet — just the AI-based transcription and summarization parts.)
As you’re self-hosting, using FFmpeg sounds like a sensible idea. I’m not an expert with it, but I think you should be able to use the Execute Command node, as you suggested, and call it twice to: