Get YouTube Transcript

I am seeking assistance in converting a Python script, which retrieves YouTube transcripts without the need for authentication, into an n8n node. My initial efforts at configuring an HTTP Get node have been unsuccessful, particularly in the aspects of setting up the Google Cloud Console and configuring the node’s credentials. I’d like to understand the proper setup and configuration process for achieving the script’s functionality within the n8n platform. Below is the original Python script I wish to translate, followed by my current attempt at creating an HTTP node in n8n.

SCRIPT (Colab Syntax):

!pip install youtube_transcript_api

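from youtube_transcript_api import YouTubeTranscriptApi

url = 'https://www.youtube.com/watch?v=2TL3DgIMY1g'
print(url)

# Strip the watch-URL prefix to get the video ID (only works for plain watch?v= URLs)
video_id = url.replace('https://www.youtube.com/watch?v=', '')
print(video_id)

# Returns a list of dicts, each with a 'text' field plus timing info
transcript = YouTubeTranscriptApi.get_transcript(video_id)

print(transcript)

# Join the text of every segment, one segment per line
output = ''
for x in transcript:
    sentence = x['text']
    output += f' {sentence}\n'

print(output)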
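
A side note on the ID extraction in the script above: the `replace` call only handles plain `watch?v=` URLs, so a link with extra query parameters or a `youtu.be` short link would yield a wrong ID. Below is a minimal sketch of a more tolerant approach, using only the standard library. The helper name `extract_video_id` and the list of accepted hosts are assumptions for illustration, not part of the original script.

```python
from urllib.parse import urlparse, parse_qs

# Hosts treated as YouTube watch pages (an assumed, non-exhaustive list)
WATCH_HOSTS = {'youtube.com', 'www.youtube.com', 'm.youtube.com'}


def extract_video_id(url: str) -> str:
    """Hypothetical helper: pull the video ID out of a YouTube URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or '').lower()
    if host == 'youtu.be':
        # Short links carry the ID as the first path segment
        video_id = parsed.path.lstrip('/').split('/')[0]
    elif host in WATCH_HOSTS:
        # Watch links carry the ID in the "v" query parameter
        video_id = parse_qs(parsed.query).get('v', [''])[0]
    else:
        video_id = ''
    if not video_id:
        raise ValueError(f'No video ID found in {url!r}')
    return video_id


print(extract_video_id('https://www.youtube.com/watch?v=2TL3DgIMY1g&t=42s'))
```

The returned string could then be passed to the transcript call in place of the `video_id` computed with `replace`.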

Here is my attempt at an HTTP node:

Console Configuration:


Current node OAuth setup:


  • n8n version: latest, cloud version
  • Database (default: SQLite): none
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): cloud
  • Operating system: macos

Well split… I am impressed! Well done indeed. Simple, straightforward, but sleek. :+1: :facepunch:

Thanks. I am still looking for help however!

Bump. If anyone can help, it would be greatly appreciated.

Hi, I am trying to do much the same thing:
1- enter a list of YouTube channels,
2- per channel, if new videos were posted in the last day, get the transcript,
3- compile all transcripts into a Google Doc.

Did you manage to set up your workflow?
Steps 1 and 2 seem easier with Google Sheets Apps Script, but I’d like to get it working in n8n.
Best
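
Step 2 of the list above (keeping only the videos posted in the last day) could be sketched roughly as follows. This is a minimal illustration and makes several assumptions: each video is a dict carrying an ISO 8601 `publishedAt` timestamp, the sample records are made up, and the helper name `posted_within_last_day` is hypothetical. Fetching the channel's video list is left out entirely.

```python
from datetime import datetime, timedelta, timezone


def posted_within_last_day(videos, now=None):
    """Hypothetical helper: keep videos published in the 24 hours before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=1)
    recent = []
    for video in videos:
        # Accept a trailing "Z" as UTC, which older fromisoformat versions reject
        published = datetime.fromisoformat(video['publishedAt'].replace('Z', '+00:00'))
        if cutoff <= published <= now:
            recent.append(video)
    return recent


# Made-up sample data, for illustration only
sample_videos = [
    {'videoId': 'a', 'publishedAt': '2024-01-02T08:00:00Z'},
    {'videoId': 'b', 'publishedAt': '2023-12-30T08:00:00Z'},
]
reference_time = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(posted_within_last_day(sample_videos, now=reference_time))
```

Passing `now` explicitly keeps the filter reproducible; in a scheduled workflow it would default to the current UTC time.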

Did you get this to work? I’m trying to figure out exactly this right now.

I’m sorry, but that part of my project is currently in limbo.

I am still searching for a solution.

Have you figured out a solution for this? I am also looking for one.

How about YouTube Subtitles + N8N - #3 by MutedJam

To help anyone who wants to do the same:
The YouTube Data API v3 captions endpoint can only retrieve captions for videos you own.
So don’t bother with the HTTP node, OAuth 2, etc.
Just use the Python or JavaScript Code node; that library works.

Here is a sample JavaScript Code node (the video ID is passed in from a webhook):

const { YoutubeTranscript } = require('youtube-transcript');
const videoId = items[0].json.query.video;

// Create an async function that n8n will run
async function getTranscript() {
    try {
        // Make the asynchronous call
        const transcript = await YoutubeTranscript.fetchTranscript(videoId);
        // Return an array of objects, as n8n requires
        return [{json: {transcript: transcript}}];
    } catch (error) {
        // On error, return an error object
        console.error(error);
        return [{json: {error: error.message}}];
    }
}

// Call the function and return its value
return getTranscript();
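
For comparison with the Python script at the top of the thread, the same success-or-error item shape can be sketched in Python. This is only an illustration under stated assumptions: `fetch` is a hypothetical stand-in for whatever function returns the transcript segments, `fake_fetch` is made-up test data, and, unlike the JavaScript above, this version joins the segment texts into one string instead of returning the raw segment list.

```python
def transcript_item(fetch, video_id):
    """Hypothetical helper: return one item holding the transcript text or an error."""
    try:
        # `fetch` is assumed to return a list of dicts, each with a 'text' key
        segments = fetch(video_id)
        text = ' '.join(segment['text'].strip() for segment in segments)
        return [{'json': {'transcript': text}}]
    except Exception as error:
        # Mirror the JavaScript version: report the failure as data, do not raise
        return [{'json': {'error': str(error)}}]


def fake_fetch(video_id):
    """Made-up fetcher used only to exercise the helper."""
    return [{'text': 'hello '}, {'text': 'world'}]


print(transcript_item(fake_fetch, '2TL3DgIMY1g'))
```

Returning the error as data rather than raising lets a later step in the workflow decide how to handle a video that has no transcript.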

I assume to use this we have to install the youtube_transcript_api from github, in the local container?

Yes, correct.
Create a Dockerfile like this:
FROM n8nio/n8n:latest
USER root
RUN npm install -g youtube-transcript
USER node

Then: sudo docker build -t n8n_yourimagename

sudo docker run -it --rm --name my-n8n-instance -p 5678:5678 -v n8n_data:/home/node/.n8n n8n_yourimagename

You rock! Thanks Andrea!

I had to dig a bit deeper as I kept getting an error when using the “sudo docker build…” command. It was so easy to miss but I needed to add a period at the end:
“sudo docker build -t n8n_yourimagename .”

I also used a docker-compose.yml file to apply all the env variables I needed, when creating the container from the image. I call it a major crash course in docker!

All good now and works like a charm. Thank you so much for your guidance!

You are right, and I am sorry: when copying and pasting I forgot the “.”, and I ran into problems with that too! Glad I was able to help you.

I recently released a node for this: n8n-nodes-youtube-transcript

Correct me if I’m wrong, but unfortunately, this solution isn’t effective for n8n cloud users. I would be happy to mark the solution as resolved, but it only addresses the problem for custom installations.

Hey Franz. I get the following error:

NodeOperationError: [YoutubeTranscript] 🚨 TypeError: Cannot read properties of undefined (reading 'transcriptBodyRenderer')
    at Object.execute (/home/node/.n8n/nodes/node_modules/n8n-nodes-youtube-transcript/nodes/YoutubeTranscriptNode/YoutubeTranscriptNode.node.ts:90:12)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Workflow.runNode (/usr/local/lib/node_modules/n8n/node_modules/n8n-workflow/dist/Workflow.js:728:19)
    at /usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/WorkflowExecute.js:660:53
    at /usr/local/lib/node_modules/n8n/node_modules/n8n-core/dist/WorkflowExecute.js:1062:20

I will check the exception and update the package this afternoon.