What is the difference between using the "Google Gemini Transcribe audio" node and making an HTTP request to the Google Gemini API (generativelanguage/apiv1)?

enesk50 · October 20, 2025, 10:50pm

I was wondering what the difference is between using the "Google Gemini Transcribe audio" node and making an HTTP request to the Google Gemini API to use generativelanguage/apiv1.

The Google Gemini "Transcribe a recording" node does not work when I get an audio file from Telegram, but I've seen workarounds where people use an HTTP request to generativelanguage/apiv1 (which is still in beta).

Wouter_Nigrini · October 21, 2025, 10:31am

Hi @enesk50,

If you dig through the node's source code, you will find the answer. I believe the node makes the same API calls you would make with a manual HTTP request. The node builds its URL like this:

``let url = `https://generativelanguage.googleapis.com${endpoint}`;``

github.com/n8n-io/n8n (master): `packages/@n8n/nodes-langchain/nodes/vendors/GoogleGemini/actions/audio/transcribe.operation.ts`


      
          
          	let contents: Content[];
          	if (inputType === 'url') {
          		const urls = this.getNodeParameter('audioUrls', i, '') as string;
          		const filesDataPromises = urls
          			.split(',')
          			.map((url) => url.trim())
          			.filter((url) => url)
          			.map(async (url) => {
          				if (url.startsWith('https://generativelanguage.googleapis.com')) {
          					const { mimeType } = (await apiRequest.call(this, 'GET', '', {
          						option: { url },
          					})) as { mimeType: string };
          					return { fileUri: url, mimeType };
          				} else {
          					const { fileContent, mimeType } = await downloadFile.call(this, url, 'audio/mpeg');
          					return await uploadFile.call(this, fileContent, mimeType);
          				}
          			});
          
          		const filesData = await Promise.all(filesDataPromises);

github.com/n8n-io/n8n (master): `packages/@n8n/nodes-langchain/nodes/vendors/GoogleGemini/transport/index.ts`


      
          export async function apiRequest(
          	this: IExecuteFunctions | ILoadOptionsFunctions,
          	method: IHttpRequestMethods,
          	endpoint: string,
          	parameters?: RequestParameters,
          ) {
          	const { body, qs, option, headers } = parameters ?? {};
          
          	const credentials = await this.getCredentials<GooglePalmApiCredentials>('googlePalmApi');
          
          	let url = `https://generativelanguage.googleapis.com${endpoint}`;
          
          	if (credentials.host) {
          		url = `${credentials.host}${endpoint}`;
          	}
          
          	const options = {
          		headers,
          		method,
          		body,
          		qs,

Specifically, the node uses the `/v1beta` endpoints:

github.com/n8n-io/n8n (master): `packages/@n8n/nodes-langchain/nodes/vendors/GoogleGemini/actions/audio/transcribe.operation.ts`


      
          
          	const text = `Generate a transcript of the speech${
          		options.startTime ? ` from ${options.startTime as string}` : ''
          	}${options.endTime ? ` to ${options.endTime as string}` : ''}`;
          	contents[0].parts.push({ text });
          
          	const body: GenerateContentRequest = {
          		contents,
          	};
          
          	const response = (await apiRequest.call(this, 'POST', `/v1beta/${model}:generateContent`, {
          		body,
          	})) as GenerateContentResponse;
          
          	if (simplify) {
          		return response.candidates.map((candidate) => ({
          			json: candidate,
          			pairedItem: { item: i },
          		}));
          	}
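
To make the comparison concrete, here is a rough sketch of what the equivalent manual request could look like. It only builds the request (method, URL, headers, body) and sends nothing. `buildTranscribeRequest` and its types are hypothetical names of my own, not part of n8n. I am also assuming that the model value already includes the `models/` prefix, that the API key goes in the `x-goog-api-key` header, and that the uploaded file is referenced through a `fileData` part; please check those against the Gemini API docs before relying on them.

```typescript
// Hypothetical helper for illustration only; not n8n's API.
interface TranscribeOptions {
  startTime?: string;
  endTime?: string;
}

type Part = { text: string } | { fileData: { fileUri: string; mimeType: string } };

interface GeminiRequest {
  method: 'POST';
  url: string;
  headers: Record<string, string>;
  body: { contents: Array<{ role: string; parts: Part[] }> };
}

function buildTranscribeRequest(
  apiKey: string,
  model: string, // assumed to look like "models/<model-name>"
  fileUri: string, // URI of a file already uploaded to the Gemini Files API
  mimeType: string,
  options: TranscribeOptions = {},
  host = 'https://generativelanguage.googleapis.com',
): GeminiRequest {
  // Same prompt construction as in the node's transcribe operation.
  const text = `Generate a transcript of the speech${
    options.startTime ? ` from ${options.startTime}` : ''
  }${options.endTime ? ` to ${options.endTime}` : ''}`;

  return {
    method: 'POST',
    // Same path the node passes to apiRequest.
    url: `${host}/v1beta/${model}:generateContent`,
    // Assumption: API key sent as a header.
    headers: { 'Content-Type': 'application/json', 'x-goog-api-key': apiKey },
    body: {
      // Assumption: the audio is referenced via a fileData part, followed by the prompt.
      contents: [{ role: 'user', parts: [{ fileData: { fileUri, mimeType } }, { text }] }],
    },
  };
}
```

In an HTTP Request node you would copy the resulting method, URL, headers, and JSON body into the matching fields. Since the MIME type is passed explicitly here, it is also a convenient place to check what your Telegram file actually reports.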

Hope this answers your question.
