Transcribe WhatsApp audio message using Google Gemini API

https://jamesdispatches.app.n8n.cloud/workflow/eDeuKd5Q3pZBc485

Describe the problem/error/question

I’m trying to transcribe WhatsApp audio using the Google Gemini API. I filter by message type, download the audio file to get the binary, and call the Google Gemini API from an HTTP Request node to transcribe the audio. After searching online, I found that the audio file has to be converted to base64 before it is passed in. I’m using the JSON body below to send the data to the Google Gemini API, but so far without success:

{{
{
  "contents": [{
    "parts": [
      {"text": "Transcribe this audio"},
      {"inlineData": {
        "mimeType": audio/${$binary.data.fileExtension},
        "data": $binary.data }
      }
    ]
  }]
}
}}

I’ve also tried $binary.data.data.toString('base64') for the "data" key, which produces the error:

"Problem in node 'HTTP Request'

JSON parameter needs to be valid JSON"

What is the error message (if any)?

Problem in node 'HTTP Request'

Bad request - please check your parameters

Please share your workflow

Share the output returned by the last node

The last node returns the following binary output:

File Name: File.ogg

File Extension: ogg

Mime Type: audio/ogg

File Size: 4.81 kB

Information on your n8n setup

  • n8n version: 1.84.1
  • Running n8n via: n8n cloud
  • Operating system: macOS

To convert the binary to base64, you can use the Extract from File node.
I also changed the Response type of the Download Audio node and updated the JSON document in the HTTP Request node.
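For readers who want to see what the conversion amounts to outside n8n, here is a minimal sketch in plain Node.js. It assumes the raw audio bytes are already in hand; `toBase64` and `inlineAudioPart` are hypothetical helper names, and the four sample bytes only stand in for a real recording. The `inlineData`, `mimeType`, and `data` field names are taken from the request body in the question.

```javascript
// Hypothetical helpers for illustration; only Buffer (Node.js built-in) is a real API.

// Encode raw audio bytes as a base64 string.
function toBase64(bytes) {
  return Buffer.from(bytes).toString('base64');
}

// Build the inlineData part from the base64 text and the file extension.
function inlineAudioPart(base64Audio, fileExtension) {
  return {
    inlineData: {
      mimeType: `audio/${fileExtension}`, // a quoted string, e.g. "audio/ogg"
      data: base64Audio,
    },
  };
}

// Stand-in bytes: "OggS", the 4-byte marker at the start of an Ogg file.
const sampleBytes = Uint8Array.from([0x4f, 0x67, 0x67, 0x53]);
const part = inlineAudioPart(toBase64(sampleBytes), 'ogg');
console.log(part.inlineData.mimeType, part.inlineData.data);
```

The point of the sketch is that `data` must be base64 text and `mimeType` a plain string; passing the binary property object itself would not give either.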


Thanks @Franz. That was very helpful. But when I use the JSON you provided in the last node, “HTTP Request”, I get the error “JSON parameter needs to be valid JSON”.

Here is the node code:

{
  "nodes": [
    {
      "parameters": {
        "method": "POST",
        "url": "=https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-pro-002:generateContent",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "googlePalmApi",
        "sendHeaders": true,
        "headerParameters": {
          "parameters": [
            {
              "name": "Content-Type",
              "value": "application/json"
            }
          ]
        },
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify(\n{\n  \"contents\": [\n    {\n      \"parts\": [\n        {\n          \"text\": \"Transcribe this audio\"\n        },\n        {\n          \"inlineData\": {\n            \"mimeType\": `audio/${$binary.data.fileExtension}`,\n            \"data\": $json.base64\n          }\n        }\n      ]\n    }\n  ]\n}\n, null, 2)}}",
        "options": {}
      },
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "position": [
        900,
        -300
      ],
      "id": "68b184e6-8fc5-4447-b03b-6c587b2206ff",
      "name": "Send Audio to Gemini API",
      "credentials": {
        "whatsAppApi": {
          "id": "5OlXlHNALN1HVb6T",
          "name": "WhatsApp Account"
        },
        "googlePalmApi": {
          "id": "L3qMQ3YtqXakzCRI",
          "name": "Google Gemini(PaLM) Api account"
        }
      }
    }
  ],
  "connections": {},
  "pinData": {},
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "4f4be9b3ea61ae888607e401eebcf0a3e6e2d4d4aea88503ee17db6a0e9dbb6a"
  }
}

You can try generating the JSON document without JSON.stringify.

Or use a Set node to generate the JSON document.
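One way to read this suggestion, sketched in plain JavaScript rather than as n8n configuration: build the request body as an ordinary object and leave serialization to a single step at the end, instead of assembling JSON text by hand. `buildTranscriptionBody` is a hypothetical helper, and the base64 value is a placeholder; how a given n8n version treats an expression that returns an object instead of a string is not covered here and should be checked in the node itself.

```javascript
// Hypothetical helper; the field names mirror the request body quoted in this thread.
function buildTranscriptionBody(base64Audio, mimeType) {
  return {
    contents: [
      {
        parts: [
          { text: 'Transcribe this audio' },
          { inlineData: { mimeType, data: base64Audio } },
        ],
      },
    ],
  };
}

// Placeholder base64 text standing in for the real audio.
const body = buildTranscriptionBody('T2dnUw==', 'audio/ogg');

// Serialize once, at the edge, and confirm the text parses back to the same structure.
const serialized = JSON.stringify(body);
const roundTripped = JSON.parse(serialized);
console.log(roundTripped.contents[0].parts[1].inlineData.mimeType);
```

Keeping the body as an object until the last moment means quoting and escaping are handled by the serializer, which avoids the class of "needs to be valid JSON" errors that come from hand-written JSON text.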


Thank you so much @Franz. The JSON in the first HTTP request worked. Thanks, Much love <3
