Is there any node for finding metadata like video length or audio length, and for extracting the total page count if the file is a PDF, doc, or slide deck?

Himanshu_Rana · May 16, 2025, 5:06am

• Identify content type (Video, Audio, Memo, Slides, etc.)
• Calculate duration or page count
• If audio/video, auto-transcribe the content

Which node is suitable for the above scenario?
If anyone has an idea…

Erick_Torres · May 16, 2025, 5:08am

You’re working on a great use case! Here’s how you can break it down in n8n using the right tools or APIs:

1. Detect File Type

Use the Function node to check the MIME type or file extension (.pdf, .mp3, .mp4, .docx, etc.).
Or use file-type logic in a custom Function.
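
As a rough sketch of the file-type check described above: the helper name classifyFile, the extension table, and the category labels are all made up for illustration, so adjust them to the formats you actually receive.

```javascript
// Hypothetical helper: maps a MIME type / file name to a coarse content type.
// The extension table is an illustrative assumption, not an exhaustive list.
const EXTENSION_TYPES = new Map([
  ["pdf", "document"], ["doc", "document"], ["docx", "document"],
  ["ppt", "slides"], ["pptx", "slides"],
  ["mp3", "audio"], ["wav", "audio"], ["m4a", "audio"],
  ["mp4", "video"], ["mov", "video"], ["mkv", "video"],
]);

function classifyFile(fileName, mimeType) {
  const mime = (mimeType || "").toLowerCase();
  // Prefer the MIME type when it is specific enough.
  if (mime.startsWith("audio/")) return "audio";
  if (mime.startsWith("video/")) return "video";
  if (mime === "application/pdf") return "document";

  // Otherwise fall back to the file extension.
  const name = fileName || "";
  const dot = name.lastIndexOf(".");
  const ext = dot >= 0 ? name.slice(dot + 1).toLowerCase() : "";
  return EXTENSION_TYPES.get(ext) || "unknown";
}

// Possible use inside a Code node set to "Run Once for Each Item",
// assuming the binary property is named "data":
//   const file = $input.item.binary.data;
//   return { json: { contentType: classifyFile(file.fileName, file.mimeType) } };
```

Checking the MIME type first and the extension second keeps the result sensible when an upload arrives as application/octet-stream.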

2. Extract Metadata

PDF / Docs / Slides

Use a Custom Function node with libraries like pdf-parse (if self-hosted) to get page count.
Or upload file to an external API like:
- PDF.co
- Cloudmersive Document and Media API
- Google Drive API (for Slides/Docs)

Audio

Use ffprobe if you’re running self-hosted with exec (via Execute Command Node).
Or use AssemblyAI or Whisper API for both metadata + transcription.

Video

Same as audio — ffprobe is the best for duration, codec, etc.
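
A minimal sketch of the ffprobe route, assuming a self-hosted instance where ffprobe is installed and the file has already been written to disk. The two function names are hypothetical; the ffprobe flags are the standard ones for printing only the container duration as JSON.

```javascript
// Builds a command string for the Execute Command node.
function buildFfprobeCommand(filePath) {
  // Single-quote the path for the shell, escaping any embedded single quotes.
  const quoted = "'" + filePath.replace(/'/g, "'\\''") + "'";
  return `ffprobe -v error -show_entries format=duration -of json ${quoted}`;
}

// Parses the JSON that the command above prints to stdout,
// e.g. {"format":{"duration":"12.500000"}}, and returns seconds or null.
function parseDurationSeconds(stdout) {
  const parsed = JSON.parse(stdout);
  const seconds = parseFloat(parsed?.format?.duration);
  return Number.isFinite(seconds) ? seconds : null;
}

// Possible use in a Code node placed after the Execute Command node
// ("Run Once for Each Item"), which exposes the output as $json.stdout:
//   return { json: { durationSeconds: parseDurationSeconds($json.stdout) } };
```

ffprobe reports the duration as a string, which is why the parser converts it to a number.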

3. Transcribe Audio/Video

Let me know if you want a sample workflow (e.g. get video length + transcribe it). You’ll likely need a mix of:

HTTP Request node
Function node
External API

Hope that helps you get started!

Himanshu_Rana · May 16, 2025, 5:11am

Thank you so much… Let me test out your suggestion.
One question: by Function node, you mean the Code node, right?
If I run into any issues, may I ping you back?

Erick_Torres · May 16, 2025, 5:16am

You’re very welcome!

Yes — by Function node, I mean the Code node (which used to be called Function in older versions of n8n). It allows you to write custom JavaScript, perfect for checking file types or extracting metadata if you’re handling it manually.

And absolutely — feel free to ping me back if you run into any issues. Happy to help!

Himanshu_Rana · May 16, 2025, 5:21am

Sorry to bother you again,
but there is no ‘ffprobe’ node. (I might need to use the HTTP Request node.) If that’s the case, I could use the Whisper API for both, rather than making two API calls…

And if you could provide a sample workflow, that would be awesome, because I’m confused about how to pass the file as input to the Code node.

Erick_Torres · May 16, 2025, 5:34am

Sure! Here’s a simple example workflow to help you extract metadata using a Code node, especially if you decide not to use ffprobe.

Scenario

Let’s say you upload a file (PDF, MP3, etc.), and want to:

Identify its type
Count pages (for PDF) or duration (for audio)
Use that info in the next nodes

Sample Flow (Outline)

Webhook (File Upload Trigger)
Receives a file from a form or app.
HTTP Request (Optional)
Sends the file to an external API (like Whisper for audio transcription or pdfinfo/libmagic tools via API).
Code Node (Metadata Extract)
Use the file’s buffer or metadata to detect file type and pull basic info.

Passing File to Code Node

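The file arrives as binary data on the incoming item. As far as I know, the $binary shorthand is not available inside the Code node, so read the file from $input instead. This sketch assumes the "Run Once for All Items" mode; replace your_file_field_name with the property name shown on the input panel's Binary tab:

// Take the first incoming item and look up its binary property by name.
const item = $input.first();
const file = item.binary?.["your_file_field_name"];

return [
  {
    json: {
      fileName: file?.fileName || "unknown",
      // Length of the base64 payload, not the exact byte count
      // (only meaningful when binary data is kept in memory).
      fileSize: file?.data?.length,
      mimeType: file?.mimeType,
    },
    binary: {
      // Pass the file along to the next node.
      data: file,
    },
  },
];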

Bonus

If using Whisper, you can POST audio directly using the HTTP Request node and get the transcript. OpenAI's hosted API also returns the duration when you request response_format=verbose_json.
For PDFs, APIs like PDF.co or PDFParser return page count and metadata.
If you’re self-hosting, consider installing CLI tools (ffprobe, exiftool) and using Execute Command node.
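
For the Whisper option, here is a hedged sketch of pulling the transcript and duration out of the response. It assumes OpenAI's hosted transcription endpoint called with response_format=verbose_json, which includes text, language, and duration fields; the function name summarizeTranscription is made up, and other Whisper hosts may return a different shape.

```javascript
// Hypothetical helper: reduces a verbose_json transcription response
// to the fields the rest of the workflow needs.
function summarizeTranscription(response) {
  const seconds = parseFloat(response?.duration);
  return {
    transcript: typeof response?.text === "string" ? response.text.trim() : "",
    durationSeconds: Number.isFinite(seconds) ? seconds : null,
    language: response?.language ?? null,
  };
}

// Possible use in a Code node placed after the HTTP Request node
// ("Run Once for Each Item"), where $json holds the API response:
//   return { json: summarizeTranscription($json) };
```

This gives you the duration and the transcript from one API call, which avoids a separate metadata request for audio and video.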

Let me know which file type you’ll work with first — I can help tailor the flow for that.

Himanshu_Rana · May 16, 2025, 5:41am

Pardon, I’m a bit confused,
Below is my workflow’s JSON.
Can you edit that to use the file in the code node?

Erick_Torres · May 16, 2025, 5:53am

Thanks for sharing your workflow! Based on your setup, here’s how you can access the uploaded file in the Code node:

Updated Code Node Snippet

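Assuming the file is still available in binary form (uploaded via the Form trigger), here’s a sample for the Code node in "Run Once for All Items" mode. It reads from $input because, as far as I know, the $binary shorthand is not available inside the Code node:

const item = $input.first();
const binaryKey = Object.keys(item.binary ?? {})[0]; // Picks the first uploaded file
const fileData = binaryKey ? item.binary[binaryKey] : undefined;

// Fail loudly if the binary data was dropped by an earlier node.
if (!fileData) {
  throw new Error("No binary data found on the incoming item");
}

const fileName = fileData.fileName || "unknown";
const mimeType = fileData.mimeType || "unknown";

return [
  {
    json: {
      fileName,
      mimeType,
      // Length of the base64 payload, not the exact byte count
      fileSize: fileData.data?.length,
    },
    binary: {
      data: fileData
    }
  }
];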

This will:

Extract the file from the current binary input
Return some basic metadata (name, size, MIME type)
Pass the binary along to the next node if needed

Tips:

If you routed by extension in the “Route file” node, make sure that route still keeps the binary property. Otherwise, re-attach it using a Merge node or restructure the logic.
You can always use console.log(Object.keys($input.first().binary ?? {})) to debug what binary fields are available.
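
If the file still does not show up, a small debugging sketch like the following can report which binary properties each incoming item actually carries. The helper name listBinaryProperties is hypothetical, and it assumes the Code node runs in "Run Once for All Items" mode.

```javascript
// Hypothetical helper: summarizes the binary properties on every item
// and passes any binary data through unchanged.
function listBinaryProperties(items) {
  return items.map((item, index) => {
    const binary = item.binary ?? {};
    const result = {
      json: {
        itemIndex: index,
        binaryKeys: Object.keys(binary),
        files: Object.entries(binary).map(([key, file]) => ({
          key,
          fileName: file.fileName ?? "unknown",
          mimeType: file.mimeType ?? "unknown",
        })),
      },
    };
    // Only attach the binary section when the item actually has one.
    if (item.binary) result.binary = item.binary;
    return result;
  });
}

// Possible use in the Code node:
//   return listBinaryProperties($input.all());
```

An empty binaryKeys array means the file was dropped before it reached the Code node, so the fix belongs in an earlier node rather than in the code.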

Let me know what kind of file you’re testing with (PDF, MP3, etc.) and I’ll help tailor the logic for that format.

Himanshu_Rana · May 16, 2025, 7:23am

No, it’s not helping. I’m not able to pass the file to the Code node.