How to extract all of today's unread Gmail messages, including URLs and attachments

Describe the problem/error/question

Hi Support,

I’m working on an n8n workflow to extract all unread Gmail messages received today, including:

  1. The full email content

  2. All attachments

  3. Any URLs found in the message

  4. Sender and recipient email addresses

Here’s what I’d like the workflow to do:

  • Manually trigger the workflow

  • Retrieve all unread Gmail messages from today

  • Mark those emails as read after processing

  • Extract all URLs and attachments for analytics
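
The "unread" and "from today" conditions in the steps above can be sketched as a Gmail search string. This is only an illustration: `buildTodayUnreadQuery` is a hypothetical helper name, and it assumes the Gmail node's search filter accepts standard Gmail query operators such as `is:unread` and `after:` with a Unix timestamp in seconds.

```javascript
// Sketch: build a Gmail search string for unread mail received since local midnight.
// buildTodayUnreadQuery is a hypothetical helper name, not part of n8n or Gmail.
function buildTodayUnreadQuery(now = new Date()) {
  const midnight = new Date(now);
  midnight.setHours(0, 0, 0, 0); // start of the current day in the runtime's timezone
  const epochSeconds = Math.floor(midnight.getTime() / 1000);
  // is:unread and after:<epoch seconds> are standard Gmail search operators
  return `is:unread after:${epochSeconds}`;
}

const query = buildTodayUnreadQuery();
console.log(query);
```

Using an epoch timestamp rather than a `YYYY/MM/DD` date avoids ambiguity about which timezone "today" refers to, at the cost of depending on the clock and timezone of the machine running n8n.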

I referenced another template that uses Python, but due to limitations, we cannot use Python in our environment. I’ve built a test flow using JavaScript, but I’m struggling to extract the attachments and content correctly.

Could anyone share ideas or examples on how to write the JavaScript code in the node to achieve this?

Below is the original Python code used in the template node.

try:
    from ioc_finder import find_iocs
except ImportError:
    import micropip
    await micropip.install("ioc-finder")
    from ioc_finder import find_iocs

text = _input.first().json['body']['content']
print(text)

iocs = find_iocs(text)

return [{"json": {"domain": item}} for item in iocs["urls"]]

Thanks in advance!

Regards

Information on your n8n setup

  • n8n version: Version 1.107.3
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Local Docker
  • Operating system: Win11

Hey, I threw this together using AI and tested it. It returns more data than you probably need, but you can trim it down to a more basic extraction to suit your needs. Let me know if you have any follow-up questions, and feel free to mark this as the Solution if it helped you.

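// n8n JavaScript Code Node - Gmail Message Extractor
// Extracts content, attachments, URLs, and addresses from Gmail messages received today.
// Assumes the input items come from a Gmail node that already filters for unread messages,
// with "Simplify" turned off and attachment download enabled, so that fields such as
// from.value, text, html, and item.binary are present.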

const today = new Date();
today.setHours(0, 0, 0, 0);

const outputItems = [];

// Function to extract URLs from text
function extractUrls(text) {
  if (!text) return [];
  
  const urlRegex = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)/g;
  const urls = text.match(urlRegex);
  return urls ? [...new Set(urls)] : []; // Remove duplicates
}

// Function to extract email addresses from text
function extractEmailAddresses(text) {
  if (!text) return [];
  
  const emailRegex = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
  const emails = text.match(emailRegex);
  return emails ? [...new Set(emails)] : []; // Remove duplicates
}

// Process each input item
for (const item of $input.all()) {
  try {
    const emailData = item.json;
    
    // Check if email was received today
    const emailDate = new Date(emailData.date);
    emailDate.setHours(0, 0, 0, 0);
    
    if (emailDate.getTime() !== today.getTime()) {
      continue; // Skip emails not from today
    }
    
    // Extract sender and recipient information
    const senderInfo = emailData.from?.value?.[0] || {};
    const recipientInfo = emailData.to?.value || [];
    
    const senderEmail = senderInfo.address || '';
    const senderName = senderInfo.name || '';
    const recipientEmails = recipientInfo.map(recipient => recipient.address).filter(Boolean);
    
    // Get email content
    const textContent = emailData.text || '';
    const htmlContent = emailData.html || '';
    const subject = emailData.subject || '';
    
    // Extract URLs from both text and HTML content
    const urlsFromText = extractUrls(textContent);
    const urlsFromHtml = extractUrls(htmlContent);
    const allUrls = [...new Set([...urlsFromText, ...urlsFromHtml])];
    
    // Extract additional email addresses from content
    const emailsFromText = extractEmailAddresses(textContent);
    const emailsFromHtml = extractEmailAddresses(htmlContent);
    const contentEmails = [...new Set([...emailsFromText, ...emailsFromHtml])];
    
    // Process attachments (binary data)
    const attachments = [];
    if (item.binary) {
      for (const [key, binaryData] of Object.entries(item.binary)) {
        attachments.push({
          filename: binaryData.fileName || key,
          mimeType: binaryData.mimeType || 'application/octet-stream',
          // Size in bytes; only meaningful when n8n keeps binary data in memory as base64
          size: binaryData.data ? Buffer.from(binaryData.data, 'base64').length : 0,
          binaryKey: key, // Reference to access the binary data
          fileExtension: binaryData.fileExtension || ''
        });
      }
    }
    
    // Create output item
    const extractedData = {
      // Message metadata
      messageId: emailData.messageId,
      subject: subject,
      date: emailData.date,
      receivedToday: true,
      
      // Sender information
      sender: {
        email: senderEmail,
        name: senderName,
        fullString: emailData.from?.text || ''
      },
      
      // Recipient information
      recipients: {
        primary: recipientEmails,
        count: recipientEmails.length
      },
      
      // Email content
      content: {
        text: textContent,
        html: htmlContent,
        textAsHtml: emailData.textAsHtml || '',
        hasHtml: !!htmlContent,
        hasText: !!textContent,
        wordCount: textContent.split(/\s+/).filter(word => word.length > 0).length
      },
      
      // Extracted URLs
      urls: {
        all: allUrls,
        count: allUrls.length,
        fromText: urlsFromText,
        fromHtml: urlsFromHtml
      },
      
      // All email addresses found
      emailAddresses: {
        sender: senderEmail,
        recipients: recipientEmails,
        fromContent: contentEmails,
        all: [...new Set([senderEmail, ...recipientEmails, ...contentEmails])].filter(Boolean)
      },
      
      // Attachment information
      attachments: {
        list: attachments,
        count: attachments.length,
        totalSize: attachments.reduce((sum, att) => sum + att.size, 0),
        types: [...new Set(attachments.map(att => att.mimeType))]
      },
      
      // Email headers (useful metadata)
      headers: {
        messageId: emailData.messageId,
        deliveredTo: emailData.headers?.['delivered-to'] || '',
        returnPath: emailData.headers?.['return-path'] || '',
        receivedSpf: emailData.headers?.['received-spf'] || ''
      },
      
      // Processing info
      processedAt: new Date().toISOString(),
      originalItemIndex: $input.all().indexOf(item)
    };
    
    // Add the extracted data as JSON
    const outputItem = {
      json: extractedData
    };
    
    // Preserve binary data in output
    if (item.binary && Object.keys(item.binary).length > 0) {
      outputItem.binary = item.binary;
    }
    
    outputItems.push(outputItem);
    
  } catch (error) {
    // Handle errors gracefully - add error info to output
    outputItems.push({
      json: {
        error: true,
        errorMessage: error.message,
        originalItem: item.json,
        processedAt: new Date().toISOString()
      }
    });
  }
}

// Return results
if (outputItems.length === 0) {
  return [{
    json: {
      message: 'No unread emails received today',
      processedAt: new Date().toISOString(),
      totalItemsChecked: $input.all().length
    }
  }];
}

return outputItems;
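
For something closer to the original Python template, which returns one item per URL, a much smaller Code node might be enough. The sketch below is a rough stand-in for `find_iocs(text)["urls"]`: `extractUrlItems` is a hypothetical helper, the regex is a simple approximation rather than a port of ioc-finder, and reading the body from the `text` field is an assumption about the Gmail node's non-simplified output.

```javascript
// Sketch: JavaScript stand-in for the Python template's find_iocs(text)["urls"] step.
// extractUrlItems is a hypothetical helper; the regex is a simple approximation
// and does not replicate everything the ioc-finder package detects.
function extractUrlItems(text) {
  if (!text) return [];
  const urlRegex = /https?:\/\/[^\s<>"')]+/g;
  const matches = text.match(urlRegex) || [];
  // Trim trailing punctuation that usually belongs to the sentence, not the URL
  const cleaned = matches.map((url) => url.replace(/[.,;:!?]+$/, ''));
  // One n8n item per unique URL, keeping the template's "domain" key
  return [...new Set(cleaned)].map((url) => ({ json: { domain: url } }));
}

// In a Code node this would be roughly:
//   return extractUrlItems($input.first().json.text);
const sample = 'See https://example.com/a?x=1, and http://test.org. Again: https://example.com/a?x=1';
const items = extractUrlItems(sample);
console.log(items);
```

Keeping the `domain` key matches the shape the Python template produced, so downstream nodes built for that template should need fewer changes, although the values are full URLs rather than bare domains.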
