Extracting email addresses and names from entire email

Describe the problem/error/question

Can I extract all email addresses and names from labelled Gmail emails (from both the headers and the body of each email)? When I filter emails using the Gmail trigger, the email body does not show all the email addresses (from previous replies etc.).

What is the error message (if any)?

None

Please share your workflow

{
  "nodes": [
    {
      "parameters": {
        "pollTimes": {
          "item": [
            {
              "mode": "everyX",
              "value": 30,
              "unit": "minutes"
            }
          ]
        },
        "simple": false,
        "filters": {
          "labelIds": [
            "Label_5053468288142571036"
          ]
        },
        "options": {
          "downloadAttachments": true
        }
      },
      "type": "n8n-nodes-base.gmailTrigger",
      "typeVersion": 1.2,
      "position": [
        0,
        0
      ],
      "id": "77d2efed-7f1e-45eb-98f6-a2408ed88e77",
      "name": "Check email label",
      "credentials": {
        "gmailOAuth2": {
          "id": "ZNhLFKO0YvO5NMjk",
          "name": "Gmail account"
        }
      }
    },
    {
      "parameters": {
        "text": "={{ $json.headers['delivered-to'] }}{{ $json.headers.received }}{{ $json.headers['in-reply-to'] }}{{ $json.headers.from }}{{ $json.headers.subject }}{{ $json.headers.to }}{{ $json.headers.subject }}",
        "attributes": {
          "attributes": [
            {
              "name": "email address",
              "description": "@",
              "required": true
            }
          ]
        },
        "options": {}
      },
      "type": "@n8n/n8n-nodes-langchain.informationExtractor",
      "typeVersion": 1,
      "position": [
        220,
        0
      ],
      "id": "5e7e4d2f-c56a-4b09-81a4-5e91785aad6e",
      "name": "Information Extractor"
    },
    {
      "parameters": {
        "modelName": "models/gemini-1.5-flash",
        "options": {}
      },
      "type": "@n8n/n8n-nodes-langchain.lmChatGoogleGemini",
      "typeVersion": 1,
      "position": [
        320,
        220
      ],
      "id": "f4bd2ec8-d39b-452d-a4d0-00288939f3b7",
      "name": "Google Gemini Chat Model",
      "credentials": {
        "googlePalmApi": {
          "id": "yhFMxuqUdC12IXA1",
          "name": "Google Gemini(PaLM) Api account"
        }
      }
    }
  ],
  "connections": {
    "Check email label": {
      "main": [
        [
          {
            "node": "Information Extractor",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Gemini Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "Information Extractor",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    }
  },
  "pinData": {},
  "meta": {
    "templateCredsSetupCompleted": true,
    "instanceId": "211b05274c55832a4a5cceeb61357ff52da410829fe6433a269469cb2b7e19e8"
  }
}

(Select the nodes on your canvas and use the keyboard shortcuts CMD+C/CTRL+C and CMD+V/CTRL+V to copy and paste the workflow.)

Share the output returned by the last node

Only a single email address is extracted from an email that contains multiple email addresses.

Information on your n8n setup

  • n8n version: 1.81.4
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app): n8n cloud in MS Edge
  • Operating system: Windows 11

Hey @TimOT ,
I think with the Gmail trigger you only get one message at a time.
You can use the threadId and do a Get Thread action to get all the messages.

Hi @Ventsislav_Minev,

Thank you for the idea. Unfortunately it doesn’t seem to work. Note that the thread wasn’t in my mailbox; it was forwarded to me by someone else.

My idea was to take the whole email and extract all the emails and names from the email body, as this should include the entire forwarded thread. However, when I try to split out the header ‘in-reply-to’ (from Schema), where the forwarded email addresses are located, I only get the following:
In-Reply-To: <CABhEBbuiRq6DqZL9HVuoFKW=wO71scawy-OE6Q5h7eTS-+YnKw@mail.gmail.com

I tried to split out text (from Table), but then I get the whole text body of the email - the email addresses are there, but in between everything else.

Hope that makes sense. I’m scratching my head on how to get it out.

Hi @Ventsislav_Minev,

Thank you for the advice, but unfortunately it's still not working. The email is not from an original thread in my own inbox; it is a thread forwarded to me by someone else.
I therefore thought it should be possible to extract the email addresses from the email text body.

Email example: 
To: Me
From: X
Subject: etc.

     Further down the email body as part of the email thread forwarded to me: 
     Reply to: list of emails

The aim is to extract those reply to emails.

When splitting out the 'in-reply-to' (from Schema), I get a long code with @gmail.com instead of the list of email addresses. So I tried splitting out the email text body (from Table) - this gave me the whole text body, including the email addresses.

I'm now wondering if the addresses can somehow be extracted from that body of text.

Hope this makes sense.

Here is my current workflow

This is my current workflow, but I’m still not getting the outcome I’m looking for:


Not sure if you can directly split by the header or body because these are probably strings.

You will probably have to extract them with a code node with an email regex match and output them as an object/list, or you can do the same with an information extractor AI node with the sole purpose of extracting emails.
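
For reference, a minimal sketch of the regex route mentioned above, written as plain JavaScript so it could be pasted into a Code node. The function name `extractEmails` and the field name `text` in the usage comment are assumptions for illustration, not taken from the workflow in this thread, and the pattern is a pragmatic one that does not cover every address form the mail standards allow.

```javascript
// Pragmatic pattern for typical addresses (assumption: good enough for
// addresses found in ordinary email bodies, not a full RFC 5322 parser).
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Return every distinct address found in `text`, in order of first appearance.
// Duplicates are detected case-insensitively.
function extractEmails(text) {
  const matches = String(text ?? "").match(EMAIL_PATTERN) ?? [];
  const seen = new Set();
  const unique = [];
  for (const match of matches) {
    const key = match.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(match);
    }
  }
  return unique;
}

// Possible use inside an n8n Code node ("Run Once for All Items"),
// assuming the incoming items carry the body in a field called `text`:
//
// return $input.all().flatMap(item =>
//   extractEmails(item.json.text).map(email => ({ json: { email } })));
```

Running this over the headers would also pick up Message-ID values such as the In-Reply-To one, because they look like addresses, so it is probably best applied to the body text only.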

Thank you, @Ventsislav_Minev, this has given me some new direction.

Do you have an example of how to extract an email from the text? One could maybe use html instead of the email text? I’ve tried using the information extractor (seemed the easier approach for a noob), but it only delivers a single email every time.


Hi, have you tried passing the email's entire fullHTML field to an AI and asking it to “export in a JSON array all emails that you find”?

Hi @Michel_Morelli, I just tried that and received both names and email addresses. Progress, but I now have a new problem - names and emails are separate. Is it possible to get the AI to output each name with its corresponding email address in a table?

My current workflow:


Hi, you need to use a precise prompt. For example: “Export in a JSON array all emails that you find. Field name and field email each need to be a separate field in the JSON”


Thank you @Michel_Morelli. I’ve implemented your suggestion in my workflow below.

My Outputs are:
Text: ```json\n[\n {\n "name": "Sample1name",\n "emails": [\n "Sample1email"]\n },\n {\n "name": "Sample2name",\n "emails": [\n "Sample2email"\n ]\n },

JSON: "text": "```json\n[\n {\n "name": "Sample1name",\n "emails": [\n "Sample1email"\n ]\n },\n {\n "name": "Sample2name",\n "emails": [\n "Sample2email"\n ]\n },

Schema: A text

This looks like it followed the instructions correctly, but I don’t know what to do with this output. I’ve tried to map the output, convert to CSV in order to export to sheets/excel, but none of those options seem to work. Am I missing something obvious here?

If it is of any use, the AI log shows the Output as follows:
[
  {
    "name": "Sample1name",
    "emails": [
      "Sample1email"
    ]
  },
  {
    "name": "Sample2name",
    "emails": [
      "Sample2email"
    ]
  },

To me this looks more like a JSON, but again, I unfortunately don’t know what to do with it to get a list of names with their corresponding emails in two columns.
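
One possible way to handle output like this is a Code node that strips the Markdown code fence the model wrapped around its answer, parses the rest as JSON, and emits one name/email pair per row. The sketch below is only an illustration under assumptions: the function names `parseModelJson` and `toRows` are made up, and it assumes the complete model answer arrives in a single string field called `text`, as in the output shown above.

```javascript
const FENCE = "`".repeat(3); // a Markdown code-fence marker (three backticks)

// Strip an optional leading/trailing code fence and parse the rest as JSON.
function parseModelJson(text) {
  let body = String(text).trim();
  if (body.startsWith(FENCE)) {
    // Drop the whole first line: the fence plus an optional tag such as "json".
    const firstBreak = body.indexOf("\n");
    body = firstBreak === -1 ? body.slice(FENCE.length) : body.slice(firstBreak + 1);
  }
  body = body.trimEnd();
  if (body.endsWith(FENCE)) {
    body = body.slice(0, -FENCE.length);
  }
  return JSON.parse(body); // throws if the answer is truncated or not valid JSON
}

// Flatten [{ name, emails: [...] }] into one { name, email } row per address.
function toRows(entries) {
  return entries.flatMap(entry =>
    (entry.emails ?? []).map(email => ({ name: entry.name, email })));
}

// Possible use inside an n8n Code node, assuming the previous node's
// first item holds the model answer in `text`:
//
// const rows = toRows(parseModelJson($input.first().json.text));
// return rows.map(row => ({ json: row }));
```

Each returned item would then have separate `name` and `email` fields, which is the shape that nodes writing to a spreadsheet can usually map to two columns.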


Hi. What do you mean by “2 columns”?

One column should hold the name and the second column the email. The current output seems to show each name and its corresponding email on separate lines, but in a single column.

Example:

Name | email address |

Alternatively comma delimiter could maybe be used, then the Text To Columns function in Excel can be used to separate the names and emails into 2 columns.

Example:

Name,email address
Name1,email address1
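
If the comma-delimited route is preferred, a small helper could build the lines once each pair is available as an object. This is a sketch under assumptions: the names `csvField` and `rowsToCsv` and the `name`/`email` field names are made up for illustration. Fields are quoted when they contain a comma, since names such as "Doe, Jane" would otherwise split into extra columns.

```javascript
// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (the usual CSV convention).
function csvField(value) {
  const s = String(value ?? "");
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Turn [{ name, email }, ...] into CSV text with a header line.
function rowsToCsv(rows) {
  const lines = [["Name", "Email"], ...rows.map(row => [row.name, row.email])];
  return lines.map(cols => cols.map(csvField).join(",")).join("\n");
}
```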

Something like this:

Given the following text information in the email body, {{ $json.Email_text }} extract the following information as listed below.
If you cannot find the information for a specific item, then leave blank and skip to the next.
* Extract all names in {{ $json.Email_text }} followed by their corresponding email addresses (for example "name <email address>").
* Format the output as a JSON with the following structure:
[
  { "items": [
      {
        "data": "name | email"
      }
    ]
  }
]
* For each name and email pair found, add a new object with the "data" field in the format "name | email" to the "items" array.
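
Assuming the model follows this structure and the answer has already been parsed into an object, the "name | email" strings could be split into two fields along these lines. The function name `splitPairs` is hypothetical, and splitting on the last "|" is a design choice so that a name containing a pipe still ends up in the name field.

```javascript
// Accepts the parsed structure [{ items: [{ data: "name | email" }] }]
// (or a single such object) and returns [{ name, email }, ...].
function splitPairs(output) {
  const groups = Array.isArray(output) ? output : [output];
  return groups
    .flatMap(group => group.items ?? [])
    .map(item => {
      const data = String(item.data ?? "");
      const cut = data.lastIndexOf("|");
      if (cut === -1) {
        // No separator found: keep the text as the name and leave email blank.
        return { name: data.trim(), email: "" };
      }
      return {
        name: data.slice(0, cut).trim(),
        email: data.slice(cut + 1).trim()
      };
    });
}

// Possible use inside an n8n Code node, with `parsed` being the parsed answer:
//
// return splitPairs(parsed).map(pair => ({ json: pair }));
```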