Problem with Compare Datasets

Describe the problem/error/question

Hi, what I'm trying to do is the following. A human receives a PDF, opens it, and checks that 6 fields match the corresponding columns in a spreadsheet (an ID match determines the row). If all 6 fields match, the human proceeds to send an email from a template, using some of those fields and attaching the PDF.

On the automation side, the workflow triggers when a new file is added to a folder. The file is downloaded to n8n and uploaded to LlamaParse to extract its contents as Markdown. Then I pass the Markdown to the Information Extractor agent, which extracts the 7 fields I need. I use one of them to fetch the Google Sheets row I need. After some data transformation on both branches (sheet and info extracted from the PDF), I feed a Compare Datasets node with the following:

[
{
"Nombre de la campaña": "AAA",
"Nombre de talento": "BBB",
"Dirección de mail": "[email protected]",
"Condición de pago": 60,
"Empresa": "DDD",
"Monto total sin IVA": 1800000
}
]
[
{
"Pais NEW": "DDD",
"Monto NEW": 1800000,
"Acción": "AAA",
"Influencer": "BBB",
"Dias": 60,
"Contacto": "[email protected]"
}
]

From this Compare Datasets node, I expected to branch the workflow. If there are zero differences, proceed with the email. If there is any difference, go back to the PDF generator (possibly by email, pointing out the differences).
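The branching described above can be sketched in plain Python, for example as something to adapt for a Code node. This is only a minimal sketch under assumptions: the column mapping `SHEET_TO_PDF` is inferred from the sample values in this post, the function names and the step names `send_email` and `return_to_pdf_generator` are hypothetical, and `CCC@CCC` is a placeholder email.

```python
# Hypothetical mapping from sheet column names (input B) to PDF field
# names (input A), inferred from the sample values in the post.
SHEET_TO_PDF = {
    "Acción": "Nombre de la campaña",
    "Influencer": "Nombre de talento",
    "Contacto": "Dirección de mail",
    "Dias": "Condición de pago",
    "Pais NEW": "Empresa",
    "Monto NEW": "Monto total sin IVA",
}


def rename_keys(row, mapping):
    """Return a copy of row with its keys renamed according to mapping."""
    return {mapping.get(key, key): value for key, value in row.items()}


def find_differences(pdf_fields, sheet_row):
    """Return {field: (pdf_value, sheet_value)} for every field that differs."""
    normalized = rename_keys(sheet_row, SHEET_TO_PDF)
    return {
        field: (value, normalized.get(field))
        for field, value in pdf_fields.items()
        if normalized.get(field) != value
    }


def next_step(differences):
    """Pick the branch: email when nothing differs, otherwise send it back."""
    return "send_email" if not differences else "return_to_pdf_generator"


pdf_fields = {
    "Nombre de la campaña": "AAA",
    "Nombre de talento": "BBB",
    "Dirección de mail": "CCC@CCC",  # placeholder value
    "Condición de pago": 60,
    "Empresa": "DDD",
    "Monto total sin IVA": 1800000,
}
sheet_row = {
    "Pais NEW": "DDD",
    "Monto NEW": 1800000,
    "Acción": "AAA",
    "Influencer": "BBB",
    "Dias": 60,
    "Contacto": "CCC@CCC",  # placeholder value
}

differences = find_differences(pdf_fields, sheet_row)
print(next_step(differences), differences)
```

The returned dictionary of differences could be used to build the email that points out what needs fixing in the PDF.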

The thing is, it outputs the following:

In A only Branch (1 item)

[
{
"Nombre de la campaña": "AAA",
"Nombre de talento": "BBB",
"Dirección de mail": "[email protected]",
"Condición de pago": 60,
"Empresa": "DDD",
"Monto total sin IVA": 1800000
}
]

Same Branch
No output data in this branch

Different Branch
No output data in this branch

In B only Branch (1 item)

[
{
"Pais NEW": "DDD",
"Monto NEW": 1800000,
"Acción": "AAA",
"Influencer": "BBB",
"Dias": 60,
"Contacto": "CCC@CCC"
}
]

What is the error message (if any)?

I'm getting no error message; Compare Datasets just isn't working the way I understand it should.

Please share your workflow

I pasted it here, but it revealed too much company data, so I'd rather not post it if not needed.

Share the output returned by the last node

Information on your n8n setup

  • n8n version: N8N cloud latest beta stable
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

How are you comparing the data? It is still not clear to me. From a programming perspective, if I were comparing two datasets like the Branch A and Branch B you provided, I would first find the keys they have in common and then check the values of those keys.
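As a rough illustration of that key-then-value check (a sketch only, not how n8n implements the node; the function name is hypothetical and `CCC@CCC` is a placeholder email), the sample rows from the question share no key names at all, so there is nothing whose values could be compared:

```python
def compare_on_shared_keys(a, b):
    """Find the keys both rows share, then list the shared keys whose values differ."""
    shared = sorted(set(a) & set(b))
    mismatched = [key for key in shared if a[key] != b[key]]
    return shared, mismatched


branch_a = {
    "Nombre de la campaña": "AAA",
    "Nombre de talento": "BBB",
    "Dirección de mail": "CCC@CCC",  # placeholder value
    "Condición de pago": 60,
    "Empresa": "DDD",
    "Monto total sin IVA": 1800000,
}
branch_b = {
    "Pais NEW": "DDD",
    "Monto NEW": 1800000,
    "Acción": "AAA",
    "Influencer": "BBB",
    "Dias": 60,
    "Contacto": "CCC@CCC",  # placeholder value
}

shared, mismatched = compare_on_shared_keys(branch_a, branch_b)
print(shared, mismatched)
```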

To make it clear:
1- Do the PDF files all have the same format?
2- You said you are formatting the data extracted from the PDF, so why do the two branches have different keys?
3- Please share your workflow, and if there is any code for parsing or structuring your data, please share that too. You can do it without revealing your company data.

I get everything in “In A only” and everything in “In B only”. I really don't understand what the problem is.

Regarding your second question, the keys are different. They refer to the same thing, but the names are a bit different. I could process the data so the keys are the same, but in Compare Datasets I match them manually.


It looks like the problem is the column names. The node is not just comparing the values; it is comparing the key names too. If you change the field name of the 3rd item in the dataset to fruit, you will see that the 3rd item no longer matches. Check the example:
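The effect can also be reproduced outside n8n. The following is a simplified Python model of a four-way comparison, not n8n's actual implementation; the function `compare_datasets`, the `fruit` and `name` fields, and the sample rows are all made up for illustration. Rows are paired by one field name, and a row whose field is named differently cannot be paired, even if its value is identical.

```python
def compare_datasets(input_a, input_b, match_field):
    """Split rows into four groups, pairing rows by one shared field name."""
    b_by_key = {row[match_field]: row for row in input_b if match_field in row}
    result = {"a_only": [], "same": [], "different": [], "b_only": []}
    matched_keys = set()

    for row in input_a:
        key = row.get(match_field)
        other = b_by_key.get(key) if match_field in row else None
        if other is None:
            result["a_only"].append(row)  # no partner row in B
        elif other == row:
            matched_keys.add(key)
            result["same"].append(row)  # every field name and value agrees
        else:
            matched_keys.add(key)
            result["different"].append((row, other))  # paired, but fields differ

    for row in input_b:
        if match_field not in row or row[match_field] not in matched_keys:
            result["b_only"].append(row)  # no partner row in A
    return result


input_a = [
    {"fruit": "apple", "color": "red"},
    {"fruit": "banana", "color": "yellow"},
    {"fruit": "cherry", "color": "red"},
]
input_b = [
    {"fruit": "apple", "color": "red"},
    {"fruit": "banana", "color": "green"},
    {"name": "cherry", "color": "red"},  # same value, different field name
]

result = compare_datasets(input_a, input_b, "fruit")
for branch, rows in result.items():
    print(branch, rows)
```

In this model the cherry rows land in the "A only" and "B only" groups purely because of the field name, which resembles the output reported in the question.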
