Im scraping data from website that are in a Google Sheet to get the linkedin or email.
Finally when i get this data i dont know how to put in on the correct row i mean the output data came in randomly at the end of the mach column.
To help you solve your problem, I need a bit more info:
Could you share your workflow or paste it into the thread? In this way, I can have a clear view of what you’re trying to do
I see your INPUT is just an array of objects with a “urls” field. What do you want to do exactly with this data? Add it in a Google Sheet only if is not already existing?
Yes the idea is to add the data i have could extracted into the specific row where the information comes from the scraped url included in the row. The problem is that the items are not fixed with the data comes from.
Hi @Exes_Educacion
there’s probably something to improve in your Google Sheets node configuration:
Column to match on: Don’t use the expression editor. You should be able to select the column in the Google Sheet file. In your case, you want to select linkedin. Be aware that this field is used to understand if a new row must be created or an old one must be updated.
Value of Column to Match On: the value passed here will be matched with the corresponding value in the Google Sheet. If the value is already there, other fields will be updated, if not, a new row will be created. In your case, you want to drag here the urls field from the INPUT panel, so it will be something like {{ $json.urls }} (with the expression editor activated).
Values to Send: here you manually add the other columns in the Google Sheet that you want to update. If you don’t have any other columns (your Google Sheet has just one column), add an empty column and add it here (otherwise you’ll get an error).
Have a look at the configuration here:
I hope it helps, let me know if it works or you need more support
best
Hi @Exes_Educacion
the problem is that in this way there’s no way to understand how to match a URL with an EMAIL, and it will keep adding them below, adding new rows.
You should use the Append or Update operation in the Google Sheets Node, and use a field to match the URL and the EMAIl to understand when they belong to the same website.
A possible solution would be to pass the website information along the workflow and use it at the end as a unique field to match the URL and EMAIL.
Here’s an example:
You need to have a third column called Website that it is used to understand which EMAIL and URL belong together.
As you can see there are still some duplicates (since each web page has multiple URLs) so you should find a way to remove those.
Let me know if this helps a bit and you need more support.
An alternative way would be to merge the data in the workflow before sending them to Google Sheets.
Here’s an example workflow:
Data are cleaned with the final sequence of Set/Merge/Set/Items List Nodes.
In the Merge Node you decide which policy to use when merging the two sets on data (so play with that if you want different results).