JS fails to find unique string, returns all

Describe the problem/error/question

I’m scraping a jobs website and pulling the job titles from it.

These get written to a spreadsheet, in the column ‘live jobs’.

I want to scrape every week and update ‘live jobs’ but also populate a ‘new jobs’ column.

So I need to compare the newly scraped titles with the existing live jobs on the spreadsheet and put the new ones in ‘new jobs’.

The scraped data comes from a Merge node, and the spreadsheet is read into a Code node that runs the following JS.

As an example, if ‘AI Agent Engineer’ is in Live Roles on the sheet and the newly scraped titles are ‘AI Agent Engineer, Founding Engineer’, I want the script to add only ‘Founding Engineer’ to ‘New Roles’ (and keep both in Live Roles).

This JS can’t seem to pick out the unique string. It returns both roles as being new.

JS

// This code node expects two inputs:
// Input 0: From the ‘Merge’ node, containing the new job titles.
// Input 1: From the ‘Data Source1’ node, containing the existing data from the sheet.

// 1. Get the data from the two incoming connections.
// The first item is the merged data (newly scraped titles).
const currentJobsData = $input.all()[0]?.json;
// The second item is the data from the sheet (last week’s titles).
const previousJobsData = $input.all()[1]?.json;

// 2. Extract and sanitize the job titles.
// Split the comma-separated string into an array, trim whitespace, and remove empty entries.
const currentTitles = currentJobsData?.titles
  ? currentJobsData.titles.split(',').map(t => t.trim()).filter(Boolean)
  : [];

const previousTitles = previousJobsData?.['Live Roles']
  ? previousJobsData['Live Roles'].split(',').map(t => t.trim()).filter(Boolean)
  : [];

// 3. Find the difference between the two lists.
// To ensure a robust comparison, we’ll convert all titles to lowercase for the Set lookup.
const previousTitlesSet = new Set(previousTitles.map(title => title.toLowerCase()));

// Filter the current titles to find any that are not in the previous set.
// We’ll use the lowercase version for the comparison but keep the original title in the output.
const newRoles = currentTitles.filter(title => !previousTitlesSet.has(title.toLowerCase()));

// Find the roles that were removed by comparing the lowercase versions.
const currentTitlesSet = new Set(currentTitles.map(title => title.toLowerCase()));
const removedRoles = previousTitles.filter(title => !currentTitlesSet.has(title.toLowerCase()));

// 4. Determine the output based on whether there are differences.
// The “do nothing” instruction is handled by returning an empty array, which stops the workflow for this item.
if (newRoles.length === 0 && removedRoles.length === 0) {
  // If no new roles and no removed roles were found, do not pass any data to the next node.
  return [];
} else {
  // If there are differences, prepare the output object.
  // The 'Live Roles' field will be overwritten with the new, full list of titles.
  // The 'New Roles Since Last Run' field will contain only the newly added roles.
  return [{
    json: {
      companyName: currentJobsData?.companyName || "Unknown",
      'Live Roles': currentTitles.join(', '),
      'New Roles Since Last Run': newRoles.join(', '),
      'Removed Roles Since Last Run': removedRoles.join(', '),
      newJobCount: newRoles.length,
      lastScraped: new Date().toISOString()
    }
  }];
}

Information on your n8n setup

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

I’d suggest using the Compare Datasets node (compareDatasets).

It will tell you which items are in the original data and which are not.
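
If you would rather keep this in a Code node, here is a minimal sketch of the comparison as plain functions. It rests on an assumption the thread does not confirm: that the sheet data is not actually sitting at `$input.all()[1]`. If it is not, `previousTitles` ends up empty and every scraped title looks new, which matches the symptom described. The helper names (`collectTitles`, `normalizeTitle`, `diffRoles`) are hypothetical; the field names `titles` and `Live Roles` are taken from the question. The sketch gathers titles from every item rather than a single index, and normalizes whitespace and case before comparing.

```javascript
// Hypothetical helper: collect comma-separated titles from every item,
// not only from one fixed index.
function collectTitles(items, field) {
  return items
    .flatMap(item => String(item?.json?.[field] ?? '').split(','))
    .map(title => title.trim())
    .filter(Boolean);
}

// Used for comparison only: unify Unicode form, collapse whitespace, ignore case.
function normalizeTitle(title) {
  return title.normalize('NFKC').replace(/\s+/g, ' ').trim().toLowerCase();
}

// Returns the full current list plus the added and removed titles.
function diffRoles(currentItems, previousItems) {
  const current = collectTitles(currentItems, 'titles');
  const previous = collectTitles(previousItems, 'Live Roles');
  const previousSet = new Set(previous.map(normalizeTitle));
  const currentSet = new Set(current.map(normalizeTitle));
  return {
    liveRoles: current,
    newRoles: current.filter(t => !previousSet.has(normalizeTitle(t))),
    removedRoles: previous.filter(t => !currentSet.has(normalizeTitle(t))),
  };
}

// Example with the titles from the question.
// In an n8n Code node the two lists might instead be fetched by node name,
// e.g. $('Merge').all() and $('Data Source1').all(); whether those names
// match the real workflow is an assumption.
const result = diffRoles(
  [{ json: { titles: 'AI Agent Engineer, Founding Engineer' } }],
  [{ json: { 'Live Roles': 'AI Agent Engineer' } }]
);
console.log(result.newRoles); // [ 'Founding Engineer' ]
```

If `newRoles` still contains every title, logging `previousItems` would show whether the sheet rows are reaching the node at all.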
