Goal
I’m building a candidate–job matching workflow in n8n. Airtable is the source of truth for both candidates and job openings. The pipeline:
- Build a textual summary of the job
- Get a job embedding (Google Gemini text-embedding-004)
- Pre-filter candidates by cosine similarity against their stored embeddings
- Fetch detailed candidate↔skill and candidate↔language rows
- Score with gates (MustHave/NiceToHave, languages, etc.), then write matches to an Airtable “Matching” table
The candidate embeddings are produced in a separate n8n workflow and are working fine.
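For reference, here is a minimal sketch of the cosine pre-filter step as plain JavaScript. It assumes each candidate item has the shape `{ json: { id, 'Embedding Text (candidate)': '<stringified vector>' } }`; the helper names (`parseEmbedding`, `cosineSimilarity`, `rankByCosine`) and the sample data are made up for illustration and are not the code currently in the workflow.

```javascript
// Parse a stringified vector; return null instead of throwing on bad input.
function parseEmbedding(raw) {
  let value = raw;
  if (typeof raw === 'string') {
    if (raw.trim() === '') return null;
    try {
      value = JSON.parse(raw);
    } catch (err) {
      return null; // not valid JSON
    }
  }
  const isVector =
    Array.isArray(value) && value.length > 0 && value.every(Number.isFinite);
  return isVector ? value : null;
}

// Cosine similarity; returns 0 for mismatched lengths or zero-norm vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) return 0;
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every candidate with a usable embedding and keep the best retrievalK.
function rankByCosine(candidateItems, jobVector, retrievalK) {
  const scored = [];
  for (const item of candidateItems) {
    const vector = parseEmbedding(item.json['Embedding Text (candidate)']);
    if (vector === null) continue; // skip candidates without a usable vector
    scored.push({ id: item.json.id, score: cosineSimilarity(jobVector, vector) });
  }
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, retrievalK);
}

// Made-up sample data, not real Airtable rows.
const sampleCandidates = [
  { json: { id: 'recA', 'Embedding Text (candidate)': '[1, 0]' } },
  { json: { id: 'recB', 'Embedding Text (candidate)': '[0, 1]' } },
  { json: { id: 'recC', 'Embedding Text (candidate)': 'not json' } },
];
const ranked = rankByCosine(sampleCandidates, [1, 0], 2);
console.log(ranked);
```

In a Code node set to run once for all items, the first argument would presumably be `$input.all()` and the job vector would come from the Gemini HTTP node's output.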
Stack / versions
- n8n Cloud 1.109.2
- Airtable node v2.1, Code node v2, HTTP Request v4
- Embeddings: Google Gemini text-embedding-004
- DB: Airtable (multiple linked tables + lookups)
What works
- Job text → Gemini embedding
- Candidate listing with {Statut}='Published' and non-empty “Embedding Text (candidate)”
- Separate embedding generation workflow for candidates
What’s failing
I’m not getting any final matches out of the scoring step. Typical output:
[
  {
    "top": [],
    "toCreate": [],
    "_diag": {
      "jobId": "recHAwZDxnT1WvG9Q",
      "cands": 0,
      "reqSkills": 0,
      "reqLangs": 0,
      "cSkillsRows": 0,
      "cLangRows": 0,
      "topK": 10,
      "results": 0,
      "top": 0
    }
  }
]
I’ve also hit a few errors while iterating:
- Airtable filter formula
  The formula for filtering records is invalid: Unknown field names: candidate record ids
  (Fixed by referencing the correct lookup field name and chunking ORs.)
- Item access
  No data found for item-index: "1" on a node that expected $item(0) / single execution. (Happens when a downstream node evaluates an expression per-item while the upstream only emitted a single item.)
- Code node control flow
  SyntaxError: Illegal continue statement: no surrounding iteration statement in a Code node (“Rank by cosine”) due to a continue inside a try/catch that wasn’t actually inside the for loop scope n8n uses.
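To make the third error concrete: in plain JavaScript, one way to get that exact class of error is a `continue` inside a callback such as `forEach`, because the callback body is a function and not a loop. The sketch below reproduces the parse error and shows two skip patterns that avoid it. This may not be the precise cause in my node, and the helper names are hypothetical.

```javascript
// Reproduce the error: `continue` inside a callback has no enclosing loop.
let parseError = null;
try {
  new Function('items', 'items.forEach((item) => { continue; });');
} catch (err) {
  parseError = err; // raised at parse time, before any item is processed
}

// Option A: a real for...of loop; `continue` inside try/catch is valid here.
function collectIdsLoop(items) {
  const ids = [];
  for (const item of items) {
    try {
      const id = item.json.id;
      if (typeof id !== 'string' || id === '') continue;
      ids.push(id);
    } catch (err) {
      continue; // item or item.json was missing
    }
  }
  return ids;
}

// Option B: in a callback, skip by returning an empty array from flatMap.
function collectIdsFlatMap(items) {
  return items.flatMap((item) => {
    const id = item && item.json ? item.json.id : undefined;
    return typeof id === 'string' && id !== '' ? [id] : [];
  });
}

// Made-up items, including two that should be skipped.
const sampleItems = [
  { json: { id: 'recA' } },
  { json: {} },
  undefined,
  { json: { id: 'recB' } },
];
const loopIds = collectIdsLoop(sampleItems);
const flatMapIds = collectIdsFlatMap(sampleItems);
console.log(parseError instanceof SyntaxError, loopIds, flatMapIds);
```

Both options return the same IDs; which one reads better is mostly a matter of taste.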
Workflow outline (high level)
- Webhook → Set Inputs (job_id, top_k, retrieval_k)
- Airtable — Get Job (fetch job record)
- Airtable — List Job↔Skill and List Job↔Langues (linked rows)
- Build Job Text → HTTP — Gemini Embedding (job)
- Airtable — List Candidates (with embedding) (published + non-empty embedding)
- Code — Rank by cosine (score + build vector ID list + global candidate filter)
- Code — Build chunked formulas (split OR formula into ≤30-ID chunks for Airtable)
- Airtable — List Candidates (filtered) + List Candidate↔Skill + List Candidate↔Language
- Merge → Function — Score & Collect TopK → (optional paraphrase) → Build Upserts → Create in Airtable
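The "Build chunked formulas" step could look roughly like the sketch below. It emits one item per chunk, each carrying `{ filterFormula, vector_ids }`. The lookup field name `Candidate Record IDs` and the 30-ID chunk size come from my setup; `buildChunkedFormulas`, `RECORD_ID_PATTERN`, and the assumption that record IDs are `rec` followed by 14 alphanumeric characters are illustrative and worth double-checking.

```javascript
// Assumed shape of an Airtable record ID: "rec" + 14 alphanumeric characters.
const RECORD_ID_PATTERN = /^rec[A-Za-z0-9]{14}$/;

// Split a list into consecutive groups of at most `size` elements.
function chunk(list, size) {
  const groups = [];
  for (let i = 0; i < list.length; i += size) {
    groups.push(list.slice(i, i + size));
  }
  return groups;
}

// Build one OR(...) formula per chunk of validated, de-duplicated IDs.
function buildChunkedFormulas(ids, lookupField = 'Candidate Record IDs', chunkSize = 30) {
  const cleanIds = [...new Set(ids)].filter(
    (id) => typeof id === 'string' && RECORD_ID_PATTERN.test(id)
  );
  return chunk(cleanIds, chunkSize).map((group) => {
    const clauses = group.map(
      (id) => `FIND('${id}', ARRAYJOIN({${lookupField}}))>0`
    );
    return {
      json: { filterFormula: `OR(${clauses.join(', ')})`, vector_ids: group },
    };
  });
}

// Made-up IDs: three valid, one duplicate, one malformed.
const idA = 'rec' + 'A'.repeat(14);
const idB = 'rec' + 'B'.repeat(14);
const idC = 'rec' + 'C'.repeat(14);
const formulaItems = buildChunkedFormulas([idA, idB, idA, "bad'id", idC], 'Candidate Record IDs', 2);
console.log(formulaItems.map((item) => item.json.filterFormula));
```

Validating IDs against a pattern before interpolating them keeps stray quotes out of the formula. Note that an empty ID list yields zero items, so anything downstream of this node would not run in that case.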
What I suspect and where I’d love guidance
- Per-item vs run-once semantics in Code nodes
  - Best practice for iterating over $items() and safely early-skipping invalid inputs (instead of continue)?
  - Reliable pattern to emit one item with a { filterFormula, vector_ids } payload for downstream Airtable search?
- Chunked Airtable filtering
  - Recommended way to safely construct filterByFormula with large OR(...) lists (I’m chunking by ~30 IDs and using a candidate lookup field like Candidate Record IDs, then FIND(id, ARRAYJOIN({Candidate Record IDs}))>0)?
  - Any better pattern you prefer for linking “Candidats ↔ compétences/langues” rows back to candidate IDs?
- Merging multi-run outputs
  - Patterns for reading all runs of upstream Airtable list nodes (e.g., getAllFromNode + expandRecords) to avoid losing items across batches?
- Defensive field access
  - Sanity checks/fallbacks to avoid undefined when a field or lookup is missing (especially with multilingual field labels).
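On the defensive field access point, this is the kind of helper I have in mind: try several possible labels for the same field and fall back to a default when none has a usable value. The function names and the sample labels (`Skill`, `Compétence`, `Level`, `Niveau`) are hypothetical and only stand in for the real multilingual field names.

```javascript
// Return the first non-empty value found under any of the given labels.
function pickField(record, labels, fallback = null) {
  for (const label of labels) {
    const value = record == null ? undefined : record[label];
    if (value === undefined || value === null) continue;
    if (typeof value === 'string' && value.trim() === '') continue;
    if (Array.isArray(value) && value.length === 0) continue;
    return value;
  }
  return fallback;
}

// Lookup fields usually arrive as arrays; normalise scalars and arrays alike.
function asArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Made-up junction row with a French label and an empty level.
const sampleRow = { 'Compétence': ['recS1'], Niveau: '' };
const skillIds = asArray(pickField(sampleRow, ['Skill', 'Compétence']));
const level = pickField(sampleRow, ['Level', 'Niveau'], 'unknown');
const missing = asArray(pickField(undefined, ['Skill']));
console.log(skillIds, level, missing);
```

The scoring code can then always iterate over arrays and compare against explicit defaults, so a renamed or missing lookup shows up as an empty list and not as a thrown TypeError.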
What I can share
I can provide:
- Full (current) workflow JSON
- The exact code in these nodes:
  - Code — Rank by cosine
  - Code — Build chunked formulas
  - Function — Score & Collect TopK
- Raw filterByFormula strings used in the 3 Airtable list nodes
- Screenshots of node configs
- Minimal JSON samples (1 job, 2–3 candidates with the Embedding Text (candidate) stringified vector, a few Job↔Skill/Job↔Lang rows)
- Full error logs/stack traces
Questions to the community
- Do you have a canonical snippet for a Code node that:
  - Iterates candidates from $items(),
  - Robustly parses a JSON string embedding field,
  - Emits a single { filterFormula, vector_ids } item for the next Airtable search,
  - And never uses continue/break in a way that conflicts with n8n’s per-item execution?
- Is the chunked OR filter approach with ARRAYJOIN({Candidate Record IDs}) the best way to keep formulas short and map back to candidate rows in the junction tables?
- Any recommended Merge strategy to reliably combine:
  - List Candidates (filtered) (possibly multiple runs due to chunking),
  - List Candidate↔Skill,
  - List Candidate↔Language,
  so the scoring node can see all rows?
- Debug tips you use for these patterns (e.g., forcing alwaysOutputData, using $item(0) vs $json, or a utility to inspect runs)?
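For the merge question, one alternative I am considering is to do the join in code: group the junction rows by candidate ID and attach them to each de-duplicated candidate. The sketch below is plain JavaScript over `{ json }` items. It assumes the junction rows expose the candidate via the `Candidate Record IDs` lookup and that candidate items carry a top-level `id`; the function names and sample rows are made up.

```javascript
// Normalise a lookup value (array or single string) to a list of string IDs.
function toIdList(value) {
  if (Array.isArray(value)) return value.filter((v) => typeof v === 'string');
  return typeof value === 'string' && value !== '' ? [value] : [];
}

// Index junction rows by every candidate ID they reference.
function groupByCandidate(rows, lookupField) {
  const grouped = new Map();
  for (const row of rows) {
    for (const candidateId of toIdList(row.json[lookupField])) {
      if (!grouped.has(candidateId)) grouped.set(candidateId, []);
      grouped.get(candidateId).push(row.json);
    }
  }
  return grouped;
}

// Attach skill and language rows to each candidate, dropping duplicates
// that can appear when the same candidate is returned by several chunks.
function joinForScoring(candidates, skillRows, languageRows, lookupField = 'Candidate Record IDs') {
  const skills = groupByCandidate(skillRows, lookupField);
  const languages = groupByCandidate(languageRows, lookupField);
  const seen = new Set();
  const joined = [];
  for (const candidate of candidates) {
    const id = candidate.json.id;
    if (typeof id !== 'string' || seen.has(id)) continue;
    seen.add(id);
    joined.push({
      json: {
        candidateId: id,
        skills: skills.get(id) || [],
        languages: languages.get(id) || [],
      },
    });
  }
  return joined;
}

// Made-up rows: recA appears twice, recB has no skills.
const joined = joinForScoring(
  [{ json: { id: 'recA' } }, { json: { id: 'recB' } }, { json: { id: 'recA' } }],
  [
    { json: { 'Candidate Record IDs': ['recA'], Skill: 'SQL' } },
    { json: { 'Candidate Record IDs': ['recA'], Skill: 'Python' } },
  ],
  [{ json: { 'Candidate Record IDs': 'recB', Language: 'French' } }]
);
console.log(JSON.stringify(joined, null, 2));
```

Inside n8n the three inputs would presumably be read with something like `$('List Candidates (filtered)').all()`. Whether that call covers every run of a chunked upstream node is exactly what I am unsure about, so comparing the joined counts against the `_diag` numbers would be a useful check.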
If it helps, I can post the current snippets and a minimal dataset. Thanks a lot for any pointers or best practices!