I’m building a Q&A workflow using n8n with a Vector Store Retriever
connected to Postgres PGVector.
The user sends a JSON payload from the frontend with the following structure:
{
  "question": "Is Benito Juárez's birthday considered a non-working day?",
  "collection": "1952708",
  "category_id": ["35724307", "35724308"]
}
I receive the data and parse it in a Set node to extract the question, collection, and category_id.
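To make that parsing step concrete outside n8n, here is a minimal Python sketch of what I mean. The function name `parse_payload` and the output key `category_ids` are hypothetical, and it assumes `category_id` may arrive either as a single value or as an array:

```python
import json
from typing import Any, Dict


def parse_payload(raw: str) -> Dict[str, Any]:
    """Extract the question, collection, and a list of category IDs from the JSON body."""
    data = json.loads(raw)
    category_ids = data.get("category_id", [])
    # Accept a single value or an array, and always hand back a list.
    if not isinstance(category_ids, list):
        category_ids = [category_ids]
    return {
        "question": data["question"],
        "collection": str(data["collection"]),
        # IDs are kept as strings, matching how they appear in the payload.
        "category_ids": [str(c) for c in category_ids],
    }
```

Normalizing to a list up front means the single-category and multi-category cases go through the same code path afterwards.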
I want the Vector Store Retriever to:
- Search only within the specified collection.
- Filter results by one or multiple category_id values, which are sent as an array.
When I pass a single value for category_id in the Vector Store Retriever, the filtering works perfectly.
However, when I pass two or more values (as an array), the retriever returns an empty result and no error message is displayed.
It seems that the retriever interprets the array incorrectly, as if the metadata contains category_id = a,b rather than filtering by category_id = a OR category_id = b.
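The difference between the two interpretations can be illustrated with a small in-memory Python sketch. This is only my guess at what is happening, not n8n's actual implementation; the function names and sample rows are hypothetical:

```python
from typing import Dict, List


def matches_joined(metadata: Dict[str, str], key: str, values: List[str]) -> bool:
    # Suspected behaviour: the array is collapsed to "a,b" and compared as one string.
    return metadata.get(key) == ",".join(values)


def matches_any(metadata: Dict[str, str], key: str, values: List[str]) -> bool:
    # Desired behaviour: category_id = a OR category_id = b.
    return metadata.get(key) in values


# Hypothetical rows, each carrying a single category_id in its metadata.
rows = [
    {"category_id": "35724307"},
    {"category_id": "35724308"},
    {"category_id": "99999999"},
]
wanted = ["35724307", "35724308"]

joined_hits = [r for r in rows if matches_joined(r, "category_id", wanted)]
any_hits = [r for r in rows if matches_any(r, "category_id", wanted)]
```

With a single value both functions agree, which would explain why one category works. With two values, `joined_hits` is empty while `any_hits` contains the first two rows.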
This is what the metadata for the embeddings looks like in my database:
{
  "loc": {
    "lines": {
      "to": 28,
      "from": 1
    }
  },
  "source": "blob",
  "blobType": "text/plain",
  "filename": "Oficio día inhábiles Aduana de Queretaro.pdf",
  "idTrafico": "1952708",
  "idDocumento": "35724300"
}
My question is: how can I properly filter by multiple category_id values in the Vector Store Retriever? I need the retriever to accept one or more categories because of how the frontend works.
In my frontend, the user can:
- Select one or more documents they want to query.
- These documents are associated with one or more category_id values.
- The frontend sends the selected category_id values as an array along with the question, so the retriever should only search within those categories.
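Expressed directly against Postgres, the search I am after would look roughly like the sketch below. Everything schema-related here is an assumption: the table name `n8n_vectors`, the columns `text`, `metadata`, and `embedding`, and the metadata keys `collection` and `category_id` are placeholders to be replaced with whatever is actually stored. It relies on pgvector's `<=>` cosine-distance operator and on a driver such as psycopg that adapts a Python list to a Postgres array:

```python
from typing import List, Sequence, Tuple


def build_filtered_search(
    query_embedding: Sequence[float],
    collection: str,
    category_ids: List[str],
    limit: int = 4,
) -> Tuple[str, tuple]:
    """Return a parameterized query that searches one collection and any of several categories."""
    sql = (
        "SELECT text, metadata "
        "FROM n8n_vectors "  # assumed table name
        "WHERE metadata->>'collection' = %s "  # assumed metadata key
        "AND metadata->>'category_id' = ANY(%s) "  # matches a OR b OR ...
        "ORDER BY embedding <=> %s::vector "  # pgvector cosine distance
        "LIMIT %s"
    )
    # pgvector accepts vectors in the text form "[x1,x2,...]".
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    return sql, (collection, category_ids, vector_literal, limit)


sql, params = build_filtered_search(
    [0.1, 0.2], "1952708", ["35724307", "35724308"]
)
```

The pair would then be passed to something like `cursor.execute(sql, params)`. The `= ANY(...)` clause is what gives the OR behaviour across the array, and it still works when the array holds a single value.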
Here is my workflow: