Issue with pgvector: keeps inserting the same document over and over again

Hi!

I’m using n8n to populate a pgvector database with content I have produced in the past (newsletters, blog posts, videos, etc.).

I have a simple workflow that reads markdown files from disk and inserts them into pgvector if they haven’t been inserted before.

When I try with 1 item (using the Limit node), everything works as expected.
However, when I remove the Limit, n8n starts to continuously insert the same documents into the vector store over and over again.

In the logs for pgvector, I see the following message repeating:

pgvector-1  | 2024-10-16 10:45:27.215 UTC [3104] ERROR:  column "collection_id" of relation "items" already exists
pgvector-1  | 2024-10-16 10:45:27.215 UTC [3104] STATEMENT:
pgvector-1  |           CREATE TABLE IF NOT EXISTS n8n_vector_collections (
pgvector-1  |             uuid uuid NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
pgvector-1  |             name character varying,
pgvector-1  |             cmetadata jsonb
pgvector-1  |           );
pgvector-1  |
pgvector-1  |           CREATE INDEX IF NOT EXISTS idx_n8n_vector_collections_name ON n8n_vector_collections(name);
pgvector-1  |
pgvector-1  |           ALTER TABLE items
pgvector-1  |             ADD COLUMN collection_id uuid;
pgvector-1  |
pgvector-1  |           ALTER TABLE items
pgvector-1  |             ADD CONSTRAINT items_collection_id_fkey
pgvector-1  |             FOREIGN KEY (collection_id)
pgvector-1  |             REFERENCES n8n_vector_collections(uuid)
pgvector-1  |             ON DELETE CASCADE;
pgvector-1  |
pgvector-1  | 2024-10-16 10:45:35.593 UTC [3107] ERROR:  column "collection_id" of relation "items" already exists
pgvector-1  | 2024-10-16 10:45:35.593 UTC [3107] STATEMENT:
pgvector-1  |           CREATE TABLE IF NOT EXISTS n8n_vector_collections (
pgvector-1  |             uuid uuid NOT NULL DEFAULT uuid_generate_v4() PRIMARY KEY,
pgvector-1  |             name character varying,
pgvector-1  |             cmetadata jsonb
pgvector-1  |           );
pgvector-1  |
pgvector-1  |           CREATE INDEX IF NOT EXISTS idx_n8n_vector_collections_name ON n8n_vector_collections(name);
pgvector-1  |
pgvector-1  |           ALTER TABLE items
pgvector-1  |             ADD COLUMN collection_id uuid;
pgvector-1  |
pgvector-1  |           ALTER TABLE items
pgvector-1  |             ADD CONSTRAINT items_collection_id_fkey
pgvector-1  |             FOREIGN KEY (collection_id)
pgvector-1  |             REFERENCES n8n_vector_collections(uuid)
pgvector-1  |             ON DELETE CASCADE;

I’m not sure if that is related to my workflow or a bug within n8n.
Any assistance is greatly appreciated!

Workflow

Information on your n8n setup

  • n8n version: 1.63.4
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): default (unchanged)
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker compose
  • Operating system: macOS 15.0

Hey @xavierd , you might find the following tutorial useful, https://www.youtube.com/watch?v=PEI_ePNNfJQ. The part discussing populating the Vector database starts from ~11 min.

I had the same issue, deleting it using a query works fine for me.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.