How to Correctly Map Fields and Metadata in PGVector Node?

Hi everyone,

I’m fetching data from PostgreSQL (confluence_id, content, title), generating embeddings with OpenAI (“text-embedding-3-large”, 3072 dimensions), and trying to store the embeddings, confluence_id, and title in another table using the PGVector node.

My issues:

  • confluence_id is always NULL in the target table
  • content and title are not stored as expected
  • metadata does not save confluence_id and title correctly

It’s really important for me to correctly associate the confluence_id with each chunk, because I need to keep the database up to date if a Confluence page changes. That way, I can search for chunks by their confluence_id and update them, instead of always generating new embeddings for the same page.

I’ve confirmed the vector dimensions match (3072), but I can’t solve the mapping problem.
Workflow and screenshot are attached.

Any ideas on what I might be doing wrong or how to properly map the fields in the PGVector node? Does my workflow make sense or am I missing something?

Thanks a lot for your help!





Hey @CodeDobby hope all is well,

I am assuming you want to store the title and confluence_id in metadata (since this is what you included in the set node. Given that Postgres PGVector Store only accepts columns (id, content, embedding and metadata), you don’t need to have extra columns in the table for confluence_id and title.

What you want to do is include content in the set node as well, so it is passed to the next node:

Here is what I did:

and my db looks like this (10 entries → 10 rows):

Hope it helps.

1 Like

Okay, so I’ve tried your Workflow and I do think it will work but for some reason I’m getting this error: “The ‘JSON Output’ in item 0 contains invalid JSON”

I can’t find the problem, I’ve researched and it might has to do with my content structure.
I have two separate Workflows, one for loading the paages from the Confluence Wiki into the table inside PostgreSQL and then the other one for converting into vectors.

Here is my Workflow for getting the pages into my table. I’m using the confluence REST API so i get the HTML and not the JSON, but it seems like, I need a JSON instead of HTML?

Thanks :smiling_face:

You can figure out which one is a violating field by hard coding them one by one until the problem disappears. This is how you can pinpoint which expression you need to work on. From there you can try to jsonify the string.

Take a look at the following example:

If you run it, you will see that because the input string has quotes in it, when we try to put it into a json it will throw an error, because the ‘internal’ quotes now need to be escaped.

If you look at your content it also has quotes, which is most likely what causes the issue.