Retrieve Vector ID After Document Upsert in Pinecone Integration

The idea is:

The Pinecone integration node in n8n should return the vector ID after a document is upserted into a Pinecone index. Currently, the node only returns the metadata and pageContent fields, making it difficult for users to track and reference the vector ID in future operations.

My use case:

When inserting documents into a Pinecone index via the Pinecone integration in n8n, I need to track the vector IDs for future operations such as updates, deletions, or retrievals. However, the node does not return the vector ID after the upsert, making it challenging to manage the data effectively.

I think it would be beneficial to add this because:

This feature will solve the problem of tracking vector IDs after upserts, allowing for more efficient management of data within the Pinecone index. Without the vector ID, users are limited in their ability to reference the inserted data in future workflows.

Any resources to support this?

Link to Pinecone API Documentation on Upserts

Are you willing to work on this?

No

I think this is an important feature for working with all of the vector stores provided by n8n. After adding a document's embeddings to the vector store, you cannot update those embeddings when the original document changes. I looked at the source code, and it seems that the response from the vector store population is ignored when it succeeds (n8n/packages/@n8n/nodes-langchain/nodes/vector_store/shared/createVectorStoreNode.ts at master · n8n-io/n8n · GitHub). I think it would be nice to be able to optionally set the ID of a row in the vector store and to return the IDs of the rows after creation. I can help create a PR, but I would first like to get some feedback from the core team and discuss a general strategy for interacting better with vector stores.

Just to echo that this is vital - without a way to retrieve the vector IDs there’s no way to manage updates / deletions and so on

As a workaround, I’ve ended up sending a vector query made of 3,072 zeros so that the vector search returns everything. (I’m using text-embedding-3-large, which has 3,072 dimensions; if you’re using a different embedding model, you’ll need a different number of zeros.)
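A minimal sketch of the zero-vector workaround above, written as the JSON body one might send to Pinecone's query endpoint (for example from an HTTP Request node). The helper name `build_list_all_query` and the default values are assumptions for illustration only. Note that the results are still capped by `topK`, so this only "returns everything" while the index holds fewer records than that limit.

```python
import json


def build_list_all_query(dimension: int = 3072, top_k: int = 100, namespace: str = "") -> dict:
    """Build a query body whose vector is all zeros (hypothetical helper).

    dimension must match the index dimension, e.g. 3072 for
    text-embedding-3-large. Matches come back with their IDs, which is
    the point of the workaround.
    """
    return {
        "vector": [0.0] * dimension,  # all-zero query vector
        "topK": top_k,                # upper bound on the number of matches returned
        "includeMetadata": True,      # return metadata so records can be recognised
        "includeValues": False,       # the stored vectors themselves are not needed
        "namespace": namespace,
    }


if __name__ == "__main__":
    body = build_list_all_query(dimension=3072, top_k=100)
    # Print only a summary; the full vector is 3,072 zeros.
    print(json.dumps({**body, "vector": f"[0.0] * {len(body['vector'])}"}, indent=2))
```

Each match in the response carries an `id` field, which can then be passed on to an update or delete step.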

Agreed, this is a problem, specifically because you can’t search Pinecone by metadata: those metadata fields would need to be indexed in Pinecone, so you can’t really create your own ID.

Ditto. The Pinecone node's update operation is impossible to use, as there’s no way to get the Pinecone ID.

The workaround of deleting the whole store is crazy and impractical - we have thousands of records.

We need to be able to set the _ID when inserting a record, or to have the upsert method implemented. Note that setting a metadata key of ID or _ID does not work: Pinecone just stores it as metadata and still creates its own unique ID.

The only current solution, as proposed in another post, is to manually make an HTTP POST to Pinecone. But with this method, how can I create the embeddings if I am not using the Pinecone node?

I seem to be stuck between a rock and a hard place with this.
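A rough sketch of the manual route described above, assuming the embeddings are produced separately (for instance by calling the embedding provider's API from an HTTP Request node) and that the resulting body is POSTed to the index's `/vectors/upsert` endpoint with the API key in an `Api-Key` header. The helpers `make_vector_id` and `build_upsert_body`, and the metadata keys `source` and `text`, are hypothetical names chosen for illustration. The idea is to derive each record's ID deterministically from the document source and chunk index, so the same ID can be rebuilt later for updates or deletions without reading it back from Pinecone.

```python
import hashlib


def make_vector_id(source: str, chunk_index: int) -> str:
    """Derive a stable ID from the document source and chunk position."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
    return f"{digest}-{chunk_index}"


def build_upsert_body(source: str, chunks: list, embeddings: list, namespace: str = "") -> dict:
    """Pair each text chunk with its precomputed embedding and a known ID."""
    if len(chunks) != len(embeddings):
        raise ValueError("chunks and embeddings must have the same length")
    vectors = [
        {
            "id": make_vector_id(source, i),  # ID chosen by us, not generated by Pinecone
            "values": list(embedding),
            "metadata": {"source": source, "text": chunk},
        }
        for i, (chunk, embedding) in enumerate(zip(chunks, embeddings))
    ]
    return {"vectors": vectors, "namespace": namespace}


if __name__ == "__main__":
    # Tiny placeholder embeddings; real ones must match the index dimension.
    body = build_upsert_body("handbook.pdf", ["first chunk", "second chunk"],
                             [[0.1, 0.2], [0.3, 0.4]])
    for vector in body["vectors"]:
        print(vector["id"], vector["metadata"]["text"])
```

Because upserting with an existing ID overwrites that record, re-running this for a changed document would replace its chunks in place, provided the chunk count does not shrink; leftover chunks would still need an explicit delete by ID.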

Same problem here; commenting so this does not get automatically closed.