Saving data as both Vectorized and clear text in the same table (Supabase, Embedding)

Zohar · June 5, 2025, 1:15pm

I’m looking for some guidance and best practices around saving information to a vector DB (Supabase in this case).

For example, I have this input data from a previous node:

{
  "page_id":  "1234",
  "url": "https://example.com/subject/enablement/",
  "execution_id": "970",
  "timestamp": "2025-06-05T14:48:12.563+02:00",
  "category_1": "Category 1 description",
  "category_2": "Category 2 description",
  "category_3": "Category 3 description"
}

My DB table has both regular columns and several vector columns.
I’d like each category to be saved both as TEXT and as VECTOR.

Questions:

Should I have two separate Supabase nodes, one for each operation?
Am I supposed to have a separate embedding node for each Category field?

Hashlogics · June 5, 2025, 2:25pm

You’re on the right track, Zohar - saving both raw text and its vector representation is a solid pattern, especially when you want traceability or future metadata-based filtering alongside similarity search.

Best Practices for Your Setup:

1. Store Clear Text and Vectors in the Same Table

Yes, you can absolutely store the text fields (category_1, category_2, etc.) as TEXT columns and their embeddings in dedicated VECTOR columns in the same Supabase table - that’s a common hybrid RAG approach.

Your schema might look like this:

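-- The VECTOR type comes from the pgvector extension, which must be enabled first.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE categories (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),  -- generated if the insert omits it
  page_id TEXT,
  url TEXT,
  execution_id TEXT,
  timestamp TIMESTAMPTZ,
  -- Each category is stored as raw text next to its embedding.
  -- 1536 must match the output dimension of the embedding model you use.
  category_1 TEXT,
  category_1_vector VECTOR(1536),
  category_2 TEXT,
  category_2_vector VECTOR(1536),
  category_3 TEXT,
  category_3_vector VECTOR(1536)
);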

2. Embedding: One Node per Category Field

If you’re using an embedding model (like OpenAI, Cohere, etc.) inside n8n or Make:

You need a separate call per field unless you batch them and can map the results back to the right fields afterwards.
Many APIs let you send multiple strings at once, but the response is typically an array, so make sure you’re aligning the output correctly (e.g. response[0] → category_1_vector).

So you can:

Use a loop (or series of API calls) if you’re going sequential
Or use a batch call + mapping logic if your platform allows that cleanly
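
As a rough illustration of the batch call plus mapping logic, here is a minimal Python sketch. The `embed_batch` callable is a hypothetical stand-in for whatever embedding request your platform makes, and the sketch assumes the API returns one vector per input string in the same order as the inputs, which is worth confirming for your provider. `fake_embed_batch` exists only so the example runs without a network call.

```python
from typing import Callable, Dict, List, Sequence

# Field names taken from the input data earlier in the thread.
CATEGORY_FIELDS = ("category_1", "category_2", "category_3")


def embed_categories(
    record: Dict[str, str],
    embed_batch: Callable[[Sequence[str]], List[List[float]]],
    fields: Sequence[str] = CATEGORY_FIELDS,
) -> Dict[str, List[float]]:
    """Embed several text fields in one batch call and map results back by position."""
    texts = [record[name] for name in fields]
    vectors = embed_batch(texts)  # assumption: output order matches input order
    if len(vectors) != len(fields):
        raise ValueError(
            f"expected {len(fields)} vectors, got {len(vectors)}"
        )
    # response[0] -> category_1_vector, response[1] -> category_2_vector, ...
    return {f"{name}_vector": vec for name, vec in zip(fields, vectors)}


def fake_embed_batch(texts: Sequence[str]) -> List[List[float]]:
    """Hypothetical stand-in for a real embedding API: returns tiny 2-dim vectors."""
    return [[float(len(text)), float(index)] for index, text in enumerate(texts)]


if __name__ == "__main__":
    record = {
        "page_id": "1234",
        "category_1": "Category 1 description",
        "category_2": "Category 2 description",
        "category_3": "Category 3 description",
    }
    print(sorted(embed_categories(record, fake_embed_batch)))
```

The length check is there so that a partial or failed batch raises an error instead of silently shifting vectors onto the wrong columns.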

3. Supabase Write: Single Node or Split

If you’re inserting/upserting everything at once and all values are already computed (text + embeddings), one Supabase node is fine. Just construct the full payload.

But if you’re embedding fields asynchronously (e.g., spaced across nodes or delayed), you may need:

A staging object to collect data
Or a multi-step flow where the final node writes the full record
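
To make "construct the full payload" concrete, here is a small Python sketch that combines the text fields and the computed embeddings into one row. The helper names are hypothetical. The `[x,y,z]` string is pgvector's text input format; whether your Supabase node expects that string or a plain JSON array is an assumption to verify against your setup.

```python
from typing import Dict, List, Union

EXPECTED_DIM = 1536  # matches VECTOR(1536) in the schema above

Row = Dict[str, Union[str, List[float]]]


def to_pgvector_literal(vector: List[float]) -> str:
    """Serialize a vector in pgvector's text form, e.g. '[0.5,1.0]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


def build_row(
    record: Dict[str, str],
    vectors: Dict[str, List[float]],
    dim: int = EXPECTED_DIM,
) -> Dict[str, str]:
    """Merge text columns and vector columns into a single insert payload."""
    row = dict(record)  # copy so the incoming item is not mutated
    for column, vector in vectors.items():
        if len(vector) != dim:
            # A wrong dimension would be rejected by the VECTOR(dim) column anyway;
            # failing here gives a clearer error earlier in the workflow.
            raise ValueError(
                f"{column}: expected {dim} dimensions, got {len(vector)}"
            )
        row[column] = to_pgvector_literal(vector)
    return row


if __name__ == "__main__":
    # Tiny 2-dim vectors purely for demonstration.
    demo = build_row(
        {"page_id": "1234", "category_1": "Category 1 description"},
        {"category_1_vector": [0.5, 1.0]},
        dim=2,
    )
    print(demo)
```

With the row built this way, a single insert or upsert node can write text and vectors together.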

Let me know if you want to see a visual (like an n8n workflow or SQL example). Also happy to share how we do this internally at Hashlogics for RAG apps that use hybrid filtering and vector search in Supabase.


Zohar · June 5, 2025, 3:39pm

@Hashlogics thanks a lot for the detailed answer!

I don’t really mind if I need to do several requests if that is the way to go.
Does this look right to you?

Zohar · June 5, 2025, 3:48pm

While we’re at it, let me ask: is it possible in n8n to run the embedding model independently of any DB/vector store?
That is, vectorize the input in one node and use the result later to insert into the DB, instead of having it all done as an atomic operation in a single [node+embedding-model+text-splitter] combo?

Hashlogics · June 17, 2025, 12:02pm

Yes, you can absolutely run the embedding independently of the DB interaction in n8n. We’ve done this frequently when we want more control over when and how the data is stored:

1. Embedding as a standalone step

You can run your input through the embedding model (e.g., OpenAI’s /embeddings endpoint) and simply store the resulting vector in a variable or a Set node. That gives you flexibility to reuse or batch it downstream without tying it directly to a DB action.

2. Staging the vector

If you’re working with multiple embedding nodes (like in your case), you can stage each output into a Merge node or custom data object before the final Supabase write. That way you maintain control over when the insert happens, and you can structure the full payload precisely.

3. Separation is helpful for debugging and performance

We often recommend splitting these steps, especially when testing or troubleshooting vector logic, because it keeps your n8n execution trace cleaner and more debuggable.
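
The staging idea can be sketched in Python as a merge by key. This is only an illustration of the logic and not how n8n’s Merge node is implemented: `merge_by_key` and `missing_columns` are hypothetical helpers, and the sketch assumes every embedding branch passes `page_id` along with its vector.

```python
from typing import Any, Dict, Iterable, List, Sequence


def merge_by_key(
    items: Iterable[Dict[str, Any]], key: str = "page_id"
) -> List[Dict[str, Any]]:
    """Combine partial items that share the same key into one record each."""
    merged: Dict[Any, Dict[str, Any]] = {}
    for item in items:
        # Later items add their fields to the record started by earlier ones.
        merged.setdefault(item[key], {}).update(item)
    return list(merged.values())


def missing_columns(row: Dict[str, Any], required: Sequence[str]) -> List[str]:
    """List required columns that have not been staged yet."""
    return [name for name in required if name not in row]


if __name__ == "__main__":
    # One partial item per embedding branch; tiny vectors for demonstration.
    branch_outputs = [
        {"page_id": "1234", "category_1": "Category 1 description"},
        {"page_id": "1234", "category_1_vector": [0.1, 0.2]},
        {"page_id": "1234", "category_2_vector": [0.3, 0.4]},
    ]
    (row,) = merge_by_key(branch_outputs)
    required = ["category_1_vector", "category_2_vector", "category_3_vector"]
    # Only write to the DB once nothing is missing.
    print(missing_columns(row, required))
```

Checking for missing columns before the final write keeps a half-embedded record from being inserted when one branch fails or runs late.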