Best way to embed Q&A Knowledge in Supabase Vector Store?

Hi there,

I organized my simple Q&A document so that multiple line breaks separate each question-and-answer pair. I would simply like to know the best way to embed Q&A documents. As you can see, I used the default data loader and the recursive text splitter, and I wanted to enter the line breaks as the separator, but it didn’t seem to work. Any help would be highly appreciated :slight_smile:

Information on your n8n setup

  • n8n version: 1.95
  • Database (default: SQLite): SQLite
  • n8n EXECUTIONS_PROCESS setting (default: own, main): own, main
  • Running n8n via (Docker, npm, n8n cloud, desktop app): docker GCP
  • Operating system: Windows 11

Hello there!
I see that you’re trying to split the text on 2 newlines, but since a newline is a special “escape character”, you need to enter it as an expression so that it is evaluated as JavaScript. So, in your case, you can try the following expression in the “Separator” field: {{ " \n \n" }}. This way the newlines are treated as real line breaks and not just as regular text. Also make sure to properly configure the “Chunk Size”, which controls the maximum number of characters in a chunk before the text splitter actually splits it.
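
To illustrate why the expression matters, here is a small sketch in plain JavaScript. It assumes, as described above, that a backslash-n typed into a non-expression field is kept as literal characters; the variable names and the sample text are made up for the example.

```javascript
// What a plain text field would hold if \n\n is typed as-is:
// four characters (backslash, n, backslash, n), no real line breaks.
const typedLiterally = "\\n\\n";

// What a JavaScript string evaluates to: two actual newline characters.
const evaluated = "\n\n";

// Hypothetical sample: two Q&A pairs separated by one blank line.
const doc = "Question: A?\nAnswer: a.\n\nQuestion: B?\nAnswer: b.";

// The literal version never matches, so the text stays in one piece.
console.log(doc.split(typedLiterally).length); // 1

// The evaluated version matches the blank line between the pairs.
console.log(doc.split(evaluated).length); // 2
```
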
Hope this helps!


It now split the text into 3 chunks, but the separator still did not trigger on some of the breaks.

This is what I entered, so I don’t quite get it.

What is your “Chunk Size”? If it’s set to a large number, the text splitter might not split the text even if it contains the separator, so try decreasing the chunk size.

Well, but isn’t that the point? The chunk size should be the maximum, and the separator should split before that. Otherwise I could just as well use only the chunk size and make it much smaller.

I had some time to test and can actually confirm that the size does matter (a phrase I didn’t think I’d say). If the chunk size is relatively small (smaller than the text I am trying to split), the splitter starts using the separator.

For all those finding this thread years later:

The text I was splitting (just over 300 characters):

Question: What is the free skincare consultation?
Answer: It’s a 1-on-1 session with one of our skincare experts to assess your skin concerns and recommend the best solutions for you!
…
Question: How long is the consultation?
Answer: The session typically lasts 15–20 minutes and is done via video or phone call.

I first set the delimiter to ..., then:

  • If I set the chunk size to 1000, there is no splitting.
  • If I leave the chunk size unset (no value), there is no splitting.
  • If I set the chunk size to 10, it splits into two chunks.

So, to Roman’s point, I think the logic is: if the chunk size is 1000, there is no need to split anything in a 300-character blob of text.
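
A rough way to picture this is a split-then-merge model: split on the separator first, then glue neighbouring pieces back together for as long as they fit in the chunk size. The sketch below is only my simplified model of that idea, not the actual n8n or LangChain code; `splitThenMerge` is a hypothetical helper, the sample is a shortened version of the text above, and unlike a real recursive splitter it keeps oversized pieces whole instead of splitting them further.

```javascript
// Hypothetical helper: a simplified split-then-merge model, not n8n's implementation.
function splitThenMerge(text, separator, chunkSize) {
  // Step 1: split on the separator and drop empty pieces.
  const pieces = text
    .split(separator)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  // Step 2: merge neighbouring pieces while the result still fits.
  const chunks = [];
  let current = "";
  for (const piece of pieces) {
    const candidate = current ? current + separator + piece : piece;
    if (candidate.length <= chunkSize) {
      current = candidate; // still fits, so keep merging
    } else {
      if (current) chunks.push(current);
      current = piece; // start a new chunk (an oversized piece is kept whole here)
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Shortened version of the sample text, with "..." as the delimiter.
const sample =
  "Question: What is the free skincare consultation?\n" +
  "Answer: It's a 1-on-1 session with one of our skincare experts.\n" +
  "...\n" +
  "Question: How long is the consultation?\n" +
  "Answer: The session typically lasts 15-20 minutes.";

console.log(splitThenMerge(sample, "...", 1000).length); // 1
console.log(splitThenMerge(sample, "...", 10).length); // 2
```

In this model the separator is always applied, but with a chunk size of 1000 the two pieces are merged straight back into one chunk, which matches what I saw in the tests above.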

Here is my workflow if you want to repro:


It does not work at all for me, @jabbson. It loads everything as one chunk… it doesn’t care about the 3 dots when I enter a chunk size of 10 characters.