Vector store embeddings context length

I have a workflow for embedding .md files into my Qdrant vector store.

For embedding I'm using Ollama with qwen3-embedding:0.6b, which has a 32K context window.

This is my workflow:

Qdrant collection settings:

{
    "vectors": {
        "size": 1024,
        "distance": "Cosine",
        "on_disk": true
    },
    "quantization_config": {
        "scalar": {
            "type": "int8",
            "always_ram": true
        }
    }
}

The Token Splitter is set to 6000 and my model's context is 32K, so I don't understand why I'm getting "input exceeds maximum context length".

Also, in the Token Splitter I'm not sure what I'm actually defining. The n8n description says "Split text into chunks by tokens", so my understanding is that 6000 is a number of tokens, but in the node settings the field is called "Chunk Size".

Information on your n8n setup

  • n8n version: 1.113.3

Yes, "Chunk Size" in the Token Splitter is measured in tokens, so 6000 means chunks of up to 6000 tokens each. If your Markdown file is very large, it will be split into multiple 6000-token chunks. If you then process all of these chunks at once (for example, by sending them all in a single request to the embedding model), you can exceed the model's context window even though each individual chunk fits. Make sure each chunk is sent to the embedding node individually, not as a batch.
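To make the distinction concrete, here is a minimal sketch of chunk-then-embed-per-chunk logic. It uses a naive whitespace split as a stand-in for the model's real tokenizer, and `embed_one` is a hypothetical callback representing one embedding request; neither is the actual n8n implementation.

```python
def split_by_tokens(text, chunk_size=6000):
    """Split text into chunks of at most chunk_size tokens.

    Whitespace splitting is only a stand-in for a real tokenizer;
    the principle (cap each chunk's token count) is the same.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]

def embed_all(text, embed_one, chunk_size=6000):
    """Embed each chunk with its own request, never the whole
    document in a single call, so no request exceeds the model's
    context window."""
    return [embed_one(chunk) for chunk in split_by_tokens(text, chunk_size)]
```

The key point: a 15,000-token document becomes three separate chunks, and each chunk must be its own embedding request; concatenating them back into one request reintroduces the context-length error.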