Hello everyone, I have deployed n8n and Ollama (Qwen3_8B) locally and built a RAG workflow in n8n. However, every time I ask a question, it takes a very long time to answer, about 15 minutes, at a speed of about 18 tokens/s. How can I improve the speed?
Also, I would like to ask what the output items of the simple vector store node mean. Thank you.
Hello hhhhhhk, welcome to the community. I’ll try my best to answer.
Okay, the first question is why it takes so long. A few things may explain it:
Qwen 8B is heavy. If you’re not on a decent GPU, it’ll crawl (15 minutes at 18 tokens/s looks like CPU speed).
The context may be too big. Sending too many chunks or too many tokens per question makes it worse.
The AI Agent adds tool-calling steps you don’t always need.
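As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes the whole 15 minutes were spent generating at a steady 18 tokens/s, which the original post does not confirm, and the function name is just for illustration:

```python
# Back-of-the-envelope estimate; assumes the full run time is spent generating tokens.
def tokens_generated(minutes: float, tokens_per_second: float) -> float:
    """Tokens produced if generation runs at a steady rate for the whole time."""
    return minutes * 60 * tokens_per_second


total = tokens_generated(15, 18)
print(total)  # 16200
```

A single answer is rarely that long, so a good part of the time probably goes somewhere else, for example processing a large prompt or several model calls made by the Agent, which fits the points above.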
The next question is how to speed it up:
You can switch to a smaller model (qwen2.5-7b-instruct Q4_K_M or 3B).
Make sure the GPU is used (nvidia-smi should show load). If not, force Ollama to use the GPU.
Reduce num_ctx (e.g. 4096) and num_predict (e.g. 256-512).
Tune your retriever: chunk size ~600-800, overlap ~100, top_k=3-5.
If you don’t need fancy tool use, skip the Agent node. Just use the LLM with the retrieved context.
Enable streaming so you at least see the first tokens sooner.
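To show how these settings fit together outside of n8n, here is a minimal sketch that calls Ollama's /api/generate endpoint directly. A few things are assumptions on my side: the chunk size is counted in characters, the chunks passed in are already ranked by your retriever, the model tag is "qwen3:8b" (use whatever `ollama list` shows), and the helper names chunk_text, build_request and stream_answer are made up for the example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def chunk_text(text: str, size: int = 700, overlap: int = 100) -> list[str]:
    """Split text into character chunks of `size` that overlap by `overlap` characters."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early so the last chunk is never fully contained in the previous one.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def build_request(question: str, chunks: list[str], top_k: int = 4) -> dict:
    """Build an /api/generate payload from the question and the first top_k chunks."""
    context = "\n\n".join(chunks[:top_k])  # assumes chunks are already ranked
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return {
        "model": "qwen3:8b",  # assumption: replace with your local model tag
        "prompt": prompt,
        "stream": True,  # tokens arrive as they are generated
        "options": {"num_ctx": 4096, "num_predict": 512},
    }


def stream_answer(payload: dict) -> None:
    """Send the payload and print tokens as they arrive (needs a running Ollama server)."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # streaming responses are one JSON object per line
            if not line.strip():
                continue
            part = json.loads(line)
            print(part.get("response", ""), end="", flush=True)
            if part.get("done"):
                break
```

With Ollama running, something like stream_answer(build_request("your question", retrieved_chunks)) should print the answer token by token. This is the "plain LLM with retrieved context" path with no Agent in between, so there is only one model call per question.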
Okay, the last question is what the Simple Vector Store outputs mean.
From your screenshots, the node exposes multiple ports:
The output in the second screenshot passes items through (for normal flow control).
VectorStoreModel is a handle to the vector store/retriever. You can also connect this to tools like the AI Agent. It doesn’t emit documents; it passes an object that the downstream node uses to query.
Embeddings is the raw numeric vectors (float arrays) corresponding to the input.
EmbeddedDocument (sometimes labeled EmbeddDocument/Document) is your documents augmented with their embeddings (metadata + text + vector). Use it if you need to inspect what was actually stored.
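To make the difference between raw embeddings and embedded documents more concrete, here is a small illustrative sketch. The class and field names (EmbeddedDoc, text, metadata, vector) and the toy 3-dimensional vectors are made up for illustration and are not n8n's actual schema. It only shows the general idea: a store keeps text + metadata + vector, and a downstream node queries it by similarity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class EmbeddedDoc:
    """Hypothetical record: a document stored together with its embedding."""
    text: str
    vector: list[float]
    metadata: dict = field(default_factory=dict)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(store: list[EmbeddedDoc], query_vector: list[float], top_k: int = 3) -> list[EmbeddedDoc]:
    """Return the top_k stored documents most similar to the query vector."""
    ranked = sorted(store, key=lambda d: cosine(d.vector, query_vector), reverse=True)
    return ranked[:top_k]


store = [
    EmbeddedDoc("n8n is a workflow tool", [1.0, 0.0, 0.0], {"source": "a.txt"}),
    EmbeddedDoc("Ollama runs local models", [0.0, 1.0, 0.0], {"source": "b.txt"}),
]
best = query(store, [0.9, 0.1, 0.0], top_k=1)
print(best[0].text)  # n8n is a workflow tool
```

Here the vector lists alone play the role of the raw embeddings, while each EmbeddedDoc is the "text + metadata + vector" bundle you would inspect to see what was stored.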
That’s all I can answer for now. I hope this helps you understand it, and if you have any follow-up questions, just reply. Keep up the spirit!