Describe the problem/error/question
I’m running n8n locally through Docker Desktop for Windows. I have a workflow that ingests into a Qdrant Vector Store through HTTP Requests. The workflow in question receives the following items: pageContent, fileName, filePath. I then split into batches of 1,000 (which my system can handle). I send each batch to a code node, where I chunk the data and return them:
```js
chunks.push({
  json: {
    pageContent: chunkText,
    metadata: {
      file_name: fileName,
      file_path: filePath,
      item_number: globalOffset + index,
      chunk_index: chunkIndex,
      type: type
    },
    id: uniqueId
  }
});
```
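For context, that push sits inside a loop along these lines (a simplified sketch, not my exact code; the fixed chunk size, the `type` value, and the ID scheme here are illustrative placeholders):

```js
// Simplified sketch of the chunking code node.
// Assumes each incoming item carries pageContent, fileName, filePath.
function chunkItems(items, globalOffset = 0, chunkSize = 1000) {
  const chunks = [];
  items.forEach((item, index) => {
    const { pageContent, fileName, filePath } = item.json;
    // Walk the text in fixed-size windows, one chunk per window.
    for (let chunkIndex = 0; chunkIndex * chunkSize < pageContent.length; chunkIndex++) {
      const chunkText = pageContent.slice(
        chunkIndex * chunkSize,
        (chunkIndex + 1) * chunkSize
      );
      chunks.push({
        json: {
          pageContent: chunkText,
          metadata: {
            file_name: fileName,
            file_path: filePath,
            item_number: globalOffset + index,
            chunk_index: chunkIndex,
            type: "text" // placeholder for my real type field
          },
          id: `${globalOffset + index}-${chunkIndex}` // placeholder ID scheme
        }
      });
    }
  });
  return chunks;
}
```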
I then aggregate the pageContent of each chunk into a single array of strings and prepare the body for the embeddings HTTP Request:
```js
const allContent = items.map(item => item.json.pageContent);
return {
  model: "mxbai-embed-large",
  input: allContent
};
```
The HTTP Request hits my ollama container with a post request /api/embed.
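For reference, the exchange with `/api/embed` looks roughly like this (based on my setup, so treat the shapes as assumptions; Ollama returns one embedding per input string in an `embeddings` array):

```js
// Rough shape of the /api/embed request body I send:
// { model: "mxbai-embed-large", input: ["chunk one", "chunk two", ...] }
// and the response I get back:
// { model: "mxbai-embed-large", embeddings: [[0.01, ...], [0.02, ...]] }

// Small guard to sanity-check the response before pairing
// embeddings back to chunks.
function checkEmbedResponse(response, inputs) {
  if (!Array.isArray(response.embeddings)) {
    throw new Error("No embeddings array in response");
  }
  if (response.embeddings.length !== inputs.length) {
    throw new Error(
      `Expected ${inputs.length} embeddings, got ${response.embeddings.length}`
    );
  }
  return response.embeddings;
}
```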
I get embeddings for all 1,000 chunks of text, and then I have another code node which points each of those embeddings to their respective chunks, and gets the data ready for ingestion:
```js
const chunks = $("DocumentLoader & TextSplitter").all();
const batchResult = $("Embedding").first().json.embeddings;

if (chunks.length !== batchResult.length) {
  throw new Error(`Mismatch! Chunks: ${chunks.length}, Vectors: ${batchResult.length}`);
}

const points = chunks.map((chunk, i) => {
  return {
    id: chunk.json.id,
    vector: batchResult[i],
    payload: {
      pageContent: chunk.json.pageContent,
      ...chunk.json.metadata
    }
  };
});

return {
  points: points
};
```
I then send an HTTP request to my Qdrant container, to create the points.
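Concretely, that request is a PUT against Qdrant's points upsert endpoint, roughly like the following (the host and collection name are placeholders for my setup; `?wait=true` makes Qdrant acknowledge only after the points are persisted):

```js
// Builds the HTTP Request node parameters for one batch upsert.
// Host and collection are placeholders for my actual setup.
function buildQdrantUpsert(host, collection, points) {
  return {
    method: "PUT",
    url: `${host}/collections/${collection}/points?wait=true`,
    headers: { "Content-Type": "application/json" },
    body: { points }
  };
}
```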
This all works fine, but for larger files I get an 'Invalid string length' error in my n8n container. The process works for the first ~20 batches, then throws that error and stalls (the workflow itself doesn't error out). I have tried selecting 'Do not save' for execution progress (success and failure), reducing the batch size, and adding certain environment variables to my compose file. None of these have worked so far. The error appears after the workflow has already created 40,000 points and is still executing, so I suspect n8n is accumulating all of the data passed between nodes into one huge string somewhere, and I'm not sure how to stop it from doing that.

One solution I just thought of is moving everything after the batch split into a separate sub-workflow that returns nothing. That sub-workflow would be called multiple times (20+ in this case), but it wouldn't hold onto the data from each execution in one big string. Is this a good solution, and would it work?
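The idea would be to end the sub-workflow with a final Code node that throws away the bulky point data and returns only a tiny status item, something like this (a sketch of my proposal, not tested yet):

```js
// Final code node of the proposed sub-workflow: discard the full
// chunk/point payload and return only a small summary item, so the
// parent workflow never accumulates the big data across iterations.
function summarizeBatch(points) {
  return [{ json: { success: true, pointCount: points.length } }];
}
```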
What is the error message (if any)?
'Invalid string length'
Please share your workflow
Share the output returned by the last node
Information on your n8n setup
- n8n version:
- Database (default: SQLite):
- n8n EXECUTIONS_PROCESS setting (default: own, main):
- Running n8n via (Docker, npm, n8n cloud, desktop app): Docker (Docker Desktop)
- Operating system: Windows