How to Build a Multi-Stage AI Content Pipeline with n8n + Ollama

I’ve been experimenting with Ollama for content generation in n8n, and found that chaining multiple focused AI calls produces much better results than a single big prompt.

Here’s the approach — a 4-stage pipeline where each node refines the previous output:

The Pipeline

Stage 1: Research → Generate key points about the topic (temperature: 0.7)
Stage 2: Outline → Structure those points into a logical flow (temperature: 0.6)
Stage 3: Draft → Write the full content (temperature: 0.8)
Stage 4: Edit → Polish grammar, flow, and clarity (temperature: 0.3)
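As a quick sketch, the stage schedule can be captured as plain data — the temperatures come from the list above; the one-line job descriptions are just summaries:

```python
# Four-stage schedule: name, temperature, and the job each call performs.
# Note the deliberate pattern: higher temperature for creative stages
# (research, draft), lower for structural/editing stages (outline, edit).
PIPELINE = [
    ("research", 0.7, "Generate key points about the topic"),
    ("outline",  0.6, "Structure those points into a logical flow"),
    ("draft",    0.8, "Write the full content"),
    ("edit",     0.3, "Polish grammar, flow, and clarity"),
]
```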

Why Multi-Stage Works Better

When you ask an LLM to “write a blog post about X” in one shot, it tends to:

  • Rush through important points
  • Produce shallow content
  • Miss logical structure

By breaking it into stages, each call has a focused job and builds on quality input from the previous stage.

Quick Setup

Each stage is just an HTTP Request node hitting Ollama’s API:

POST http://localhost:11434/api/generate

The key is the prompt engineering at each stage. Here’s the research stage as an example:

{
  "model": "llama3:8b",
  "prompt": "You are a research assistant. Research the following topic and provide 5-7 key points that should be covered in a comprehensive blog post. Include specific facts and statistics where relevant.\n\nTopic: {{ $json.topic }}",
  "stream": false,
  "options": { "temperature": 0.7, "num_predict": 1024 }
}

Each subsequent stage references {{ $json.response }} from the previous HTTP Request output.
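Outside n8n, the same chaining pattern can be sketched in Python. Here `call` stands in for the HTTP POST to Ollama (not a real client), and the payload shape mirrors the node body above:

```python
def build_payload(prompt, temperature, num_predict=1024):
    """Mirror the HTTP Request node body: non-streaming, per-stage options."""
    return {
        "model": "llama3:8b",
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature, "num_predict": num_predict},
    }

def chain(topic, stages, call):
    """Feed each stage's 'response' field into the next stage's prompt,
    just as the n8n expression {{ $json.response }} does."""
    text = topic
    for template, temperature in stages:
        payload = build_payload(template.format(text), temperature)
        text = call(payload)["response"]
    return text
```

In the real workflow, `call` would POST the payload to `http://localhost:11434/api/generate` and parse the JSON reply; the key point is that each stage’s prompt template embeds the previous stage’s `response` text.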

Model Recommendations

Model        VRAM    Best For
llama3:8b    ~5GB    Good all-around, fast
mistral      ~4GB    Concise, fast
llama3:70b   ~40GB   Highest quality

Tips

  • Set "stream": false — n8n needs the complete response, not chunks
  • Use different temperatures per stage — creative for drafting, conservative for editing
  • Add "num_predict": 4096 for the draft stage to avoid truncation
  • If n8n runs in Docker, use http://host.docker.internal:11434 instead of localhost — inside the container, localhost points at the container itself, not the host running Ollama
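The last tip can be wrapped in a small helper. This is a sketch: checking for `/.dockerenv` is a common heuristic for detecting a Docker container, and 11434 is Ollama’s default port:

```python
import os

def ollama_base_url():
    # Inside a container, localhost points at the container itself,
    # so the host's Ollama must be reached via host.docker.internal.
    in_docker = os.path.exists("/.dockerenv")
    host = "host.docker.internal" if in_docker else "localhost"
    return f"http://{host}:11434"
```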

Has anyone else built multi-stage AI pipelines in n8n? I’d love to hear what patterns you’ve found useful!