Stream AI responses in the HTTP response, LLM chain, and AI agent nodes

The idea is:

To add streaming capabilities to HTTP responses, LLM chains, and AI agents in n8n.

My use case:

We’re heavy users of other automation tools, but recently I tried n8n and fell in love with it. Unfortunately, we can’t use n8n for our projects because it lacks the ability to stream responses, which is essential for us.

I think it would be beneficial to add this because:

Without streaming, it’s hard to use AI agents in n8n because we can’t see what’s happening during the process. And in general this makes the experience less user-friendly. Almost all AI tools today include streaming, so adding it to n8n would make it much more useful.

Are you willing to work on this?

I’m interested in working on this feature, but I’m new to n8n and could use some guidance. I’m worried I might not do it right or it might take me a long time. Any help or direction would be greatly appreciated.

Could we use the “ai” npm package by Vercel to help with this? Also, do we need to make changes to the HTTP response, LLM chain, and AI agent nodes, or could we just modify the HTTP response node?
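
For reference, here is roughly what streaming looks like with Vercel’s “ai” package. This is a minimal standalone Node sketch, not n8n code; it assumes the “ai” and “@ai-sdk/openai” packages are installed and OPENAI_API_KEY is set, and the exact API surface varies a bit between SDK versions:

    import { streamText } from 'ai';
    import { openai } from '@ai-sdk/openai';

    // Start a streaming completion (model and prompt are placeholders)
    const result = await streamText({
        model: openai('gpt-4o'),
        prompt: 'Say this is a test',
    });

    // Consume tokens incrementally as they arrive
    for await (const textPart of result.textStream) {
        process.stdout.write(textPart);
    }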

If you’re also interested in this feature, upvote this post!

omg! I hope you make it, because this is the only thing that forces me to use another solution that is worse in many other aspects :dizzy_face:


I explored an alternative approach to achieve this, although it isn’t fully fine-tuned yet.

Please note, this method only works on self-hosted n8n instances where the NODE_FUNCTION_ALLOW_EXTERNAL environment variable is set to allow importing external modules (e.g., openai, node-fetch) in the ‘Code’ node.

This setup allows you to capture chunks from the OpenAI API (SSE) and stream them in real time to a webhook (which could also be hosted on n8n) for further processing, such as generating audio with TTS, sending messages, or forwarding via another protocol like WSS. While not perfect, it has proven effective in reducing latency for some voicebots I’ve implemented with TwiML. It would be even more beneficial if n8n’s webhook or chat framework supported SSE or WSS directly.
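
As a rough sketch of that last idea (forwarding chunks over a WebSocket instead of HTTP POSTs), assuming the “ws” package is whitelisted as an external module and the endpoint URL is a placeholder:

    const WebSocket = require('ws');

    // Connect to the downstream consumer (placeholder URL)
    const socket = new WebSocket('wss://your_wss_endpoint_here');

    // Wait for the connection to open before sending anything
    await new Promise((resolve, reject) => {
        socket.once('open', resolve);
        socket.once('error', reject);
    });

    // Inside the streaming loop from the guide below, replace the fetch() call with:
    //   socket.send(JSON.stringify(chunk));

    // And close the socket once the stream ends:
    //   socket.close();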

Step-by-Step Guide to Implement Streaming with OpenAI in n8n

Step 1: Configure the OpenAI Node

  1. Select the OpenAI Model:

    • Add an OpenAI node to your workflow in n8n.
    • Choose the desired model (e.g., gpt-4o); the OpenAI API key will be entered directly in the Code node later (not as an environment variable).
  2. Override Host URL:

    • Set the Host URL in the OpenAI node settings to the URL of a webhook that will trigger a Code node.
    • Example: https://your-n8n-instance/webhook/openai-trigger.

Step 2: Set Up the Webhook and Code Node

  1. Add a Webhook Node:

    • Drag a Webhook node into your workflow and configure it to trigger on the URL you set in the OpenAI node (/webhook/openai-trigger).
  2. Add a Code Node:

    • Connect the Webhook node to a Code node.
    • Paste the following code into the Code node:
    const OpenAI = require('openai');
    const fetch = require('node-fetch'); // node-fetch v2 works with require(); on Node 18+ the global fetch also works
    
    // Initialize the OpenAI client with the API key
    const client = new OpenAI({
        apiKey: 'your_openai_api_key_here'  // Replace with your OpenAI API key directly in the code
    });
    
    async function main() {
        try {
            const stream = await client.chat.completions.create({
                model: 'gpt-4o',
                messages: [{ role: 'user', content: 'Say this is a test' }],
                stream: true, // Enable streaming mode (SSE under the hood)
            });
    
            // Process each chunk of the stream as it arrives
            for await (const chunk of stream) {
                // Log each chunk to the console for debugging
                console.log('Received chunk:', chunk);
    
                // Forward the full chunk to another webhook as received
                await fetch('https://your_webhook_url_for_chunks', {  // Replace with your webhook URL
                    method: 'POST',
                    headers: {
                        'Content-Type': 'application/json',
                    },
                    body: JSON.stringify(chunk), // Send the entire chunk object
                });
            }
    
            // Return an n8n item; the inner payload mimics the response
            // format that n8n's OpenAI node expects back
            return [{
                json: {
                    data: [{
                        completion: 'Completed streaming data response',
                    }],
                },
            }];
    
        } catch (error) {
            console.error('Error during streaming:', error);
            // Return the error in the same n8n item format
            return [{ json: { error: error.message } }];
        }
    }
    
    return main();
    
  3. Replace Placeholders:

    • Replace 'your_openai_api_key_here' with your actual OpenAI API key.
    • Replace 'https://your_webhook_url_for_chunks' with the URL where you want to forward the streamed data.
    • Replace hardcoded text with n8n expressions (e.g., the prompt text, model, and other config options); see the sketch below.
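
As a sketch of that last point, the prompt and model can be pulled from the incoming webhook payload instead of being hardcoded. The exact paths into the item are assumptions; inspect what the OpenAI node actually posts to the webhook first. This reuses the client from the code above:

    // Read the incoming request body from the Webhook node's output
    // (these paths are assumptions; inspect your payload first)
    const body = $input.first().json.body ?? {};

    const stream = await client.chat.completions.create({
        model: body.model ?? 'gpt-4o',
        messages: body.messages ?? [{ role: 'user', content: 'Say this is a test' }],
        stream: true,
    });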

Step 3: Handle Chunks in Webhook Node

  1. Configure the Webhook to Handle Chunks:

    • Ensure the webhook receiving chunks is set up to process JSON data.
    • This webhook should handle each chunk of data as it arrives and respond using the “Respond Immediately” option with a status code (a sketch of parsing these chunks follows after this list).
  2. Send Data Back to OpenAI Node:

    • Ensure the response sent back to the OpenAI node is formatted correctly (e.g., with the necessary JSON structure expected by n8n).
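
If the receiving side needs plain text rather than raw chunks, a Code node placed after the chunk webhook can unwrap the delta. This is a sketch that assumes the standard chat-completion chunk shape:

    // Extract the incremental text from a forwarded OpenAI streaming chunk.
    // A chunk body looks roughly like:
    //   { "object": "chat.completion.chunk",
    //     "choices": [{ "delta": { "content": "..." }, "finish_reason": null }] }
    const chunk = $input.first().json.body;

    const text = chunk?.choices?.[0]?.delta?.content ?? '';
    const done = chunk?.choices?.[0]?.finish_reason === 'stop';

    return [{ json: { text, done } }];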

Step 4: Save and Test Your Workflow

  1. Save Your Workflow:

    • Click Save in n8n to ensure all changes are stored.
  2. Execute and Monitor:

    • Run the workflow and monitor the console for logs.
    • Verify that data is correctly streamed to your webhook and processed without errors.

Great! Thanks for sharing this! :grin:


Hi, I tried to get attention by filing this as a bug report (since the lack of streaming runs so hard against public expectations, as it is generally assumed to be available) here:

I suggest commenting/reacting there also to get attention and raise some awareness with the n8n folks :wink:

Commenting here to bump it up. Streaming is essential for AI responses; please add this feature as soon as possible. n8n is becoming more and more popular in the AI scene, but the other platforms offer streaming out of the box.


Could not agree more! Can’t believe it is still not available. n8n is the best on the market, but the MOST important feature is missing, and we have to endure the real pain of Langflow / Dify and other tools because streaming is not available :frowning:


Yes! This is an important feature.

I can only agree. It is the only thing that stops me from using n8n, and I’m looking at other providers that support the feature. The Webhook response should support streaming the last node’s output to the client.

We’re currently exporting the n8n workflow JSON and converting it into LangGraph, as this is the only way to get proper streaming.
I hope they soon implement streaming of agent steps and of LLM responses as well.

Have you automated the conversion to LangGraph? If so, is it something you would care to share? BR / Fredrik