The idea is:
Introduce a feature that allows the first item in a streamed data sequence to carry metadata about the rest of the stream. This metadata could include schema definitions, field descriptions, source identifiers, or processing instructions relevant to the subsequent items in the stream.
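To make the proposal concrete, here is a minimal sketch of how a consumer might separate such a leading metadata item from the rest of a stream. The marker key `__stream_meta__` and the function `split_metadata` are hypothetical names invented for this illustration. They are not existing n8n APIs, and the actual convention would need to be agreed on during design.

```python
import itertools
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical marker key: an assumption for this sketch, not an n8n convention.
META_KEY = "__stream_meta__"

Item = Dict[str, Any]


def split_metadata(stream: Iterable[Item]) -> Tuple[Optional[Item], Iterator[Item]]:
    """Return (metadata, remaining items).

    If the first item carries META_KEY, its value is treated as the stream
    metadata and that item is consumed. Otherwise the stream is passed
    through unchanged and the metadata is None.
    """
    iterator = iter(stream)
    try:
        first = next(iterator)
    except StopIteration:
        # Empty stream: no metadata and no items.
        return None, iter(())
    if META_KEY in first:
        return first[META_KEY], iterator
    # No header present: put the first item back in front of the rest.
    return None, itertools.chain([first], iterator)
```

Streams without a header still work in this sketch, so existing producers would not need to change.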
My use case:
I’m building a chat interface that consumes streamed data from various sources. Depending on the type of data (user messages, system alerts, educational content, or sensor readings), the interface needs to respond differently. For example, it might format responses differently, trigger specific workflows, or apply distinct validation rules. If the first item in the stream could include metadata describing the type and structure of the data to follow, the chat interface could adapt its behavior in real time without relying on hardcoded assumptions or external configuration.
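As a rough illustration of this use case, the sketch below picks a formatter based on a `type` field in the first item. It assumes the first item is always the metadata item. The field names (`type`, `user`, `text`, `sensor`, `value`) and the formatter functions are hypothetical and only serve the example.

```python
from typing import Any, Callable, Dict, Iterable, List

Item = Dict[str, Any]


# Hypothetical formatters, one per data type named in the metadata item.
def format_user_message(item: Item) -> str:
    return f"{item['user']}: {item['text']}"


def format_system_alert(item: Item) -> str:
    return f"[ALERT] {item['text']}"


def format_sensor_reading(item: Item) -> str:
    return f"{item['sensor']} = {item['value']}"


FORMATTERS: Dict[str, Callable[[Item], str]] = {
    "user_message": format_user_message,
    "system_alert": format_system_alert,
    "sensor_reading": format_sensor_reading,
}


def render_stream(stream: Iterable[Item]) -> List[str]:
    """Format every data item according to the type declared in the first item."""
    iterator = iter(stream)
    header = next(iterator, None)
    if header is None:
        return []  # Empty stream: nothing to render.
    stream_type = header.get("type")
    formatter = FORMATTERS.get(stream_type)
    if formatter is None:
        raise ValueError(f"unsupported stream type: {stream_type!r}")
    return [formatter(item) for item in iterator]
```

The interface chooses its behavior once, when the header arrives, and does not have to inspect each later item to guess its type.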
I think it would be beneficial to add this because:
- It enables schema-aware processing without hardcoding assumptions.
- It improves interoperability between nodes and external systems.
- It allows for dynamic adaptation of workflows based on incoming data characteristics.
- It reduces the need for separate configuration or pre-processing steps.
Any resources to support this?
- Apache Arrow streaming format: uses metadata headers for schema definition.
- JSON Lines with metadata preamble: some implementations support a metadata line at the start.
- Internal n8n discussions around dynamic schema validation and adaptive node behavior.
Are you willing to work on this?
Yes, I’d be happy to contribute to the design and implementation, especially around metadata parsing, schema propagation, and node compatibility.