The idea is:
Currently, the AI Agent emits message objects of types "begin", "item", and "end" when streaming information. I would like to propose an additional type, such as "tool_call", that explicitly indicates when the agent is invoking a tool or performing an action while streaming.
My use case:
During streaming, it is important for users to not only see the incremental content but also to be informed when the agent initiates a tool call (e.g., querying a database, making an API call, or running a calculation). Modeling this with a new message type would make it much easier to present this status in the UI, increasing transparency and improving user experience.
I think it would be beneficial to add this because:
At the moment, "begin", "item", and "end" types are limited to representing content streaming. In the non-streaming variant, information about tool calls is already available. However, during streaming, this detail is missing. By introducing an explicit "tool_call" message type (or similar), the UI can clearly inform users in real time:
- When and which tool is being called
- The progress or status of the agent throughout the streamed interaction
- What actions are taking place at each streaming step
This provides much clearer feedback to end-users, enhances trust and explainability, and supports better debugging during development.
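To make the proposal more concrete, here is a rough sketch of how a UI could consume such a stream. Only the type names "begin", "item", and "end" come from the current behavior described above. The "tool_call" shape, the field names (`content`, `toolName`, `status`), and the `UiState` / `applyMessage` helpers are hypothetical and only illustrate the idea, not an existing API.

```typescript
// Hypothetical message shapes. Only the "begin" / "item" / "end" type names
// are taken from the current behavior; everything else is an assumption.
type StreamMessage =
  | { type: "begin" }
  | { type: "item"; content: string }
  | { type: "tool_call"; toolName: string; status: "started" | "finished" }
  | { type: "end" };

// What a chat UI might keep track of while the stream is running.
interface UiState {
  text: string;          // content streamed so far
  activeTools: string[]; // tools that have started but not finished
  log: string[];         // status lines shown to the user
  done: boolean;
}

// Fold a single streamed message into the UI state.
function applyMessage(state: UiState, msg: StreamMessage): UiState {
  switch (msg.type) {
    case "begin":
      // Start of a new response: reset everything.
      return { text: "", activeTools: [], log: [], done: false };
    case "item":
      return { ...state, text: state.text + msg.content };
    case "tool_call":
      if (msg.status === "started") {
        return {
          ...state,
          activeTools: [...state.activeTools, msg.toolName],
          log: [...state.log, `Calling ${msg.toolName}...`],
        };
      }
      // Removes every active entry with this name (enough for a sketch).
      return {
        ...state,
        activeTools: state.activeTools.filter((name) => name !== msg.toolName),
        log: [...state.log, `${msg.toolName} finished`],
      };
    case "end":
      return { ...state, done: true };
  }
}

// Example stream with one tool call between two content chunks.
const messages: StreamMessage[] = [
  { type: "begin" },
  { type: "item", content: "Looking that up. " },
  { type: "tool_call", toolName: "database_query", status: "started" },
  { type: "tool_call", toolName: "database_query", status: "finished" },
  { type: "item", content: "Found 3 rows." },
  { type: "end" },
];

const initialState: UiState = { text: "", activeTools: [], log: [], done: false };
const finalState = messages.reduce(applyMessage, initialState);
```

With a discriminated type like this, the UI can show a "Calling database_query..." indicator as soon as the tool call starts, without having to guess from gaps in the content stream.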
Any resources to support this?
ChatGPT function calling / tool call events
Inspiration can be taken from the OpenAI API design, which uses similar markers for tool-calling events.
Are you willing to work on this?
Yes