Is there a way to see the reasoning output of the OpenAI models?

Hi,

I’m developing an agent to handle Telegram messages, find an available slot in one Google Calendar, and schedule a meeting in another Google Calendar.

My setup is using two Google Calendar tools: “Find Available Slot” and “Book a Meeting”.

There are many constraints on booking the call, and I put all of them in the user message of the AI agent.

However, when it runs, it says that there are no available slots. I know that’s wrong because I checked the calendar myself, and there are open time slots where bookings can be scheduled.

I suspect that the reasoning model (o4-mini) is not doing the work I expect it to.

I can’t find a way to include the reasoning model’s output in order to debug and refine my prompt. Is there a way for me to do that?

Thanks,

L

Information on your n8n setup

  • n8n version: 1.94.1
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Docker on Google Cloud
  • Operating system: Debian something

Hey @DupervalAI ,

Yeah, do you have the output/workflow of the AI agent? If it ran via the production URL, you can go to Executions and copy the run to the editor; if you ran it in the editor, just open the AI Agent node and check the panel on the right.

If you connect a Chat Trigger to it, you should be able to send a similar prompt and watch how it processes it: on the right-hand side you can see all the calls the AI agent makes. They are also developing this: Request for feedback: Workflow evaluation beta - #32 by Memoire, which may help further.

Best regards,

Samuel


Thanks for the response.

I understand about debugging and looking at the flow output. What’s missing, though, is the reasoning the model does.

In the OpenAI API, there is a parameter called `summary` (under `reasoning`), which lets you see a summary of how the model came to its answer. That’s what I want to see in the output.

But `summary` is not a parameter that exists in the current integration of the OpenAI models. Is there a way to add that parameter so that I can get the reasoning output when the LLM is called?
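
For reference, this is roughly what I mean when calling the API directly. A minimal sketch assuming the official `openai` Python SDK; this is outside n8n, so the node itself would still need to expose the parameter:

```python
# Sketch: requesting reasoning summaries straight from the OpenAI
# Responses API, outside n8n, using the official `openai` Python SDK.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="o4-mini",
    reasoning={"effort": "medium", "summary": "auto"},
    input="Find a free 30-minute slot next week and propose it.",
)

# Reasoning summaries arrive as output items of type "reasoning",
# alongside the normal "message" item with the final answer.
for item in response.output:
    if item.type == "reasoning":
        for part in item.summary:
            print("REASONING:", part.text)
    elif item.type == "message":
        print("ANSWER:", item.content[0].text)
```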

Thanks,

L


There is definitely something wrong with my logic. Is there a best practices document somewhere that I can consult?

My prompt contains 3 distinct elements:

  1. Extract the sender’s name, email address, and notes from the Telegram message.
  2. Next, I expect the agent to look at the calendar and find a time slot that works.
  3. And finally, I expect it to take the email address, the name, and the notes that it extracted originally and put all that information into a Google Calendar meeting.

What I’m seeing is that there are two calls to the OpenAI API. The first one returns no data (the name, email, and notes are not parsed), while the second one shows no availability.

I copied the same prompt and pasted it into the same model in ChatGPT, and it gave me the expected results, although it invented dates for the meeting because it doesn’t have access to my calendar.

So there’s clearly something I don’t understand about the interaction between the agent, the LLM, and Google Calendar.

Maybe I’m asking it to do too much in one operation. Maybe I need to split it into multiple operations or multiple workflows. But to me that seems to defeat the purpose of having the agent make all the decisions, since it has the tools at its disposal.
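
To make the splitting idea concrete, here is what step 1 on its own could look like as a structured-output call. A sketch only, assuming the `openai` Python SDK; the message text and field names are illustrative placeholders:

```python
# Sketch: step 1 (extraction) isolated into its own structured-output
# call, separate from slot-finding and booking.
from openai import OpenAI
from pydantic import BaseModel

class MeetingRequest(BaseModel):
    name: str
    email: str
    notes: str

# Illustrative placeholder for the incoming Telegram message.
telegram_message = "Hi, this is Jane Doe (jane@example.com). Can we talk about the Q3 report?"

client = OpenAI()
parsed = client.responses.parse(
    model="o4-mini",
    input="Extract the sender's name, email address, and notes from "
          "this Telegram message:\n" + telegram_message,
    text_format=MeetingRequest,
)
request = parsed.output_parsed  # a MeetingRequest instance
print(request.name, request.email, request.notes)
```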

L


@DupervalAI

If one AI agent is failing at the task, it’s normally because of the system and user prompts. Maybe try breaking the steps down across three different agents and being more precise in each prompt. That would yield more reliable results. Have you tried this?

There is material in the docs. I would go with multiple agents in the flow; you can still connect tools. I can try to put something quick together for you. Do you have the example you’re using at the moment?

If you’re happy to share, I can make a quick edit and see how it performs.

[attached: example n8n workflow]


Something like this, maybe (you can use an AI Agent or an OpenAI LLM node).
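
Outside n8n, the same three-agent decomposition could look roughly like this in plain code. A sketch only: the Telegram message is a placeholder, and both calendar helpers are hypothetical stand-ins for the real Google Calendar tools:

```python
# Sketch of the "three focused agents" idea: three separate LLM calls,
# each with one narrow prompt, chained in plain Python.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """One focused LLM call; each 'agent' gets a single job."""
    return client.responses.create(model="o4-mini", input=prompt).output_text

def fetch_bookings():
    # Hypothetical stand-in for a "get all events" Calendar call.
    return ["2025-06-02 10:00-11:30", "2025-06-02 14:00-15:00"]

def book_meeting(details: str, slot: str):
    # Hypothetical stand-in for a "create event" Calendar call.
    print("Would book:", details, "at", slot)

telegram_message = "Hi, this is Jane Doe (jane@example.com). Can we talk?"

# Agent 1: extraction only.
details = ask("Extract the sender's name, email, and notes from this "
              "Telegram message, as JSON:\n" + telegram_message)

# Agent 2: slot selection only.
slot = ask("Given these existing bookings, pick a free 30-minute slot "
           "during business hours on the same day:\n"
           + "\n".join(fetch_bookings()))

# Agent 3: booking only.
book_meeting(details, slot)
```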
Best regards,

Samuel

Thanks for the reply.

I ended up giving more details in my prompt, and now I’m getting further. It is booking now, but not booking correctly: it uses the whole available slot I searched for instead of carving out a smaller 30-minute slot for the meeting.

At least now I can see that the agent is calling the tools correctly. I think the change I made to the logic helped. I read the Google Calendar documentation and realized that I had been misunderstanding the availability aspect.

I switched to getting all the bookings instead of availability and asked the LLM to analyze them to find a suitable time slot. It’s not quite there yet, but it’s moving along.
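
For the slot-finding part, the gap computation can also be done deterministically and handed to the agent instead of asking the LLM to reason over the bookings. A minimal sketch of that logic, with made-up bookings and working hours:

```python
# Sketch: derive free 30-minute slots from a list of busy intervals,
# the deterministic counterpart to "analyze the bookings".
from datetime import datetime, timedelta

def free_slots(busy, day_start, day_end, duration=timedelta(minutes=30)):
    """Return (start, end) pairs of length `duration` that do not
    overlap any (start, end) interval in `busy`."""
    slots = []
    cursor = day_start
    for start, end in sorted(busy):
        while cursor + duration <= min(start, day_end):
            slots.append((cursor, cursor + duration))
            cursor += duration
        cursor = max(cursor, end)
    while cursor + duration <= day_end:
        slots.append((cursor, cursor + duration))
        cursor += duration
    return slots

# Example: two existing bookings on a 9:00-17:00 working day.
day = datetime(2025, 6, 2)
busy = [
    (day.replace(hour=10), day.replace(hour=11, minute=30)),
    (day.replace(hour=14), day.replace(hour=15)),
]
for start, end in free_slots(busy, day.replace(hour=9), day.replace(hour=17)):
    print(start.strftime("%H:%M"), "-", end.strftime("%H:%M"))
```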

I appreciate the offer to do it for me, but that’s not my goal. My goal is to learn how to make these things work, and it’s challenging my understanding of what AI agents do.

I’m not sure if I’m misunderstanding AI agents and giving them too much credit, or if I don’t fully grasp how n8n specifically implements them.

But there has been progress, and that’s a good thing!

Thanks,

L


@DupervalAI you’re welcome. And yes, it depends on the model used; I find that sometimes they play dumb too. I don’t know if a lot of people are using the model at that time, so there are fewer resources, or if it just needs a restart, lol. But yes, when it gets more complex they can struggle. Hoping all goes well; it’s a nice learning curve and I’m loving it, to be honest. 🙂

Have a nice day.

Samuel