Guardrails node
The idea is:
Add an `info.reason` description to the LLM output when the Guardrails node flags a violation.
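To make the request concrete, here is a rough sketch of what the flagged output might look like with the proposed field. The `checks`/`flagged`/`score` structure and the example values are assumptions for illustration only, not the node's actual schema; only the `info.reason` name comes from this request.

```python
# Hypothetical current output of a Guardrails check (structure assumed):
current_output = {
    "checks": [
        {"name": "nsfw", "flagged": True, "score": 0.91},
    ],
}

# Proposed: include the LLM's own explanation alongside the flag.
proposed_output = {
    "checks": [
        {
            "name": "nsfw",
            "flagged": True,
            "score": 0.91,
            # New field: why the model flagged this message, in its own words
            "info": {"reason": "Message was interpreted as ... (model explanation)"},
        },
    ],
}

# A workflow could then log the reason for every flagged check:
for check in proposed_output["checks"]:
    if check["flagged"]:
        print(f"{check['name']}: {check['info']['reason']}")
```

With something like this, the reason could be routed to a log node or attached to evaluation results without any extra prompting.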
My use case:
I’m running evaluation messages through a complex workflow, and it can be difficult to determine exactly why some seemingly benign messages get flagged as NSFW or as jailbreak attempts. Being able to log the reasoning behind a flag would be a great help in troubleshooting and refining the guardrail prompts.
I think it would be beneficial to add this because:
Having the LLM return a reason when it flags a message as a guardrails violation would make it much easier to pinpoint the exact cause of the flag.