How do LangChain and n8n handle deterministic JSON output?

Dear all,
I have some questions about how LangChain and n8n handle deterministic JSON output with OpenAI and Mistral AI.

From what I understand, OpenAI used to take a JSON schema as an input parameter, but that is not the case anymore!? We now have to explicitly ask the LLM to output JSON.
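To illustrate what I mean, here is roughly the kind of call I am talking about (a sketch with the openai Python client v1; JSON mode only guarantees syntactically valid JSON, not conformance to a schema):

# Sketch: JSON mode guarantees well-formed JSON output, but the shape
# still has to be described in the prompt. Assumes the openai v1 client
# and a model that supports response_format.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system",
         "content": "Reply in JSON with keys 'tags' (array of strings) "
                    "and 'isHomework' (boolean)."},
        {"role": "user", "content": "Is this message homework?"},
    ],
)
print(response.choices[0].message.content)  # valid JSON, schema not enforced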

But how can I make sure the output is deterministic?
How do LangChain and n8n handle this?
Should I use function calling instead?

Thanks

Hey @LucBerge

If you tick Require Specific Output Format and add a schema in the Structured Output Parser sub-node, for example like below:

{
  "tags": [ "string" ],
  "isHomework": "boolean"
}

This expected schema will then be appended to the prompt; you can check this by executing the node and going to the Logs tab.
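Under the hood this is LangChain's structured output parser; something like this Python sketch (n8n wires it up for you):

# Sketch of what the Structured Output Parser sub-node does:
# the schema becomes format instructions appended to the prompt.
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

schemas = [
    ResponseSchema(name="tags", description="array of string tags"),
    ResponseSchema(name="isHomework", description="boolean flag"),
]
parser = StructuredOutputParser.from_response_schemas(schemas)

# This is the text that gets appended to your prompt -- the same thing
# you can see in the Logs tab after executing the node.
print(parser.get_format_instructions())

# After the model answers, the parser extracts and checks the JSON:
# result = parser.parse(llm_output)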

Hope that helps ;)

Hello,
OK! So this is not deterministic then!?

There is a small probability that the output does not match the requested format?!

Yes, that’s possible, especially with GPT-3.5; with GPT-4 I have not experienced it.
There is also the auto-fixing output parser sub-node. I have not used it yet, but if you need to be sure the execution will not fail, it might be worth a look.
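As far as I understand, that sub-node corresponds to LangChain's OutputFixingParser, which makes one extra LLM call to repair output the base parser rejects. A sketch (assuming the langchain and langchain-openai Python packages):

# Sketch: wrap a parser so that a parse failure triggers a second
# LLM call asking the model to fix its own malformed output.
from langchain.output_parsers import (
    OutputFixingParser, ResponseSchema, StructuredOutputParser,
)
from langchain_openai import ChatOpenAI

base_parser = StructuredOutputParser.from_response_schemas(
    [ResponseSchema(name="tags", description="array of string tags")]
)
fixing_parser = OutputFixingParser.from_llm(
    parser=base_parser,
    llm=ChatOpenAI(model="gpt-3.5-turbo"),
)

# If the raw output is malformed, the fixer asks the LLM to repair it
# and parses again; this reduces failures but cannot remove them entirely.
# result = fixing_parser.parse('{"tags": ["a", "b"')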

You could also use Retry On Fail.

Let's say the model produces invalid JSON 1% of the time:

  • If I use the auto-fixing output parser, it lowers the probability without removing it: 0.01 * 0.01 = 0.0001
  • If I use Retry On Fail with max tries set to 3: 0.01^3 = 0.000001

It is not deterministic.
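To make the numbers concrete, here is the same arithmetic as a quick Python check (assuming each attempt fails independently with probability 1%):

p_fail = 0.01  # assumed probability of invalid JSON per attempt

# Auto-fixing parser: one repair attempt that itself fails 1% of the time
p_autofix_fail = p_fail * p_fail  # 0.0001

# Retry On Fail with max tries = 3: all three attempts must fail
p_retry_fail = p_fail ** 3  # 1e-06

# Over a large batch the residual risk is small but never zero
requests = 10_000
print(requests * p_retry_fail)  # expected failures: 0.01

Small, but not zero.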

What about using function calling to get a deterministic JSON output format?


The workflow I am planning will process up to 10,000 requests, and I cannot afford a single failure… If I increase the max tries setting, it could become expensive given the number of requests.

Maybe someone else will be able to help you better. I can only say that GPT-3.5 sometimes generates incorrect JSON. A good prompt makes a big difference, but an LLM is by design not deterministic, so I don't think you can guarantee that it returns correct JSON in 100% of cases. However, you could use the auto-fixing sub-node, which might get you there, but I don't have experience with it.

You mentioned function calling, so do you know about the AI Agent node, which can gather tools and decide by itself which one to use? I haven't used it much either, but maybe it could help you somehow.

Yes, I have already used it, but the way LangChain is integrated into n8n does not let me access the JSON output between the LLM and the actual tool call in n8n. I want access to the raw output, like:

{
  "aspects_and_sentiments": [
    {"aspect": "food", "sentiment": "positive"},
    {"aspect": "ambiance", "sentiment": "negative"},
    {"aspect": "waiter", "sentiment": "positive"},
    {"aspect": "pizza", "sentiment": "positive"},
    {"aspect": "burger", "sentiment": "positive"},
    {"aspect": "coke", "sentiment": "negative"},
    {"aspect": "drinks", "sentiment": "negative"}
  ]
}
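For reference, with the raw OpenAI client the function-calling request I have in mind looks roughly like this (a sketch; report_sentiments and its schema are just my example, and even function calling is not a hard schema guarantee):

# Sketch: function calling steers the model to emit arguments matching
# a JSON schema; the raw arguments string is the output I want to read.
# Assumes the openai Python client v1; the tool name/schema are made up.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "report_sentiments",
        "description": "Report aspect-level sentiments found in a review.",
        "parameters": {
            "type": "object",
            "properties": {
                "aspects_and_sentiments": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "aspect": {"type": "string"},
                            "sentiment": {"type": "string",
                                          "enum": ["positive", "negative"]},
                        },
                        "required": ["aspect", "sentiment"],
                    },
                },
            },
            "required": ["aspects_and_sentiments"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user",
               "content": "The pizza was great but the coke was flat."}],
    tools=tools,
    # Force the model to call this specific function:
    tool_choice={"type": "function", "function": {"name": "report_sentiments"}},
)

raw = response.choices[0].message.tool_calls[0].function.arguments
print(json.loads(raw))  # the raw JSON between the LLM and the tool call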

See the full article: https://sauravmodak.medium.com/openai-functions-a-guide-to-getting-structured-and-deterministic-output-from-chatgpt-building-3a0ef802a616

