PSA - Extracting output from Gemini models

I tried for more than an hour to prompt and system-prompt Gemini not to wrap its response in markdown or markdown code blocks. No matter what I did, it was completely hit or miss. You cannot rely on Gemini returning clean text, JSON, HTML, or any other format.

So, I’m now putting this node immediately after each Gemini model response.

It will extract the output from the markdown code block if one exists. If not, it will pass the response through as is.

// Raw model response (the upstream AI node returns it under `output`).
const llmMarkdown = $input.first()?.json?.output ?? "";

// Find the content inside the markdown code block.
const match = llmMarkdown.match(/```(?:[a-zA-Z]+)?\n([\s\S]*?)\n```/);
const llmText = match ? match[1].trim() : llmMarkdown;

try {
  // Try parsing as JSON
  return [{ json: {output: JSON.parse(llmText) }}];
} catch {
  // If it's not JSON, return as is.
  return [{ json: { output: llmText } }];
}
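
For example (a made-up sample, not from an actual Gemini run), a wrapped response like

```json
{ "sentiment": "positive", "score": 0.92 }
```

comes out of the node as { "output": { "sentiment": "positive", "score": 0.92 } }, while a response with no code block is passed straight through under output.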

Cheers. :beers:


Are you using Gemini 2.0 Flash?

I’m using 2.0 Flash and 2.0 Pro.

The models have been trained to generate JSON in a code block, so it’s better to just go for that every time.

If your output schema is always the same:

  1. Set the AI node to generate structured output (There is a button for this)
  2. Attach structured output parser
  3. Give examples of the correct output; these need to include the markdown code block elements (see the sketch after this list)
  4. Specifically mention in the prompt that it needs to include markdown code blocks and not to forget the closing three backticks
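
As a rough illustration of step 3 (the field names here are made up, so swap in your own schema), one of the examples you paste in might look like this, with the fences included:

```json
{
  "title": "Example title",
  "summary": "A one-sentence summary of the input.",
  "tags": ["example", "sample"]
}
```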

I have Gemini 2.0 Flash producing very reliable JSON with this setup.

Do you think you could post a sample structure? I’ve been messing with this idea for a couple of hours with no success.

The solution I shared isn’t tied to any specific structure—it’s just meant for cases where the model responds with something inside markdown code blocks.

That could be:
```json
```html
```text
etc.

I was just looking to capture whatever is inside the code block, not enforce a particular format.
And if the response isn’t in a code block, to pass it through as is.
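
As a quick sketch of that behaviour (standalone JavaScript, same regex as the node above, sample strings are made up):

// Same extraction regex as in the code node above.
const extract = (s) => {
  const m = s.match(/```(?:[a-zA-Z]+)?\n([\s\S]*?)\n```/);
  return m ? m[1].trim() : s;
};

console.log(extract("```html\n<p>Hello</p>\n```")); // "<p>Hello</p>"
console.log(extract("```text\nplain text\n```"));   // "plain text"
console.log(extract("No code block here"));         // "No code block here"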

Or, @Tero, maybe I’m misunderstanding—does your approach handle this as well? :thinking: It looks like yours is aimed at a specifically structured response inside code blocks.

Can you please give an example? I sometimes get JSON wrapped in markdown and sometimes without. The responses without markdown get processed easily, but the markdown-wrapped ones create issues.

If you put the code node I posted above directly after your Agent, it will handle text inside or outside of markdown and pass it through for you.

Absolute legend! I was struggling with the same issue using the latest n8n and Gemini 2.5. They REALLY need to fix the structured output node so that it actually uses the structured output functionality offered by Google and OpenAI instead of the now very outdated method of using LangChain in the background to prompt for an output structure…


Thanks @elabbarw. Glad to help.
I don’t see a way to edit the OP any longer, so I’m posting an update here. I found that Gemini will often incorrectly escape single quotes, which led to extraction errors. This version fixes that.

const prevNodeName = $prevNode?.name ?? 'Unknown';
const output = $input.first()?.json?.output ?? '';

if (!output) {
  throw new Error(`No input data found from previous node ${prevNodeName}`);
}

// Extract the content of a markdown code block if present; otherwise use the whole response.
const match = output.match(/```(?:[a-zA-Z]+)?\n([\s\S]*?)\n```/);
const jsonString = match ? match[1].trim() : output.trim();

// Clean the common LLM error of escaping single quotes
const cleanedJsonString = jsonString.replace(/\\'/g, "'");

try {
  const parsedJson = JSON.parse(cleanedJsonString);
  return [{ json: parsedJson }];
} catch (error) {
  throw new Error(
    `Failed to parse output as JSON from previous node ${prevNodeName}: ${error.message}`,
    {
      cause: {
        original_text: jsonString,
      },
    },
  );
}
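
To see what the single-quote cleanup is for, here is a small standalone sketch (the sample string is made up): Gemini sometimes emits \' inside JSON strings, which is not a valid JSON escape, so JSON.parse fails until the backslash is stripped.

// Made-up sample of the bad escape Gemini sometimes produces.
const bad = `{"note": "it\\'s done"}`;              // runtime value: {"note": "it\'s done"}
// JSON.parse(bad) would throw because \' is not a valid escape in a JSON string.
console.log(JSON.parse(bad.replace(/\\'/g, "'")));  // { note: "it's done" }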