Need reliable method to clean and structure complex Apify Skip Trace JSON for OpenAI

Hey Nikos_Bosse

I'm actually not working on that workflow anymore; now I'm working on another one and I'm really struggling. If you have any advice — and if this is a service you offer and I'd need to pay, I really don't have the money to make that investment — but here is the workflow:

I’m building a cold-calling research workflow in n8n and I’m running into a confusing issue with the OpenAI (ChatGPT) node that I can’t seem to resolve.

Workflow (very simple, I think)

  • Manual Trigger

  • Google Sheets (1 row at a time)

  • OpenAI Chat Model node

No extra parsing or transformations.


The problem

I’m using a prompt that works perfectly when I run it directly in the ChatGPT UI (chat.openai.com), but when I use the exact same prompt inside the n8n OpenAI node, the output becomes very generic and full of "N/A" values.


Example

Prompt intent

Research a real business and return structured JSON with:

  • Design project / business name

  • Focus area

  • Social media platform, activity level, and followers


Result in ChatGPT UI (correct, detailed, fact-checked)

{
  "Design Project": "7 Plates Cafe - Chicago, IL",
  "Focus Area": "Hospitality & Commercial Interior Design",
  "Social Media Presence": {
    "Platform": "Instagram",
    "Activity Level": "High",
    "Followers": "Approx. 6.9K+"
  }
}


This output is accurate and matches real-world data.


Result in n8n OpenAI node (same prompt)

{
  "Design Project": "N/A",
  "Focus Area": "Interior Design",
  "Social Media Presence": {
    "Platform": "Twitter",
    "Activity Level": "Low",
    "Followers": "N/A"
  }
}


This happens consistently across different businesses.
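In the meantime, one way to catch these bad rows before they go anywhere is to scan the model's JSON for "N/A" placeholders. A minimal sketch (the field names come from the example output above; `flagPlaceholders` is a hypothetical helper, not part of n8n or OpenAI):

```javascript
// Sketch: recursively collect every field the model left as "N/A",
// so rows with placeholder data can be flagged or retried instead of saved.
function flagPlaceholders(obj, path = "") {
  const flagged = [];
  for (const [key, value] of Object.entries(obj)) {
    const fullPath = path ? `${path}.${key}` : key;
    if (value && typeof value === "object") {
      flagged.push(...flagPlaceholders(value, fullPath));
    } else if (value === "N/A") {
      flagged.push(fullPath);
    }
  }
  return flagged;
}

// Using the n8n output shown above:
const apiResult = {
  "Design Project": "N/A",
  "Focus Area": "Interior Design",
  "Social Media Presence": {
    "Platform": "Twitter",
    "Activity Level": "Low",
    "Followers": "N/A"
  }
};
console.log(flagPlaceholders(apiResult));
// ["Design Project", "Social Media Presence.Followers"]
```

If the flagged list is non-empty, the workflow could skip the Google Sheets write or loop back for a retry.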


What I’ve already checked

  • Tried different models (GPT-4 / GPT-4o)

  • Adjusted temperature

  • Tested JSON vs text output

  • Confirmed prompt content is identical

  • No output parsers or extra nodes involved


Question

Is there a known difference in:

  • Inference behavior

  • Entity resolution

  • Or safety defaults

between the ChatGPT UI and the OpenAI API used by n8n, that would cause the model to return conservative "N/A" placeholders unless explicitly instructed otherwise?

If so, is there a recommended way in n8n to:

  • Enable more confident inference, or

  • Match ChatGPT UI behavior more closely for research-style prompts?

Any guidance or examples would be greatly appreciated 🙏

Seems like an input issue to me. Could that be it? I suspect your prompt just doesn't get all the relevant content it needs.

I actually got it to work, kinda.

The problem is 3 things:

First, I fact-checked the information, and it’s not completely accurate or as detailed as I expected. For example, instead of saying “Focus area: interior design,” I would prefer something more specific, like “residential hospitality interior design.” It also mentioned “interior design funding,” which isn’t correct. I noticed it used 73,000 tokens — is that a lot, and if so, is there a way to reduce that?

Second, I checked my Google Sheets, and although it pulled some information, it didn’t write anything back into the “Prospect Note” column where I wanted it to.
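For the write-back part, the Google Sheets node can only fill a column when the incoming item actually carries a field with a matching name, so it may help to build the note text explicitly in a Code node first. A sketch (field names follow the JSON example earlier in the thread; it assumes the OpenAI node's parsed output arrives on each item's `json`):

```javascript
// Sketch of an n8n Code node helper that turns the model's JSON into a
// single "Prospect Note" string the Google Sheets node can map to its column.
function buildProspectNote(data) {
  const social = data["Social Media Presence"] || {};
  return [
    `Project: ${data["Design Project"] || "unknown"}`,
    `Focus: ${data["Focus Area"] || "unknown"}`,
    `Social: ${social["Platform"] || "unknown"} (${social["Activity Level"] || "unknown"} activity, ${social["Followers"] || "unknown"} followers)`
  ].join(" | ");
}

// Inside the Code node it would look roughly like:
// return items.map(item => ({
//   json: { ...item.json, "Prospect Note": buildProspectNote(item.json) }
// }));

console.log(buildProspectNote({
  "Design Project": "7 Plates Cafe - Chicago, IL",
  "Focus Area": "Hospitality & Commercial Interior Design",
  "Social Media Presence": {
    "Platform": "Instagram",
    "Activity Level": "High",
    "Followers": "Approx. 6.9K+"
  }
}));
```

Then point the Google Sheets node's "Prospect Note" column mapping at that field, and make sure the node's operation is set to append or update rather than read.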

Whether 73k tokens is a lot depends on the amount of data, reasoning and research. Can’t really say anything without knowing that. I would probably worry about quality first and then token count.

Are you self-hosting n8n?

Oh, and how does your AI agent work? Does it have any access to the internet? The ChatGPT UI can search the web, but a plain OpenAI API call cannot, so without a browsing tool the model has nothing real to research. That could explain the difference between the ChatGPT UI and the API output you described in one of the messages above.

Hey Nikos,

Yeah, I agree, I would rather have quality information and use more tokens. As for the question about the AI agent having access to the internet, I really don't know; I'm pretty new to all of this.