AI Assistant Workflow Issues: Tool Ignoring, Long Execution, and Incorrect Responses

I’m currently facing several critical issues with an AI assistant workflow designed to respond to customers on Instagram and Facebook.

System Overview

  • AI-powered chatbot workflow (n8n)

  • Multi-channel: Instagram & Facebook

  • Product data stored in Supabase

  • Two interaction paths: text-based queries and image-based queries

  • AI Agent with strict system and user prompts

  • Dedicated tools for product retrieval (Hybrid RAG / Supabase search)

  • I’m using Gemini 2.5 Flash


Problems Encountered

1. Long Messages Cause Execution Failures

When a customer sends a long message or conversation history, the AI agent:

  • Takes a very long time to execute

  • Eventually fails with an execution error

  • Causes a poor user experience and timeout issues


2. AI Agent Ignores Tools

Despite having a clearly defined tool for retrieving products from Supabase:

  • The AI agent sometimes does not call the tool

  • It falls back to its internal memory

  • This results in hallucinated or incorrect product responses

This behavior completely breaks data reliability.


3. Prompts Are Not Respected

I have implemented:

  • Strict system messages

  • Advanced user prompts

  • Explicit rules (e.g. “ONLY use the tool”, “NEVER invent products”)

  • Even aggressive constraint language

However:

  • The AI agent still ignores instructions in some cases

  • Prompt enforcement is inconsistent

  • Tool-usage rules are not reliably followed


Core Issue

The main challenge is lack of deterministic control over:

  • Tool invocation

  • Execution time

  • Prompt compliance

Even with advanced prompt engineering, the AI agent sometimes behaves unpredictably.


What I’m Looking For

  • Reliable methods to force tool usage

  • Better architectural patterns than large prompts

  • Techniques to handle long conversations safely

  • Ways to prevent fallback to model memory

  • Production-grade AI agent control strategies


If you’ve dealt with similar AI agent reliability issues (n8n, Supabase, RAG, tool calling, Instagram/Facebook bots), I’d really appreciate your insights.

Hey @Mus4ever, I understand you’re looking for ways to improve your workflow’s stability and reliability. I run more complex workflows, and the thing I’ve learned is that n8n itself doesn’t cause these errors; it’s the AI model we choose that causes the reliability issues. Instead of giving you “best practices”, I’d suggest using the top-tier models; I strongly recommend Claude 4.5, which is by far the most reliable one in my experience. With your current setup, I’d also recommend adding guard clauses, an error reporter, and an agent output validator. That’s really it; this helped me a lot more than adding “BEST PRACTICES” statements to prompts. Hope this helps.
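For anyone who wants to try the “agent output validator” idea, here is a minimal sketch in plain JavaScript. The function name and the quoted-name heuristic are my own illustration, not an n8n feature; the idea is to reject replies that mention product names absent from the data actually fetched:

```javascript
// Hypothetical "agent output validator": reject a reply that quotes
// product names not present in the data fetched from Supabase.
// The quoted-name heuristic is illustrative; adapt it to your format.
function validateAgentOutput(reply, fetchedProducts) {
  // Guard clause: an empty or missing reply fails immediately.
  if (!reply || reply.trim().length === 0) {
    return { ok: false, reason: "empty reply" };
  }
  const known = fetchedProducts.map((p) => p.name.toLowerCase());
  // Every "quoted" token in the reply must match a fetched product.
  const mentioned = reply.match(/"([^"]+)"/g) || [];
  for (const raw of mentioned) {
    const name = raw.replace(/"/g, "").toLowerCase();
    if (!known.includes(name)) {
      return { ok: false, reason: `unknown product: ${name}` };
    }
  }
  return { ok: true };
}
```

On failure you can route the item to an error-reporter branch, or retry the generation, instead of sending the reply to the customer.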


Hi @Mus4ever !

The issues you’re seeing are not caused by prompt engineering, but by architecture.

In production workflows, an LLM should not be responsible for deciding when to call tools or retrieve data. Those decisions must be handled by deterministic logic in n8n. The AI model should be used only to generate the final text based on already-resolved data.

A safer and more reliable structure looks like this:

User message
├─► n8n logic (IF / Switch)
│     ├─ product-related → fetch data from Supabase
│     └─ general question → skip database
└─► AI model formats the response using the provided data
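The IF/Switch step above can be sketched as plain JavaScript, e.g. inside an n8n Code node. The keyword list here is a hypothetical placeholder; replace it with your own intent logic:

```javascript
// Deterministic intent routing done in code instead of by the LLM.
// The keyword list is a placeholder; swap in your own detection
// (regex, embeddings, or a dedicated classifier) as needed.
const PRODUCT_KEYWORDS = ["price", "stock", "buy", "product", "order"];

function routeMessage(text) {
  const lower = text.toLowerCase();
  const isProductQuery = PRODUCT_KEYWORDS.some((kw) => lower.includes(kw));
  // The returned label maps to a Switch-node output in n8n.
  return isProductQuery ? "fetch_supabase" : "skip_database";
}
```

Because this decision is ordinary code, it behaves identically on every execution, which is exactly the determinism the AI agent cannot give you.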

How to resolve:

1. Long messages causing slow executions or timeouts
Do not send the full conversation history to the AI.
Instead, pass only the most recent message or a short summary. This keeps execution time predictable and avoids timeouts.
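A minimal sketch of that trimming step; the limits (6 turns, 4000 characters) are arbitrary examples, not tuned values:

```javascript
// Keep only the last few turns plus a rough character budget so the
// payload sent to the model stays bounded. The default limits are
// illustrative; tune them for your traffic.
function trimHistory(messages, maxTurns = 6, maxChars = 4000) {
  let recent = messages.slice(-maxTurns);
  // Drop the oldest remaining turn until the total fits the budget,
  // but always keep at least the latest message.
  while (
    recent.length > 1 &&
    recent.reduce((sum, m) => sum + m.text.length, 0) > maxChars
  ) {
    recent = recent.slice(1);
  }
  return recent;
}
```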

2. AI Agent ignoring tools
Do not rely on the AI agent to decide whether to call Supabase or other tools.
Fetch data using standard n8n nodes (Supabase, HTTP, SQL) before calling the AI, then pass the results to the model as input. The AI should only format the response, not retrieve data.
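As a sketch, the fetch step might look like this. Here `db` is any client exposing a Supabase-style query builder (such as `@supabase/supabase-js`), and the table and column names are assumptions about your schema:

```javascript
// Fetch step performed by ordinary code BEFORE the AI node, so the
// model never decides whether to query. `db` is any client with a
// Supabase-style query builder; "products" and its columns are
// assumptions about your schema.
async function fetchProducts(db, query) {
  const { data, error } = await db
    .from("products")
    .select("name, price, stock")
    .ilike("name", `%${query}%`)
    .limit(5);
  if (error) throw error;
  return data; // handed to the AI node as plain input, never as a "tool"
}
```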

3. Prompts not being respected or hallucinated responses
Remove decision-making from the AI entirely.
Use n8n nodes (IF / Switch) to route intent and apply rules, then instruct the model clearly:

“Answer using ONLY the data below. If no data is provided, respond with ‘No products found.’”

Because the data is already resolved, the model cannot fall back to its own memory or invent information.
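Putting it together, here is a sketch of that final step: when no data was fetched, the canned reply is returned without calling the model at all. `callModel` is a hypothetical stand-in for the n8n AI node:

```javascript
// Final step: with data already resolved upstream, the model call can
// be skipped entirely when there is nothing to answer from.
// `callModel` is a hypothetical stand-in for the n8n AI node.
async function answer(userMessage, products, callModel) {
  if (products.length === 0) {
    // Deterministic path: no model call, so no hallucination is possible.
    return "No products found.";
  }
  const data = products.map((p) => `- ${p.name}: $${p.price}`).join("\n");
  return callModel(
    `Answer using ONLY the data below. If a product is not listed, say so.\n\n` +
    `${data}\n\nCustomer message: ${userMessage}`
  );
}
```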
