Locally hosted LLM is not able to call tools

Hi,

My company and I are very happy to be testing n8n so we can make an educated decision on whether it is the tool we want to move forward with into the era of agentic AI. As part of that, we are trying to run our local Llama 3.3 Super 49B with the n8n AI Agent and ran into a problem:

Describe the problem/error/question

I have an on-prem hosted NVIDIA NIM container running Llama 3.3 Super:

I set everything up, connected it all to an AI Agent, and tried to chat. Worked like a charm.
The next step is to let the AI Agent use tools, so I changed the agent type to "Tools Agent" and added a simple Webex tool that sends a message to a room.

THE PROBLEM: The LLM correctly identifies the tool and tries to use it, but the tool never actually gets called. Instead, the tool call is output as chat text:

This is also visible in the executed nodes, which show that the Webex tool is never run:

Please share your workflow

Share the output returned by the last node

The AI Agent then outputs:
[{
"output": "[{"name": "Create_a_message_in_Webex_by_Cisco", "arguments": {"Text": "Hi Test"}}]"
}]

Expected output

We would expect the local LLM to be able to use the tools attached to the n8n Tools Agent and to get back to the user with the output once the tool calls are finished.

Just FYI: with GPT models (via the API), for example, this exact workflow works.
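
My understanding of the difference: with the GPT models the tool call comes back in the structured tool_calls field of the chat completion (which is what the agent then executes), while our local model only writes the call into the plain text content. Roughly like this (a sketch for illustration, not copied from a real response):

```python
# What an OpenAI-compatible backend returns when tool calling works:
# the call sits in a structured tool_calls field, content stays empty.
working_assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_0",
        "type": "function",
        "function": {
            "name": "Create_a_message_in_Webex_by_Cisco",
            "arguments": '{"Text": "Hi Test"}',
        },
    }],
}

# What our local model returns instead: the call only as plain chat text,
# so there is nothing structured for the agent to execute.
local_assistant_message = {
    "role": "assistant",
    "content": '[{"name": "Create_a_message_in_Webex_by_Cisco", "arguments": {"Text": "Hi Test"}}]',
}
```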

Looking forward to reading your ideas to get this going,
Flo

Information on your n8n setup

  • n8n version: 1.100.0
  • Database (default: SQLite): default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): docker-compose self-hosted
  • Operating system:

Hey @fwasmeier,

Part of the problem could be that a lot of local LLMs are not "smart" enough to know when they need to call a tool, or don't support tool calling at all.

Have you checked whether the model you are using actually supports tool use?
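
One quick way to check, outside of n8n, is to send a request with a tools definition straight to the NIM container's OpenAI-compatible endpoint and see whether the reply contains a structured tool_calls entry or only plain text. A minimal sketch with the openai Python client (the base URL is a placeholder, and the model name is an assumption taken from your setup):

```python
# Minimal check for structured tool calling on an OpenAI-compatible endpoint.
# Assumptions: the NIM container exposes /v1/chat/completions at this base URL
# (placeholder) and the model name matches the one configured in n8n.
from openai import OpenAI

client = OpenAI(base_url="http://your-nim-host:8000/v1", api_key="not-needed")

tools = [{
    "type": "function",
    "function": {
        "name": "send_webex_message",  # hypothetical tool, for the test only
        "description": "Send a message to a Webex room",
        "parameters": {
            "type": "object",
            "properties": {"Text": {"type": "string"}},
            "required": ["Text"],
        },
    },
}]

response = client.chat.completions.create(
    model="nvidia/llama-3.3-nemotron-super-49b-v1",
    messages=[{"role": "user", "content": "Send 'Hi Test' to the room."}],
    tools=tools,
)

message = response.choices[0].message
# If tool calling works end to end, message.tool_calls is populated;
# if the model only supports it "in text", the call shows up in content instead.
print("tool_calls:", message.tool_calls)
print("content:", message.content)
```

If tool_calls stays empty here as well, the issue is on the model/serving side rather than in the n8n agent.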


See the link below for Llama models that support tool calls.

https://www.perplexity.ai/search/which-llama-models-support-too-AdeJi20YR1GXDzwxIMOwwg

Llama Models That Support Tool Calls

Several Llama models support tool calling (sometimes referred to as function calling), allowing them to interact with external APIs, functions, or services. Here’s a breakdown of which Llama models offer this capability and the types of tool calling they support:

Official Meta Llama Models

  • Llama 3.1
    • Supports JSON-based tool calling natively.
    • Widely implemented in platforms like Ollama and Groq, enabling agentic automation and integration with external tools or APIs [1][2][3][4].
    • Available in various parameter sizes (e.g., 8B, 70B, 405B).
  • Llama 3.2
    • Extends upon 3.1 with continued support for JSON-based tool calling.
    • Introduces "pythonic" tool calling, a more flexible and Python-friendly format [1].
  • Llama 4
    • Supports both JSON-based and the new pythonic tool calling format.
    • Recommended to use the pythonic tool parser for best results.
    • Supports parallel tool calls, a feature not available in Llama 3.x [1].

Community and Fine-Tuned Models

  • Fine-tuned Llama 3 Models
    • Community projects have fine-tuned Llama 3 (e.g., Llama3-8b-instruct) for enhanced function/tool calling, including LoRA adapters and quantized versions for efficient local deployment [5][6].
    • These fine-tuned models are trained on datasets specifically designed for function calling tasks and are available in different formats (16-bit, 4-bit, GGUF for llama.cpp, etc.).
  • TinyLlama
    • A smaller, fine-tuned variant with tool/function calling support, suitable for resource-constrained environments [6].

Comparison Table

| Model | Tool Calling Support | Format(s) Supported | Notable Features |
| --- | --- | --- | --- |
| Llama 3.1 | Yes | JSON-based | Native support, broad adoption |
| Llama 3.2 | Yes | JSON, Pythonic | Adds pythonic tool calling |
| Llama 4 | Yes | JSON, Pythonic | Parallel tool calls supported |
| Llama3-8b-instruct* | Yes (fine-tuned) | JSON-based | Community fine-tune, local use |
| TinyLlama* | Yes (fine-tuned) | JSON-based | Small, efficient, fine-tuned |

*Community fine-tuned models, not official Meta releases.

Key Points

  • Llama 3.1, 3.2, and 4 all support tool calling, with increasing capabilities and flexibility in newer versions [1][2][3].
  • JSON-based tool calling is the standard across all, while pythonic tool calling is introduced in 3.2 and recommended for Llama 4 [1].
  • Parallel tool calls are only supported in Llama 4 [1].
  • Fine-tuned models such as those from the “unclecode” repository extend tool calling to smaller or more specialized Llama variants [5][6].

In summary, if you need tool calling support, choose Llama 3.1 or newer. For advanced features like pythonic tool calling and parallel execution, Llama 4 is recommended. Fine-tuned community models are also available for specific use cases or lightweight deployments.
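
To make the difference between the two formats concrete, the raw text a model emits looks roughly like this (a sketch with a made-up function; the exact delimiters depend on the serving stack):

```python
# JSON-based tool calling (Llama 3.1+): the model emits a JSON object that the
# serving layer (vLLM, Ollama, NIM, ...) parses into a structured tool call.
json_style_output = '{"name": "get_weather", "parameters": {"city": "Berlin"}}'

# "Pythonic" tool calling (introduced with Llama 3.2, recommended for Llama 4):
# the model emits a Python-like call expression instead.
pythonic_style_output = '[get_weather(city="Berlin")]'
```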

  1. Tool Calling - vLLM
  2. Tool support · Ollama Blog
  3. Tool Calling in Llama 3: A Step-by-step Guide To Build Agents - Composio
  4. https://www.reddit.com/r/LocalLLaMA/comments/1eaztwv/quick_review_of_llama_31_tool_calling/
  5. unclecode/llama3-function-call-lora-adapter-240424 · Hugging Face
  6. unclecode/tinyllama-function-call-Q4_K_M_GGFU-250424 · Hugging Face
  7. unclecode/llama3-function-call-Q4_K_M_GGFU-240424 · Hugging Face
  8. https://llama.developer.meta.com/docs/guides/tool-guide/
  9. Tool calls in LLaMa 3.1 - Docs - Braintrust
  10. Llama 4 | Model Cards and Prompt formats
  11. Llama 3.3 | Model Cards and Prompt formats
  12. Tools - LlamaIndex
  13. okamototk/llama-swallow
  14. Function Calling — NVIDIA NIM for Large Language Models (LLMs)

Hi @Jon,

thank you for reaching out quickly.

If you take a look at the model card from NVIDIA, it specifically states that the model is trained for tool calling.

I also tested whether it was smart enough by changing the agent type to "OpenAI Functions Agent". With that agent type I get a 400 (no body) error response from n8n.

This is the console log when using the model in the "OpenAI Functions Agent":

2025-06-27T11:08:45.467Z | error | 400 status code (no body) {"file":"error-reporter.js","function":"defaultReport"}
2025-06-27T11:08:45.467Z | debug | Running node "AI Agent" finished with error {"node":"AI Agent","workflowId":"jOqu92akylxZQm06","file":"logger-proxy.js","function":"exports.debug"}
2025-06-27T11:08:45.467Z | debug | Executing hook on node "AI Agent" (hookFunctionsPush) {"executionId":"6809","pushRef":"wvad9rmsml","workflowId":"jOqu92akylxZQm06","file":"execution-lifecycle-hooks.js"}
2025-06-27T11:08:45.468Z | debug | Pushed to frontend: nodeExecuteAfter {"dataType":"nodeExecuteAfter","pushRefs":"wvad9rmsml","file":"abstract.push.js","function":"sendTo"}
2025-06-27T11:08:45.468Z | debug | Workflow execution finished with error {"error":{"level":"warning","tags":{},"context":{},"functionality":"configuration-node","name":"NodeApiError","timestamp":1751022525464,"node":{"parameters":{"notice":"","model":{"__rl":true,"value":"nvidia/llama-3.3-nemotron-super-49b-v1","mode":"list","cachedResultName":"nvidia/llama-3.3-nemotron-super-49b-v1"},"options":{}},"type":"@n8n/n8n-nodes-langchain.lmChatOpenAi","typeVersion":1.2,"position":[-840,-320],"id":"cec34fcd-ddfd-4bcb-b4bd-b97031e8ee17","name":"Local","notesInFlow":true,"credentials":{"openAiApi":{"id":"dX1EaCNOPnvtPwDG","name":"Local Reasoning Model"}}},"messages":["400 status code (no body)"],"httpCode":"400","description":"400 status code (no body)","message":"Bad request - please check your parameters","stack":"NodeApiError: Bad request - please check your parameters\n    at Object.onFailedAttempt (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/@n8n+n8n-nodes-langchain@file+packages+@n8n+nodes-langchain_9ca6f82764a6c40719e9f8a538948cbd/node_modules/@n8n/n8n-nodes-langchain/nodes/llms/n8nLlmFailedAttemptHandler.ts:26:21)\n    at RetryOperation._fn (/usr/local/lib/node_modules/n8n/node_modules/.pnpm/[email protected]/node_modules/p-retry/index.js:67:20)\n    at processTicksAndRejections (node:internal/process/task_queues:105:5)"},"workflowId":"jOqu92akylxZQm06","file":"logger-proxy.js","function":"exports.debug"}

!!! IMPORTANT
The part that is making me curious: when I use the Plan & Execute Agent with the model and attach a Wikipedia tool, it is able to use the tool and come back to the user.

The tests above tell me that the model is in principle capable of calling tools, but there seems to be a problem somewhere else that I am not able to see.

Questions that came up from this:

  • Is there a difference in how tools are called between the Plan & Execute Agent, the OpenAI Functions Agent, and the Tools Agent?
  • Which agent would be the correct one for this case (the Llama 3.3 NIM uses the OpenAI API standard)?

If I can help with more information or debugging logs, please let me know; I am happy to assist.
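
If it is useful, I could also try to reproduce the failing request outside of n8n, roughly like this (a sketch; the base URL is a placeholder for our internal NIM endpoint, and the assumption that the OpenAI Functions Agent sends a legacy functions-style payload while the Tools Agent uses the newer tools field is mine, not confirmed):

```python
# Sketch to narrow down which request style the NIM endpoint rejects with a 400.
# Placeholder base URL; model name taken from the error log above.
import requests

BASE_URL = "http://your-nim-host:8000/v1"
MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1"

function_def = {
    "name": "wikipedia_lookup",  # hypothetical tool, for the test only
    "description": "Look up a topic on Wikipedia",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

payloads = {
    # current OpenAI "tools" style
    "tools": {
        "model": MODEL,
        "messages": [{"role": "user", "content": "Look up n8n on Wikipedia"}],
        "tools": [{"type": "function", "function": function_def}],
    },
    # legacy "functions" style
    "functions": {
        "model": MODEL,
        "messages": [{"role": "user", "content": "Look up n8n on Wikipedia"}],
        "functions": [function_def],
    },
}

for name, payload in payloads.items():
    r = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=60)
    print(name, r.status_code, r.text[:200])
```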

Thank you @Wouter_Nigrini for the information. According to the model card, the model is suitable for tool calling, and testing with the Plan & Execute Agent type already resulted in a successful use of the Wikipedia tool. The problems only appear with the Tools Agent or the OpenAI Functions Agent.
