How should I use the Hugging Face Inference API with an AI agent?

Describe the problem/error/question

I want to use the Hugging Face Inference API as the Chat Model in AI agents. I was trying to use the OpenAI Chat Model with credentials and the Base URL from Hugging Face (https://api-inference.huggingface.co/v1) with meta-llama/Llama-3.3-70B-Instruct, but it is not working.
I tried using the OpenAI Chat Model with other providers (Hyperbolic, DeepSeek), and they work fine.

What is the error message (if any)?

It does not return any response, and after a while the whole request times out.

Please share your workflow

Share the output returned by the last node

Information on your n8n setup

instance information

Debug info

core

  • n8nVersion: 1.72.1
  • platform: docker (self-hosted)
  • nodeJsVersion: 20.18.0
  • database: sqlite
  • executionMode: regular
  • concurrency: -1
  • license: enterprise (production)
  • consumerId: 2bb05d2c-a30a-4926-8c16-27278742ca81

storage

  • success: all
  • error: all
  • progress: false
  • manual: true
  • binaryMode: memory

pruning

  • enabled: true
  • maxAge: 336 hours
  • maxCount: 10000 executions

client

  • userAgent: mozilla/5.0 (windows nt 10.0; win64; x64) applewebkit/537.36 (khtml, like gecko) chrome/131.0.0.0 safari/537.36 edg/131.0.0.0
  • isTouchDevice: false

Generated at: 2024-12-28T18:12:44.283Z

It looks like your topic is missing some important information. Could you provide the following if applicable.

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:
  • n8n version: 1.72.1
  • Database (default: SQLite): Default
  • n8n EXECUTIONS_PROCESS setting (default: own, main): Default
  • Running n8n via (Docker, npm, n8n cloud, desktop app): Cloud
  • Operating system: Windows 11

Hi @Vikas_Kumar, I tried a bit, and it seems that with the Hugging Face Inference API you need to use a different URL, one that includes the model. I found this out by clicking the “View code” button, and then “cURL”.

So, for the Hermes-3-Llama-3.1-8B model by NousResearch, I had to use this URL: https://api-inference.huggingface.co/models/NousResearch/Hermes-3-Llama-3.1-8B/v1/
In the new versions of the node, you have to specify this Base URL in the credential, instead of at the node-level.

Using the correct model as well, it did produce some output!

Hope this helps!
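For reference, the per-model base URL follows a simple pattern, so you can build it from any model ID. A minimal sketch (the helper function name is mine, not part of n8n or Hugging Face):

```python
# Sketch: build the OpenAI-compatible base URL for a model on the
# Hugging Face Inference API. The model ID goes into the path,
# followed by /v1/.
HF_INFERENCE_HOST = "https://api-inference.huggingface.co"

def hf_openai_base_url(model_id: str) -> str:
    """Return the OpenAI-compatible base URL for the given HF model ID."""
    return f"{HF_INFERENCE_HOST}/models/{model_id}/v1/"

print(hf_openai_base_url("NousResearch/Hermes-3-Llama-3.1-8B"))
# -> https://api-inference.huggingface.co/models/NousResearch/Hermes-3-Llama-3.1-8B/v1/
```

Paste the resulting URL into the Base URL field of the credential (or the node, on older versions).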

So the new version of the node is not available right now, right?
Then I will test the model using the Base URL at the node level.

Please correct me if I am wrong

That change went out in 1.74.0. If you’re not on that version yet, you can use the Base URL at the node-level.

It worked, but it is super slow and sometimes just doesn't give a response (I am on the Pro plan on Hugging Face, and it works well in the Hugging Face playground).
Did you face any issue like that ?

For me it was also slow, but then I'm on the free plan. Not sure what causes the slowness, though.

How did you do it? Did you use Hyperbolic?

You have to use the OpenAI Chat Model in the AI Agent, but change the Base URL to the one given by the provider (Hyperbolic or DeepSeek).
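Any OpenAI-compatible provider works the same way: same request shape, different base URL. A minimal sketch that builds (but does not send) a chat-completions request; the provider base URLs below are my assumptions from their docs at the time of writing, so verify them before use:

```python
import json
import urllib.request

# Assumed base URLs for OpenAI-compatible providers -- check each
# provider's documentation, these may change.
PROVIDER_BASE_URLS = {
    "hyperbolic": "https://api.hyperbolic.xyz/v1",
    "deepseek": "https://api.deepseek.com/v1",
}

def build_chat_request(provider: str, api_key: str, model: str, prompt: str):
    """Build (without sending) an OpenAI-style chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{PROVIDER_BASE_URLS[provider]}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Swapping providers is just swapping the first argument:
req = build_chat_request("deepseek", "sk-...", "deepseek-chat", "Hello!")
print(req.full_url)
# -> https://api.deepseek.com/v1/chat/completions
```

In n8n, the equivalent of "swapping the first argument" is changing the Base URL in the OpenAI credential.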

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.