Cost of using ChatGPT (tokens)

Describe the problem/error/question

Hi! I want to know how I can reduce the tokens spent by ChatGPT, because my workflow takes around 100-150k tokens per run, which is crazy.

What is the error message (if any)?

Please share your workflow

My workflow validates a text (a satellite-operator formal letter) by looking in two vector stores that contain rules and a case card describing how to find an issue. It also calls another workflow backed by a separate database that contains the full information.

Share the output returned by the last node

Information on your n8n setup

  • n8n version:
  • Database (default: SQLite):
  • n8n EXECUTIONS_PROCESS setting (default: own, main):
  • Running n8n via (Docker, npm, n8n cloud, desktop app):
  • Operating system:

100-150k tokens per workflow execution is definitely high!


Questions to help @Garfield_Gg

  1. What does your workflow do? (e.g., analyze documents, chat, data processing)

  2. How many AI nodes do you have in the workflow?

  3. Are you using chat memory? If yes, what’s the window size?

  4. What model are you using? (GPT-4, GPT-3.5, etc.)

  5. Are you processing items in a loop?

Let me know!


Hi @Garfield_Gg, welcome to the n8n community! In almost every case of extreme token usage I’ve helped troubleshoot, the issue wasn’t the model itself, but the amount of context being sent (vector stores + full databases + open-ended prompts). By reducing Top K, filtering data before calling the LLM, and narrowing the prompt scope, it’s common to significantly reduce token consumption per execution. If this doesn’t fully address your question, feel free to share more details about your workflow so we can take a closer look.

Thanks for the reply!

  1. What my workflow does: it validates and verifies a letter (the letter includes information for the operator, a Conjunction Data Message, plus the operator's own plans). It analyzes the text to make sure the information is correct according to the provided CDM_ID, checks for ambiguous phrases or anything similar (so that Operator B will understand the letter without problems), and checks the rounding of numbers. It also checks whether Operator A gave enough information for Operator B to understand. Something like that.

  2. One AI node.

  3. Yes, I am using chat memory (the window size is 15).

  4. I am using GPT-5-mini.

  5. I don't know, to be fair :melting_face: but I think yes (maybe this is the main problem).

I gave the information to John, you can look there.
Yeah, I want to reduce my token usage by lowering Top K, but I have no idea how to do it. Even if I put the rules in my vector store, it will still look through every rule before it understands. Alternatively, I could put some instructions in the AI Agent's prompt.

One thing I want to add: ChatGPT is using its memory and running itself 3+ times on a single input. Maybe that is the problem?

@Garfield_Gg
Thanks for the extra details, that helps a lot.
The high token usage is very likely caused by three things:

  1. A memory window of 15 is usually unnecessary for a validation task and resends a lot of previous context on every run. I would reduce it to 3, or temporarily disable it to test.

  2. The AI node may be running inside a loop. Even with a single AI node, if it executes once per item, token usage can quickly reach 100k. It is worth checking whether the node runs more than once per letter.

  3. Be careful with the context you send from the vector stores and the secondary workflow. Avoid passing full rules or full database content. Limiting Top K and sending only the relevant excerpts usually reduces token usage significantly without affecting accuracy.

Thanks for the advice! As I said, my AI runs 3+ times; I tried to reduce it to 1 or 2, but I have no idea how to do this. I'm trying to avoid it, but with no success. Could you please give more instructions on how I can reduce it? Thanks for your patience!

@Garfield_Gg I usually fix this by controlling the input, not the model. I add a Set or Code node right before it and merge all text and rules into a single field, so it runs only once. Memory is a separate issue: it only increases the context size, so for this use case I reduce the window to 3 or disable it to test. I also keep Top K low, usually 3 to 5, and rely on proper chunking instead of moving all rules into the prompt. With these changes, it normally runs once and token usage drops significantly.


Hi @Garfield_Gg Welcome!
As far as I have understood, vector stores do not consume that many tokens. The real problem here might be the database and how the AI Agent reads information from it: is it getting all rows, fetching everything, or just a fraction of what exists inside? Fixing your database handler will reduce your token cost, as that is the most token-expensive part of the entire workflow. Hope this helps.

Thanks! Now my AI spends 8-10k tokens, sometimes maybe 20k.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.