Anthropic Chat Model node — How to enable prompt caching (cache_control)?
The idea is:
Expose Anthropic prompt caching (cache_control) as an option on the Anthropic Chat Model node.
My use case:
I’m running n8n Cloud (v2.19.3) in production with @n8n/n8n-nodes-langchain.lmChatAnthropic
(typeVersion 1.5) using Claude Sonnet 4.6. My setup is an AI sales assistant for a car
dealership in Brazil that handles WhatsApp customer conversations 24/7.
The system prompt is ~5,600 tokens and is sent to Anthropic on every customer message.
With prompt caching enabled (cache_control: { type: "ephemeral" }), cached reads of that
prompt are billed at a fraction of the normal input price, so I estimate I would save
~50-70% on input token costs per call.
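To make the savings estimate concrete, here is a rough sketch of the arithmetic. It assumes the multipliers from Anthropic's prompt caching docs for the default 5-minute cache (cache writes at 1.25x the base input price, cache reads at 0.1x). The function name, the 1,000 tokens of non-cached input, and the 80% cache hit rate are illustrative assumptions on my part, not measured numbers.

```typescript
// Price multipliers relative to the base input-token price (5-minute cache).
const CACHE_WRITE_MULTIPLIER = 1.25; // first call, or after the cache expires
const CACHE_READ_MULTIPLIER = 0.1; // later calls that hit the cache

/**
 * Estimated fraction of input-token cost saved per call when only the
 * system prompt is cached. A negative result means caching costs more.
 * (Hypothetical helper for illustration.)
 */
function estimateInputSavings(
  systemTokens: number,
  otherInputTokens: number,
  cacheHitRate: number,
): number {
  // Without caching, every input token is billed at the base price.
  const baseline = systemTokens + otherInputTokens;
  // With caching, the system prompt is billed at the read or write rate.
  const hitCost = CACHE_READ_MULTIPLIER * systemTokens + otherInputTokens;
  const missCost = CACHE_WRITE_MULTIPLIER * systemTokens + otherInputTokens;
  const expected = cacheHitRate * hitCost + (1 - cacheHitRate) * missCost;
  return 1 - expected / baseline;
}

// 5,600-token system prompt (from my setup), plus an assumed 1,000 tokens
// of conversation history and user message, with an assumed 80% hit rate.
const savings = estimateInputSavings(5600, 1000, 0.8);
console.log(`${(savings * 100).toFixed(1)}% estimated input cost saved`);
```

With these assumed inputs the estimate lands at roughly 57%, inside the 50-70% range. The result is sensitive to the hit rate: if the cache never hits, the 1.25x write price makes every call slightly more expensive than not caching at all.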
Currently, the node does NOT expose cache_control — only temperature, topK, topP,
maxTokensToSample, and thinkingMode are available. The underlying @langchain/anthropic
library DOES support cache_control natively, but the n8n wrapper doesn’t expose it.
I confirmed via email with n8n support that there’s no documented or undocumented
workaround for the current node.
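For reference, this is roughly what the request needs to look like at the Anthropic Messages API level. It is only a sketch of the raw request body, not n8n's or LangChain's actual code: the `system` field is sent as an array of text blocks and the block to cache carries `cache_control`. The helper name, the type names, and the model ID placeholder are hypothetical.

```typescript
// Minimal types for the parts of a Messages API request used here.
interface CacheControl {
  type: "ephemeral";
}

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
}

// Hypothetical helper: builds a request body whose system prompt is
// marked as a cache breakpoint. It does not send anything over the network.
function buildCachedRequest(
  model: string,
  systemPrompt: string,
  userMessage: string,
  maxTokens: number = 1024,
): MessagesRequest {
  return {
    model,
    max_tokens: maxTokens,
    system: [
      {
        type: "text",
        text: systemPrompt,
        // Everything up to and including this block becomes cacheable.
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: userMessage }],
  };
}

// "MODEL_ID_PLACEHOLDER" stands in for whichever model ID you use.
const body = buildCachedRequest(
  "MODEL_ID_PLACEHOLDER",
  "You are a sales assistant for a car dealership...",
  "Do you have any SUVs in stock?",
);
console.log(JSON.stringify(body, null, 2));
```

A node option would presumably only need to attach that `cache_control` field to the system message before the request is sent, while leaving memory and tool orchestration untouched.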
I think it would be beneficial to add this because:
- Direct cost reduction (~50-70% on Anthropic input tokens) for any user with
system prompts >1024 tokens, which is common for AI agent use cases.
- The underlying library (@langchain/anthropic) already supports cache_control,
so implementation on n8n’s side is small (a few lines in message construction).
- Anthropic prompt caching has been available since 2024 and is now a standard
feature most production users expect.
- n8n Cloud users running AI agents with rich system prompts (rules, examples,
tool descriptions) have no path to reduce these costs without rebuilding agent
logic via raw HTTP Request Node, which means losing memory/tool orchestration.
- Other LLM nodes in the n8n ecosystem could benefit from similar caching
support over time (e.g., OpenAI prompt caching).
Any resources to support this?
- Anthropic prompt caching docs: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
- LangChain JS ChatAnthropic docs: LangChain Reference Docs
- n8n source for the node: packages/@n8n/nodes-langchain/nodes/llms/LMChatAnthropic/LmChatAnthropic.node.ts in the n8n-io/n8n GitHub repository (master branch)
- Pricing comparison (full vs cached input): Anthropic's "Plans & Pricing" page
Are you willing to work on this?
Not in a position to contribute code (not a TypeScript developer), but happy to:
- Test the feature in production once available
- Provide before/after cost data for the n8n team
- Write user documentation / blog post about the cost savings