MAIN WORKFLOW_SEVERIN PLUS.json (71.9 KB)
My restaurant bot (Severin Plus) is experiencing high latency and “double-text” errors. The current architecture is a linear synchronous flow:
- WhatsApp Trigger → Duplicate Filter → HTTP Store Status Check → HTTP Typing Indicator → AI Agent (Gemini 1.5 Flash + 6 Tool sub-workflows) → Send Message.
Main Issues:
- Webhook Retry Bug: Users often have to send a message twice. I believe the linear flow (multiple HTTP requests and tool calls) exceeds Meta’s 5-second webhook timeout, causing it to retry the message because a 200 OK wasn’t sent fast enough.
- High Latency: Every sub-workflow tool adds execution overhead. The sequential nature makes the bot too slow for live service.
- Concurrency/Hallucinations: The AI mixes up context when several users message at the same time. I suspect Simple Memory (5-message window) is failing to keep concurrent sessions separate.
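To make the memory concern concrete, here is a minimal sketch of the isolation I expect, assuming each conversation is keyed by the sender's WhatsApp number. `SessionMemory` and its methods are hypothetical names for illustration, not n8n's Simple Memory API, and I have not confirmed that a shared key is the actual cause.

```python
from collections import defaultdict, deque

WINDOW = 5  # same size as the 5-message window in my current setup


class SessionMemory:
    """Hypothetical per-user chat memory: one bounded history per session ID."""

    def __init__(self, window: int = WINDOW) -> None:
        # Each session gets its own deque; old messages drop off automatically.
        self._store = defaultdict(lambda: deque(maxlen=window))

    def append(self, session_id: str, role: str, text: str) -> None:
        self._store[session_id].append((role, text))

    def history(self, session_id: str) -> list:
        return list(self._store[session_id])


memory = SessionMemory()
memory.append("15550001", "user", "Table for two at 7pm?")
memory.append("15550002", "user", "Is the kitchen still open?")
```

If two users ever share one key (for example a fixed session ID), their histories would interleave, which would look exactly like the confusion I am seeing.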
Request for Guidance: I want to move to a production-ready Worker/Queue pattern. Specifically, I need advice on:
- Decoupling Response: How to immediately respond with a 200 OK and handle the AI logic asynchronously in the background.
- Parallelization: How to run the status check and typing indicator simultaneously.
- Persistent Memory: Best practices for moving from Simple Memory to Postgres/Redis for high concurrency.
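The pattern I have in mind can be sketched roughly as follows, in plain Python with asyncio rather than n8n nodes. Every function here (`check_store_status`, `send_typing_indicator`, `run_agent`, `send_message`) is a hypothetical stub standing in for one of my nodes, and the in-memory queue and ID set are assumptions that a real deployment would presumably replace with Redis or Postgres.

```python
import asyncio
import contextlib

seen_ids: set[str] = set()          # message IDs already accepted (dedupe)
sent: list[tuple[str, str]] = []    # stand-in for the outgoing WhatsApp API


async def check_store_status() -> bool:
    await asyncio.sleep(0.05)       # simulated HTTP call
    return True


async def send_typing_indicator(user: str) -> None:
    await asyncio.sleep(0.05)       # simulated HTTP call


async def run_agent(user: str, text: str, store_open: bool) -> str:
    await asyncio.sleep(0.1)        # simulated LLM + tool calls
    return f"echo: {text}" if store_open else "Sorry, we are closed."


async def send_message(user: str, reply: str) -> None:
    sent.append((user, reply))


async def handle_webhook(queue: asyncio.Queue, payload: dict) -> int:
    """Acknowledge at once; the slow work happens in the worker."""
    if payload["id"] not in seen_ids:   # ignore retries of the same message
        seen_ids.add(payload["id"])
        queue.put_nowait(payload)
    return 200


async def worker(queue: asyncio.Queue) -> None:
    while True:
        payload = await queue.get()
        try:
            # Status check and typing indicator run concurrently.
            store_open, _ = await asyncio.gather(
                check_store_status(),
                send_typing_indicator(payload["from"]),
            )
            reply = await run_agent(payload["from"], payload["text"], store_open)
            await send_message(payload["from"], reply)
        finally:
            queue.task_done()


async def demo():
    seen_ids.clear()
    sent.clear()
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(worker(queue))
    msg = {"id": "wamid.1", "from": "15550001", "text": "table for two"}
    first = await handle_webhook(queue, msg)
    retry = await handle_webhook(queue, msg)   # simulated retry of the same message
    await queue.join()                         # wait for background processing
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task
    return [first, retry], list(sent)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the webhook handler never waits on the agent, so the acknowledgement time no longer depends on how many tools run, and a retried message with the same ID is acknowledged but not processed twice. What I am unsure about is how best to map this onto n8n workflows.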
Note: My workflow JSON was too large to paste; I have attached the file to this post. Any help would be appreciated!