I am using this workflow to receive messages and audio from my site, then use them to generate a response and send it back to my site. But if I ask something like: “Do a resume of all the archives you have”, the response takes 20 seconds or more, and I want to reduce this time.
Hi, I know what you are trying to do because I have also tried to optimize this. There is not much you can do, because the latency depends on how many tokens are generated per request. What you can do is analyze which node adds the most latency.
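If it helps, here is a rough sketch of that kind of measurement done outside the workflow tool, in Python. The `timed` helper and the step names (`load_memory`, `generate_response`) are made up for illustration, and the `time.sleep` calls only stand in for the real work, so treat it as a starting point rather than the actual workflow.

```python
import time
from contextlib import contextmanager

# Hypothetical helper: accumulates elapsed seconds per named step.
timings: dict[str, float] = {}


@contextmanager
def timed(step: str):
    """Record how long the wrapped block takes under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[step] = timings.get(step, 0.0) + elapsed


def slowest_step(results: dict[str, float]) -> str:
    """Return the name of the step with the largest total elapsed time."""
    return max(results, key=results.get)


# Placeholder steps (assumptions): replace the sleeps with the real calls,
# e.g. the memory lookup and the model request.
with timed("load_memory"):
    time.sleep(0.05)

with timed("generate_response"):
    time.sleep(0.01)

# Print steps from slowest to fastest.
for step, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{step}: {seconds:.3f}s")
```

Once you know which step dominates, you can decide whether it is worth reducing what that step loads or generates.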
I need to use memory to respond, and basically all of the time is spent loading the memory from Postgres and using it to formulate my response. I don't know how to solve this, but I will try. Thanks.