Hey guys, could anyone advise me? I want to create an agent for data analysis that can handle ETL processes, pivot tables, and similar tasks. I’m currently in the analysis phase of my workflow. I’m passing data to my agent (I’m using Gemini because it’s free), and when I ask questions about the data, the agent sometimes answers correctly and sometimes doesn’t. I think it’s because the LLM isn’t optimal. Are there other powerful and free LLMs for data science? Please help me!
Hey @Hanen_Goubaa hope your day is going well!
I think it is expected that Gemini is not 100% right 100% of the time. An LLM is not an almighty, all-knowing being, but a complex mathematical system that predicts which token (roughly, a piece of a word) comes next in order to compile a somewhat relevant answer.
The biggest problem in my experience with getting the right answer from an LLM is not that it isn’t smart enough, but rather that it wasn’t asked the right question (it wasn’t told what the expected answer is or could be).
Let me give you an example. I’ve seen many posts where people say “the AI agent is not working very well”. Then I look at their prompts and see something like “Here is text: “…text…”. Make it better.” This is the most simplistic example of a terrible prompt, one that will produce unpredictable output.
I am not saying this is what you are doing, but I strongly suggest you try to nail the questions first, before shopping for a different model.
Thank you very much @jabbson, everything you said is correct, and I have noted that the prompt and the way we ask the question play a crucial role in the results. I have repeatedly modified my prompts and gotten many different answers: sometimes correct, sometimes not, sometimes incomplete, and sometimes no answer at all. What I’m really looking for is a way to minimize the error rate, because I’m working on a critical project.
So honestly, my agent is not working properly even though I try to give it the data in the right format, with a good, simple, and detailed prompt, and proper guidance. Yet, there are catastrophic errors in the responses when processing data, and I don’t know what to do.
In a lot of cases, further massaging and refining the prompt is really all we can do.
If some information is missing, we can instruct the LLM to return its output in a specific format and explicitly tell it which fields we expect back. If some information is not correct, we can ask it to validate the response before it answers, or we can provide examples to set a baseline. In n8n, we would sometimes even create a second agent to analyze and verify the answer provided by the first one.
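To make the "explicit output format" idea concrete, here is a minimal sketch of checking a reply before trusting it. It assumes the prompt asked the model to answer with a JSON object; the field names (`answer`, `column_used`, `rows_considered`) and the function `validate_llm_reply` are made up for illustration and are not part of n8n or any LLM API.

```python
import json

# Hypothetical schema: these field names are illustrative assumptions only.
REQUIRED_FIELDS = {
    "answer": (int, float),
    "column_used": str,
    "rows_considered": int,
}


def validate_llm_reply(raw_reply: str) -> tuple[bool, list[str]]:
    """Return (is_valid, problems) for a reply that should be a JSON object."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        return False, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["top-level value is not a JSON object"]

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return not problems, problems
```

If the check fails, the list of problems could be sent back to the model in a retry prompt, or handed to a second agent for review, instead of passing a bad answer downstream.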
Generally speaking, for ETL, an LLM chat may not even be the best approach to begin with, especially if the task we ask of it is rather complex.
That being said, you could try Claude. It has a limited free tier, so see if it gives you better results or a better experience.
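On the point that an LLM chat may not be the best fit for ETL: one option is to do the number crunching in ordinary code and only use the model to decide which operation to run. Below is a rough sketch of a pivot-style aggregation in plain Python. The helper `pivot_sum` and the sample `sales` rows are invented for illustration; a real workflow would more likely use a dataframe library or a database query.

```python
from collections import defaultdict


def pivot_sum(rows, index, columns, values):
    """Sum `values` for every (index, columns) pair, like a basic pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[columns]] += row[values]
    # Convert to plain dicts so the result is easy to print or serialize.
    return {key: dict(inner) for key, inner in table.items()}


# Made-up sample data, purely for illustration.
sales = [
    {"region": "North", "quarter": "Q1", "amount": 100.0},
    {"region": "North", "quarter": "Q2", "amount": 150.0},
    {"region": "South", "quarter": "Q1", "amount": 80.0},
    {"region": "North", "quarter": "Q1", "amount": 20.0},
]

result = pivot_sum(sales, index="region", columns="quarter", values="amount")
print(result)
```

With this split, the model would only need to pick the arguments (for example `index="region"`), and the totals come from deterministic code, so the same question gives the same numbers every time.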
I’m thinking of using Ollama locally because I tried Claude on a small test, and it’s not better than Gemini. I’m also looking for data agent templates, but I can’t find any. A template could help me spot something that might be missing in my agent or my workflow, and I could take inspiration from it. So far, though, I haven’t found anything interesting. If you have some, please share them with me. Thanks!