Hello. In my current workflow, the AI categorizes problems reported by customers. However, it often miscategorizes them, and I’d like it to be more accurate, so I want to feed it knowledge. I was thinking of adding past data with descriptions, for example: this is a problem that belongs to the first category, and this one belongs to the second category. Would that work? If so, how can it be done?
Have you considered RAG?
My approximate flow is the following:
1 - I have definitions for each category I want to classify texts into. Make the definitions relatively detailed and include examples.
2 - I have a backtesting system: whenever I change the definitions, I rerun the classifier on previously classified texts and see how the prompt change affected the results, so I can iterate on changes faster. (A rough sketch of what that harness can look like is below.)
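Roughly, the backtest loop can be as simple as this. The `classify()` stub, file name, and JSONL layout here are just placeholders, not my exact setup:

```python
import json

def classify(text: str, definitions: str) -> str:
    """Placeholder classifier: replace the body with your actual LLM call,
    passing `definitions` in the system prompt and `text` as the user message."""
    return "Uncategorized"  # stand-in so the sketch runs end to end

def backtest(labeled_path: str, definitions: str) -> float:
    """Re-run classification over past, already-labeled texts and report accuracy."""
    with open(labeled_path) as f:
        examples = [json.loads(line) for line in f]  # each line: {"text": ..., "label": ...}

    correct = 0
    for ex in examples:
        predicted = classify(ex["text"], definitions)
        if predicted == ex["label"]:
            correct += 1
        else:
            print(f"MISS: expected {ex['label']!r}, got {predicted!r}: {ex['text'][:80]}")
    return correct / len(examples)

# Compare runs before and after a definitions change, e.g.:
# print(backtest("past_tickets.jsonl", open("definitions_v2.txt").read()))
```

Printing the misses is the useful part: you see which categories a definitions tweak broke before shipping it.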
Also: model choice matters a lot, as does the number of reasoning tokens you give the model. If you have the budget, I currently suggest 4.6 Opus for classification, and DeepSeek R1 for price-to-value.
Great question! There are actually a few solid approaches here:
1. Few-Shot Prompting (simplest)
Add example category definitions directly to your system prompt. E.g., “Here are examples of Category A: [examples]. Category B: [examples].” This works surprisingly well and requires no extra infrastructure, just more prompt tokens.
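For example, a few-shot classification prompt might be assembled like this (the category names and examples are made up):

```python
CATEGORY_DEFINITIONS = """\
You classify customer-reported problems into exactly one category.

Category: Billing
Definition: Problems about charges, invoices, or refunds.
Examples:
- "I was charged twice for my subscription this month."
- "My invoice shows the wrong VAT number."

Category: Login
Definition: Problems signing in or accessing an account.
Examples:
- "I reset my password but the link never arrives."

Respond with only the category name.
"""

def build_messages(ticket_text: str) -> list[dict]:
    # Few-shot prompting: the definitions and examples ride along in the system prompt,
    # and the ticket to classify goes in as the user message.
    return [
        {"role": "system", "content": CATEGORY_DEFINITIONS},
        {"role": "user", "content": ticket_text},
    ]
```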
2. RAG (most flexible)
As @kjooleng mentioned, store your past classifications + definitions in a vector DB and inject relevant examples dynamically. This scales when you have lots of edge cases.
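A minimal sketch of the retrieval step, assuming some `embed()` function from whichever embeddings provider you use (the stub below just fakes vectors so the snippet runs):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: call your embeddings provider here and return a vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))  # fake, deterministic-per-run stub
    return rng.random(64)

# Past classified tickets you already trust: (text, label) pairs.
PAST = [
    ("I was charged twice this month", "Billing"),
    ("Password reset email never arrives", "Login"),
]
PAST_VECTORS = np.stack([embed(t) for t, _ in PAST])

def nearest_examples(ticket_text: str, k: int = 3) -> list[tuple[str, str]]:
    """Return the k most similar past tickets by cosine similarity."""
    q = embed(ticket_text)
    sims = PAST_VECTORS @ q / (np.linalg.norm(PAST_VECTORS, axis=1) * np.linalg.norm(q))
    best = np.argsort(-sims)[:k]
    return [PAST[i] for i in best]

# The retrieved (text, label) pairs then get appended to the classification prompt
# as "here are similar past tickets and how they were categorized".
```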
3. Fine-tuning (best accuracy, not always needed)
If you have 100+ examples, fine-tuning a smaller model (like GPT-4o mini) can be worth it. But Opus with few-shot often outperforms, and it’s cheaper to just upgrade the model than fine-tune.
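If you do go that route, the training set is usually just your past (text, label) pairs reshaped into chat transcripts. Something like this (chat-message JSONL is a common convention, but check your provider's exact format):

```python
import json

def to_finetune_jsonl(examples: list[dict], out_path: str) -> None:
    """Convert labeled tickets into chat-format JSONL records for fine-tuning."""
    with open(out_path, "w") as f:
        for ex in examples:
            record = {
                "messages": [
                    {"role": "system", "content": "Classify the customer problem into one category."},
                    {"role": "user", "content": ex["text"]},
                    {"role": "assistant", "content": ex["label"]},
                ]
            }
            f.write(json.dumps(record) + "\n")

# to_finetune_jsonl([{"text": "I was charged twice", "label": "Billing"}], "train.jsonl")
```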
Milan makes a key point about model choice — Opus really does shine here. And adding reasoning tokens (o1-style thinking) helps catch edge cases. We’ve found that combining detailed category definitions + good examples in the prompt beats most other approaches.
Which route are you leaning toward?