How to Feed AI with Data to better categorize responses

Hello In my current workflow, the AI ​​categorizes problems reported by customers. However, the AI ​​often miscategorizes these problems, and I’d like it to be more accurate. So, I’d like to feed it knowledge. I was thinking about adding data from the past with the description for example: this is a problem that concerns the first category, and this is a problem that concerns the second category Would that work? If so, how can it be done?

Have you considered RAG?

My approximate flow is the following:

1 - I have definitions for each category that I want to classify them to. Make them relatively detailed and with examples.
2 - I have a backtesting system that when I change something with the definitions, I can rerun it on past classified texts. I can take a look at how the prompt change affected it so I can experiment with changes quicker.

Also: model choice matters a lot! As well as the amount of reasoning token you give them. If you have the budget, I currently suggest 4.6 Opus for classification. Deepseek R1 for price-to-value.

Great question! There are actually a few solid approaches here:

1. Few-Shot Prompting (simplest)
Add example category definitions directly to your system prompt. E.g., “Here are examples of Category A: [examples]. Category B: [examples].” This works surprisingly well and costs nothing extra.

2. RAG (most flexible)
As @kjooleng mentioned, store your past classifications + definitions in a vector DB and inject relevant examples dynamically. This scales when you have lots of edge cases.

3. Fine-tuning (best accuracy, not always needed)
If you have 100+ examples, fine-tuning a smaller model (like GPT-4o mini) can be worth it. But Opus with few-shot often outperforms, and it’s cheaper to just upgrade the model than fine-tune.

Milan makes a key point about model choice — Opus really does shine here. And adding reasoning tokens (o1-style thinking) helps catch edge cases. We’ve found that combining detailed category definitions + good examples in the prompt beats most other approaches.

Which route are you leaning toward?

2 Likes