Describe the problem/error/question
I have a list of text such as the following:
[
  {
    "ID": 1234,
    "Text": "sample text 1"
  },
  {
    "ID": 5678,
    "Text": "sample text 2"
  }
]
I also have a list of categories such as the following:
[
  {
    "Code": "ABC",
    "Description": "Sample Description 1"
  },
  {
    "Code": "MNO",
    "Description": "Sample Description 2"
  },
  {
    "Code": "XYZ",
    "Description": "Sample Description 3"
  }
]
The end goal is to assign the best-suited category to each text, carrying ID and Code through the process for later use.
Sample output, assuming the LLM determines that categories 2 and 1 are the best matches for texts 1234 and 5678 respectively:
[
  {
    "ID": 1234,
    "Text": "sample text 1",
    "Assigned Code": "MNO",
    "Assigned Description": "Sample Description 2"
  },
  {
    "ID": 5678,
    "Text": "sample text 2",
    "Assigned Code": "ABC",
    "Assigned Description": "Sample Description 1"
  }
]
While I am aware of the text classification node, I would like to implement a different approach using pair-wise comparison and an LLM.
Essentially, we start by assigning the first category to every text. Then we select the first text and ask the LLM the question:
For this {{Text}} is Category 2 a better fit than {{Assigned Description}}?
If the answer is yes, we update the assigned category to category 2. If the answer is no, we keep category 1 as the assigned category. We then compare the currently assigned category against category 3, and repeat until all categories have been covered. After that, we run the same process for the next Text entry.
Just in case it helps someone, a rough Python example would look somewhat like this:
# Assume texts holds the JSON for texts to classify and categories the one for categories
for text_entry in texts:
    # Start by setting the first category as the assigned one
    text_entry["Assigned Code"] = categories[0]["Code"]
    text_entry["Assigned Description"] = categories[0]["Description"]
    # Iterate over the remaining categories and update the assigned one when a better one is found
    for category in categories[1:]:
        prompt = (
            f"Is the category {category['Description']} a better fit than the category "
            f"{text_entry['Assigned Description']} for the text '{text_entry['Text']}'?"
        )
        if LLM_call(prompt):  # LLM_call returns a boolean
            text_entry["Assigned Code"] = category["Code"]
            text_entry["Assigned Description"] = category["Description"]
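To check the loop logic without calling a real model, the comparison can be passed in as a function. The version below is only a sketch: `assign_categories`, `is_better`, `PREFERRED`, and `fake_judge` are hypothetical names, and the stand-in judge simply hard-codes the outcome from the sample output above instead of asking an LLM.

```python
def assign_categories(texts, categories, is_better):
    """Sequential pairwise comparison.

    is_better(text, challenger_description, current_description) -> bool
    """
    results = []
    for entry in texts:
        best = categories[0]  # start with the first category
        for challenger in categories[1:]:
            if is_better(entry["Text"], challenger["Description"], best["Description"]):
                best = challenger  # the challenger becomes the assigned category
        # Copy the entry so ID and Text are carried along unchanged
        results.append({
            **entry,
            "Assigned Code": best["Code"],
            "Assigned Description": best["Description"],
        })
    return results


texts = [
    {"ID": 1234, "Text": "sample text 1"},
    {"ID": 5678, "Text": "sample text 2"},
]
categories = [
    {"Code": "ABC", "Description": "Sample Description 1"},
    {"Code": "MNO", "Description": "Sample Description 2"},
    {"Code": "XYZ", "Description": "Sample Description 3"},
]

# Hypothetical stand-in for the LLM: a fixed preference per text
PREFERRED = {
    "sample text 1": "Sample Description 2",
    "sample text 2": "Sample Description 1",
}


def fake_judge(text, challenger, current):
    # Answers "yes" only when the challenger is the preferred category
    return challenger == PREFERRED[text]


assigned = assign_categories(texts, categories, fake_judge)
```

With the stand-in judge, text 1234 ends up with code MNO and text 5678 keeps ABC, matching the sample output. Swapping `fake_judge` for a function that calls the LLM should leave the rest unchanged.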
I am really struggling to translate this to n8n terms. I cannot get the nested loops to work, and I also do not know how to keep track of the latest assigned category and code for each text.
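One idea for avoiding the nested loops, as a sketch only: build a flat list of (text, category) comparisons up front and walk it in a single loop, keeping the current winner for each text in a dictionary keyed by ID. Whether this maps cleanly onto n8n nodes is untested. `build_pairs`, `run_flat`, and `judge` are hypothetical names, and `judge` compares description strings instead of calling an LLM.

```python
def build_pairs(texts, categories):
    """One (text, challenger) pair per comparison, grouped by text, in category order."""
    return [(t, c) for t in texts for c in categories[1:]]


def run_flat(texts, categories, is_better):
    # State: the currently assigned category for each text, keyed by ID
    assigned = {t["ID"]: categories[0] for t in texts}
    # Single loop over the flat list of comparisons
    for text, challenger in build_pairs(texts, categories):
        current = assigned[text["ID"]]
        if is_better(text["Text"], challenger["Description"], current["Description"]):
            assigned[text["ID"]] = challenger
    return [
        {
            **t,
            "Assigned Code": assigned[t["ID"]]["Code"],
            "Assigned Description": assigned[t["ID"]]["Description"],
        }
        for t in texts
    ]


texts = [
    {"ID": 1234, "Text": "sample text 1"},
    {"ID": 5678, "Text": "sample text 2"},
]
categories = [
    {"Code": "ABC", "Description": "Sample Description 1"},
    {"Code": "MNO", "Description": "Sample Description 2"},
    {"Code": "XYZ", "Description": "Sample Description 3"},
]


def judge(text, challenger, current):
    # Hypothetical stand-in for the LLM: prefers the alphabetically later description
    return challenger > current


pairs = build_pairs(texts, categories)
flat_result = run_flat(texts, categories, judge)
```

The point of the design is that each comparison only needs the text ID to look up and update the current assignment, so the order of texts no longer has to be tracked by an outer loop.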
Any help would be much appreciated.
Information on your n8n setup
- n8n version: 1.91.3
- Database (default: SQLite): default: SQLite
- n8n EXECUTIONS_PROCESS setting (default: own, main): default: own, main
- Running n8n via (Docker, npm, n8n cloud, desktop app): n8n cloud
- Operating system: Windows 11