I think your best bet is to try something other than Llama, as the Llama models don't seem to work very well with structured output (which this task requires).
See some examples here - all via OpenRouter, but some of these are definitely available on Ollama as well, depending on your hardware. I tried to stick to small models for the sake of local performance.
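For reference, here's a minimal sketch of the kind of structured-output call involved, using Ollama's Python client and its JSON-schema `format` parameter. The model tag and the `Task` schema are just placeholders for illustration, not the actual task from this thread:

```python
# Minimal sketch: structured output with Ollama's Python client.
# Assumes `pip install ollama pydantic` and a model already pulled locally.
from ollama import chat
from pydantic import BaseModel

# Hypothetical schema -- replace with whatever structure your task needs.
class Task(BaseModel):
    title: str
    priority: int
    tags: list[str]

response = chat(
    model='llama3.2',  # placeholder tag; try e.g. 'gemma2:9b' if Llama struggles
    messages=[{
        'role': 'user',
        'content': 'Extract the task from: "Urgent: fix the login bug (backend)".',
    }],
    format=Task.model_json_schema(),  # constrains the reply to this JSON schema
)

# The reply is a JSON string matching the schema; validate it into the model.
task = Task.model_validate_json(response.message.content)
print(task)
```

Pydantic's `model_json_schema()` is just a convenient way to build the schema; a hand-written JSON-schema dict works too. Models differ a lot in how reliably they fill a schema like this, which is the failure mode I'm describing above.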
I've been using Llama 3.2 via Ollama locally on a 3-year-old laptop with an AMD GPU.
For some things, it seems to be working ok.
Now I've tried Gemma 2 9B and it's working out of the box. I'll see whether my laptop can handle the Llama 8B model and, if not, go with the online versions.
I’d like to have it running offline, but I recognize my hardware limitations as well.