I’ve seen this on other models as well, but with DeepSeek I can’t ever get it to work, on either the 14b or the 32b variant. I’m running this on AWS, currently on a g5.2xlarge.
I don’t have any useful error message. The one thing I do notice is that the model appears to be reloading itself while processing a prompt. The model loads without problems according to the Ollama logs, and I can use it just fine through Open WebUI, so I believe Ollama itself is fine.
I’ve narrowed the issue down to the Output Format option being set to json. When I remove it, I do get output, but it fails to parse because of the `<think>` reasoning text the model emits. If I set it back to json, I never get the complete response.
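As a workaround while the json format is broken, the reasoning block can be stripped client-side and the remainder parsed as JSON. A minimal sketch, assuming the model wraps its reasoning in `<think>...</think>` tags (as DeepSeek-R1 distills do) and that the rest of the reply is the JSON payload; `parse_model_reply` is a hypothetical helper, not part of any Ollama API:

```python
import json
import re

def parse_model_reply(raw: str) -> dict:
    """Strip the <think>...</think> reasoning block, then parse the rest as JSON.

    Workaround sketch only: assumes the model emits exactly one reasoning
    block and that everything after it is valid JSON.
    """
    # DOTALL so the pattern matches across the newlines inside the block.
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL).strip()
    return json.loads(cleaned)

# Example reply shaped like what the model returns without Output Format set.
reply = '<think>\nWorking out the answer...\n</think>\n{"answer": 42}'
print(parse_model_reply(reply))  # {'answer': 42}
```

This sidesteps the Output Format option entirely, at the cost of trusting the model to emit parseable JSON after the reasoning block.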