When the Basic LLM Chain finishes executing, it returns only the “output” field with the text response. It would be helpful to be able to choose between receiving just the answer and the full response, which includes additional information such as token usage.
For example, the full response looks like this:
```json
[
  {
    "response": {
      "generations": [
        [
          {
            "text": "```json\n{\"output\":[{\"text\":\"Tracking\"}]}\n```",
            "generationInfo": {
              "finish_reason": "stop"
            }
          }
        ]
      ]
    },
    "tokenUsage": {
      "completionTokens": 13,
      "promptTokens": 2466,
      "totalTokens": 2479
    }
  }
]
```
This would let users select between a simple answer and detailed data, including token metrics. The main goal is to be able to track token usage and investigate cases where more tokens than usual are consumed.
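To illustrate the tracking use case, here is a minimal TypeScript sketch of what downstream handling could look like, assuming the full response were exposed in the shape shown above. The interface names, `summarizeUsage`, and the `HIGH_USAGE_THRESHOLD` value are illustrative assumptions, not part of n8n:

```typescript
// Shape of one entry in the full response, mirroring the JSON example above.
interface TokenUsage {
  completionTokens: number;
  promptTokens: number;
  totalTokens: number;
}

interface FullResponseEntry {
  response: {
    generations: { text: string; generationInfo?: { finish_reason?: string } }[][];
  };
  tokenUsage: TokenUsage;
}

// Illustrative threshold for "more tokens than usual".
const HIGH_USAGE_THRESHOLD = 3000;

// Extract the plain answer plus token metrics, flagging unusual consumption.
function summarizeUsage(entries: FullResponseEntry[]) {
  return entries.map((entry) => ({
    text: entry.response.generations[0]?.[0]?.text ?? "",
    ...entry.tokenUsage,
    unusuallyHigh: entry.tokenUsage.totalTokens > HIGH_USAGE_THRESHOLD,
  }));
}
```

Run against the example payload above, this would report `totalTokens: 2479` with `unusuallyHigh: false`, and any run crossing the threshold would stand out immediately for investigation.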