What are the key differences between training a large language model from scratch and fine-tuning a pre-trained model for a specific task?

I’m trying to understand the trade-offs between building a language model from the ground up versus adapting an existing one (like GPT or BERT) for a domain-specific application. What are the main considerations in terms of data requirements, computational cost, performance, and flexibility?