https://arxiv.org/abs/2310.06694

Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

The popularity of LLaMA (Touvron et al., 2023a;b) and other recently emerged moderate-sized large language models (LLMs) highlights the potential of building smaller yet powerful LLMs. Regardless, the cost of training such models from scratch on trillions ..

Models like LLaMA are small yet powerful ..