**Abstract**
Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a
👤 Authors: 徐欧阳, Deyi Liu, Yuhang Cai, Jing Liu, Yuan Yang, Chen Zheng, Thomas Hartvigsen, 马怡媛
---
🔗 **[LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws](https://arxiv.org/abs/2605.23901v1)**
🏷️ Source: arXiv cs.AI
⏱️ 2026-05-26 08:00