**摘要**
当训练后的语言模型在推理问题上失败时,常见的测试时间缩放响应是在额外的尝试上花费更多的计算,并且失败的痕迹不再发挥作用。我们认为这放弃了一个关键的信号;一些失败来自不幸的抽样,更多的推出有所帮助,而另一些则是结构性的,无论预算如何,都会抵制重新抽样。我们建议失败的TRA
👤 作者: Nizar Islah, Istabrak Abbes, Irina Rish, Sarath Chandar, Eilif B. Muller
---
🔗 **[Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them)](https://arxiv.org/abs/2606.05145v1)**
> Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them)
🏷️ 来源: ArXiv cs.AI
⏱️ 2026-06-04 14:01
news
Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them)
加载回复中...