**Abstract**
Conditioning a language model on additional context, such as feedback on a previous attempt, typically improves its response. Self-distillation trains the model to retain this improvement when the context is not present. The method works by matching the model's output distribution under two settings: a student that sees only the question, and a self-teacher that also sees the context. What the mod
👤 Authors: Semih Kara, Oğuzhan Ersoy
---
🔗 **[The Role of Feedback Alignment in Self-Distillation](https://arxiv.org/abs/2606.11173v1)**
🏷️ Source: arXiv cs.AI
⏱️ 2026-06-10 14:00
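
The abstract describes self-distillation as matching the model's output distribution in two settings: a student that sees only the question, and a self-teacher that also sees the context. The sketch below illustrates that idea on a single next-token distribution. It is not the paper's implementation: the toy logits, the helper names `softmax` and `kl_divergence`, and the choice of KL(teacher || student) as the matching objective are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a 4-token vocabulary (made-up numbers).
student_logits = [1.0, 0.5, 0.2, -1.0]   # model conditioned on the question only
teacher_logits = [2.5, 0.1, -0.3, -1.5]  # same model, also conditioned on the context

student_probs = softmax(student_logits)
teacher_probs = softmax(teacher_logits)

# Assumed objective: pull the student toward the self-teacher.
# In training, the teacher side would typically be treated as a fixed target.
loss = kl_divergence(teacher_probs, student_probs)
print(f"distillation loss: {loss:.4f}")
```

The loss is zero only when the two distributions coincide, so minimizing it pushes the question-only student toward the behavior the model shows when the context is available.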