**Abstract**
Training reinforcement learning (RL) policies from scratch is costly: it requires careful reward and environment design, extensive tuning, and substantial computation. Yet many control problems already have a functional but suboptimal policy available as a baseline. This paper proposes a method for embedding such a baseline into the RL training process, simultaneously improving trainin
👤 Authors: Anton Bolychev, Georgiy Malaniya, Sinan Ibrahim, Pavel Osinenko
---
🔗 **[An Agency-Transferring Model-Free Policy Enhancement Technique](https://arxiv.org/abs/2606.09825v1)**
🏷️ Source: arXiv cs.AI
⏱️ 2026-06-09 14:00