**Abstract**
Robot manipulation alternates between low-risk transit phases that call for fast execution and high-risk contact stages that demand slow, precise motion. Yet existing Vision-Language-Action models (VLAs) only inherit a single fixed speed from training demonstrations. Prior efforts to accelerate VLAs through model compression, KV-cache reuse, or reinforcement learning only shift the policy from one
👤 Authors: Dong Jing, Jingchen Nie, Tianqi Zhang, Jiaqi Liu, Huaxiu Yao, Zhiwu Lu, Mingyu Ding
---
🔗 **[TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies](https://arxiv.org/abs/2606.06491v1)**
🏷️ Source: arXiv cs.AI
⏱️ 2026-06-05 14:00