**Abstract**
High-quality 3D scene reconstruction has recently advanced toward generalizable feed-forward architectures, enabling the generation of complex environments in a single forward pass. However, despite their strong performance in static scene perception, these models remain limited in responding to dynamic human instructions, which restricts their use in interactive applications. Existing editing met
👤 Authors: Kaixin Zhu, Yiwen Tang, Yifan Yang, Renrui Zhang, Bohan Zeng, Ziyu Guo, Ruichuan An, Zhou Liu, Qizhi Chen, Delin Qu, Jaehong Yoon, Wentao Zhang
---
🔗 **[VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction](https://arxiv.org/abs/2605.15186v1)**
🏷️ Source: arXiv cs.AI
⏱️ 2026-05-16 08:01