Yilun Chen
3 papers
Robotics
VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis
VistaBot improves the view robustness of robot manipulation by combining geometric models with video diffusion for closed-loop control.
2604.21914
Robotics
PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance
PokeVLA is a lightweight Vision-Language-Action model that improves robot manipulation by integrating comprehensive world knowledge and spatial awareness.
2604.20834
Robotics
StarVLA-$α$: Reducing Complexity in Vision-Language-Action Systems
StarVLA-$α$ simplifies Vision-Language-Action systems, offering a strong baseline that reduces complexity while achieving competitive performance across diverse benchmarks.
2604.11757