ArXiv TLDR

ALAS: Adaptive Long-Horizon Action Synthesis via Async-pathway Stream Disentanglement

2604.20721

Yutong Shen, Hangxu Liu, Lei Zhang, Penghui Liu, Yinqi Liu + 2 more

cs.RO

TLDR

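ALAS splits environment understanding and self-state-driven skill execution into two separate streams for long-horizon human-scene interaction tasks, reporting average gains of 23% in subtask success rate and 29% in execution efficiency over existing methods.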

Key contributions

  • ALAS framework for long-horizon human-scene interaction tasks via dual-stream disentanglement.
  • Environment module for spatial understanding enables cross-domain transfer.
  • Skill module for task execution enables cross-skill transfer via motor pattern encoding.
  • Improves average subtask success rate by 23% and average execution efficiency by 29% compared with existing methods.

Why it matters

Existing methods for long-horizon tasks chain pre-trained subtasks and tightly couple environment observations with self-state, so they generalize poorly to new combinations of environments and skills. ALAS disentangles the two into separate streams, which is intended to let it generalize to unseen environment-skill combinations. The paper reports an average 23% gain in subtask success rate and a 29% gain in execution efficiency over existing methods in human-scene interaction tasks.
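To make the idea of separating environment and self-state concrete, here is a minimal Python sketch. It is not the ALAS implementation: the class `DualStreamPolicy`, the helpers `make_linear`, `apply_linear`, and `build_policy`, the random linear "encoders", and all dimensions are hypothetical stand-ins chosen only for illustration. The point it shows is structural: the environment latent is computed from scene observations alone and the skill latent from self-state alone, and the two meet only at the action head.

```python
import random
from dataclasses import dataclass


def make_linear(in_dim, out_dim, seed):
    """Random (out_dim x in_dim) weight matrix; a stand-in for a trained encoder."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


@dataclass
class DualStreamPolicy:
    """Hypothetical two-stream policy: one encoder per input source."""
    env_weights: list    # sees scene observations only
    skill_weights: list  # sees self-state only
    head_weights: list   # maps the concatenated latents to an action

    def encode_env(self, scene_obs):
        # Environment stream: never receives self-state.
        return apply_linear(self.env_weights, scene_obs)

    def encode_skill(self, self_state):
        # Skill stream: never receives scene observations.
        return apply_linear(self.skill_weights, self_state)

    def act(self, scene_obs, self_state):
        # The two latents are combined only here, at the action head.
        latent = self.encode_env(scene_obs) + self.encode_skill(self_state)
        return apply_linear(self.head_weights, latent)


def build_policy(scene_dim, state_dim, latent_dim, action_dim):
    """Assemble a policy with illustrative, untrained weights."""
    return DualStreamPolicy(
        env_weights=make_linear(scene_dim, latent_dim, seed=0),
        skill_weights=make_linear(state_dim, latent_dim, seed=1),
        head_weights=make_linear(2 * latent_dim, action_dim, seed=2),
    )


if __name__ == "__main__":
    policy = build_policy(scene_dim=4, state_dim=3, latent_dim=2, action_dim=5)
    action = policy.act([0.1, 0.2, 0.3, 0.4], [1.0, 0.0, 0.5])
    print(len(action))
```

Because each encoder has its own input, an environment encoder fitted in one scene could in principle be paired with a skill encoder fitted on a different task, which is the kind of recombination that tightly coupled inputs make difficult.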

Original Abstract

Long-Horizon (LH) tasks in Human-Scene Interaction (HSI) are complex multi-step tasks that require continuous planning, sequential decision-making, and extended execution across domains to achieve the final goal. However, existing methods heavily rely on skill chaining by concatenating pre-trained subtasks, with environment observations and self-state tightly coupled, lacking the ability to generalize to new combinations of environments and skills, failing to complete various LH tasks across domains. To solve this problem, this paper presents ALAS, a cross-domain learning framework for LH tasks via biologically inspired dual-stream disentanglement. Inspired by the brain's "where-what" dual pathway mechanism, ALAS comprises two core modules: i) an environment learning module for spatial understanding, which captures object functions, spatial relationships, and scene semantics, achieving cross-domain transfer through complete environment-self disentanglement; ii) a skill learning module for task execution, which processes self-state information including joint degrees of freedom and motor patterns, enabling cross-skill transfer through independent motor pattern encoding. We conducted extensive experiments on various LH tasks in HSI scenes. Compared with existing methods, ALAS can achieve an average subtasks success rate improvement of 23% and average execution efficiency improvement of 29%.
