Hang Xu

4 papers · Latest: May 12, 2026

OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation

OmniNFT proposes a novel diffusion RL framework to improve joint audio-video generation by addressing multi-modal challenges like gradient imbalance.

2605.12480 · May 12, 2026

Computer Vision

SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation

SCOPE is a framework that uses structured decomposition and conditional skill orchestration to maintain semantic commitments for complex text-to-image generation.

2605.08043 · May 8, 2026

Robotics

Shared Autonomy Assisted by Impedance-Driven Anisotropic Guidance Field

IAGF-SA enhances shared autonomy by allowing robots to intuitively communicate their intent to humans via an impedance-driven physical guidance field.

2605.02410 · May 4, 2026

Natural Language Processing

Baichuan 2: Open Large-scale Language Models

Baichuan 2 is a series of large-scale, open-source multilingual language models that achieve state-of-the-art performance across general and specialized benchmarks.

2309.10305 · Sep 19, 2023
