Jie Huang
4 papers · Latest:
Computer Vision
OmniNFT: Modality-wise Omni Diffusion Reinforcement for Joint Audio-Video Generation
OmniNFT proposes a novel diffusion RL framework to improve joint audio-video generation by addressing multi-modal challenges like gradient imbalance.
2605.12480
Computer Vision
SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation
SCOPE is a framework that uses structured decomposition and conditional skill orchestration to maintain semantic commitments for complex text-to-image generation.
2605.08043
Computer Vision
IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
IR-Flow unifies discriminative and generative image restoration using Rectified Flow for efficient, high-quality results across various degradations.
2604.19680
Statistical Machine Learning
Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks
This paper sharply describes local minima in two-layer ReLU networks' loss landscape, linking them to SGD dynamics and revealing a hierarchical structure.
2604.09412