Yan Li
3 papers · Latest:
Computer Vision
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture
SenseNova-U1 introduces a unified architecture (NEO-unify) that seamlessly integrates multimodal understanding and generation, outperforming specialized VLMs.
2605.12500
Information Retrieval
From Trajectories to Phenotypes: Disease Progression as Structural Priors for Multi-organ Imaging Representation Learning
A new framework distills disease trajectory knowledge into imaging models, significantly improving disease prediction, especially for rare conditions.
2605.11958
Computer Vision
MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation
MM-WebAgent is a hierarchical multimodal agent that generates coherent and visually consistent webpages by coordinating AIGC elements through planning and self-reflection.
2604.15309