Junbin Xiao
2 papers · Latest:
Computer Vision
Audio-Visual Intelligence in Large Foundation Models
This survey provides the first comprehensive review of Audio-Visual Intelligence (AVI) in large foundation models, unifying tasks, methods, and challenges.
2605.04045
Computer Vision
Ego-Grounding for Personalized Question-Answering in Egocentric Videos
This paper introduces MyEgo, a new egocentric video QA dataset, revealing that current MLLMs struggle with personalized ego-grounding and long-term memory.
2604.01966