Shukang Yin
3 papers · Latest:
Natural Language Processing
SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation
SpeechParaling-Bench is a new benchmark for evaluating paralinguistic-aware speech generation in LALMs, using fine-grained features and a novel LALM-based judge.
2604.20842
Computer Vision
Tango: Taming Visual Signals for Efficient Video Large Language Models
Tango optimizes token pruning in Video LLMs by improving attention selection and similarity clustering, achieving significant speedup with minimal performance loss.
2604.09547
Computer Vision
A Survey on Multimodal Large Language Models
This paper surveys recent advances in Multimodal Large Language Models (MLLMs), highlighting their architectures, training, capabilities, and future research directions.
2306.13549