Lorenzo Vaquero
2 papers · Latest:
Computer Vision
Training-Free Semantic Multi-Object Tracking with Vision-Language Models
TF-SMOT introduces a training-free semantic multi-object tracking pipeline using pre-trained vision-language models for improved video summaries and captions.
2604.14074
Computer Vision
Towards Unconstrained Human-Object Interaction
This paper introduces Unconstrained Human-Object Interaction (U-HOI), leveraging MLLMs to detect interactions without predefined vocabularies.
2604.14069