Lorenzo Vaquero

2 papers · Latest: April 15, 2026

Training-Free Semantic Multi-Object Tracking with Vision-Language Models

TF-SMOT introduces a training-free semantic multi-object tracking pipeline using pre-trained vision-language models for improved video summaries and captions.

2604.14074 · Apr 15, 2026

Computer Vision

Towards Unconstrained Human-Object Interaction

This paper introduces Unconstrained Human-Object Interaction (U-HOI), leveraging MLLMs to detect interactions without predefined vocabularies.

2604.14069 · Apr 15, 2026
