Tao Feng

4 papers · Latest: May 11, 2026

MMVIAD: Multi-view Multi-task Video Understanding for Industrial Anomaly Detection

Introduces MMVIAD, the first continuous multi-view video dataset for industrial anomaly detection, and VISTA, a model outperforming GPT-5.4.

2605.10833 · May 11, 2026

Computer Vision

SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model

SpanVLA is an autonomous driving framework that combines efficient action planning with learning from negative-recovery samples to improve robustness and reduce latency.

2604.19710 · Apr 21, 2026

Computer Vision

CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation

CoInteract synthesizes physically consistent human-object interaction videos, improving structural stability and contact realism using a DiT with specialized experts.

2604.19636 · Apr 21, 2026

Natural Language Processing

Thought-Retriever: Don't Just Retrieve Raw Data, Retrieve Thoughts for Memory-Augmented Agentic Systems

Thought-Retriever enables LLMs to overcome context limits by retrieving and organizing past intermediate responses ('thoughts') for self-evolving long-term memory.

2604.12231 · Apr 14, 2026
