Tao Feng
4 papers · Latest:
MMVIAD: Multi-view Multi-task Video Understanding for Industrial Anomaly Detection
Introduces MMVIAD, the first continuous multi-view video dataset for industrial anomaly detection, and VISTA, a model outperforming GPT-5.4.
SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model
SpanVLA is an autonomous driving framework that combines efficient action planning with learning from negative-recovery samples to improve robustness and reduce latency.
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation
CoInteract synthesizes physically consistent human-object interaction videos, improving structural stability and contact realism using a DiT with specialized experts.
Thought-Retriever: Don't Just Retrieve Raw Data, Retrieve Thoughts for Memory-Augmented Agentic Systems
Thought-Retriever enables LLMs to overcome context limits by retrieving and organizing past intermediate responses ('thoughts') for self-evolving long-term memory.