ArXiv TLDR

Jie Zhou

7 papers · Latest:

Computer Vision

BAMI: Training-Free Bias Mitigation in GUI Grounding

BAMI is a training-free method that uses coarse-to-fine focus and candidate selection to mitigate precision and ambiguity biases in GUI grounding models.

2605.06664

UVMarvel: an Automated LLM-aided UVM Machine for Subsystem-level RTL Verification

UVMarvel automates UVM testbench creation for subsystem-level RTL verification using LLMs, significantly boosting efficiency and coverage.

2605.04704
Artificial Intelligence

Action-Aware Generative Sequence Modeling for Short Video Recommendation

A2Gen improves short video recommendations by modeling user actions as temporal sequences, leading to significant engagement boosts.

2604.25834
Computer Vision

UniGenDet: A Unified Generative-Discriminative Framework for Co-Evolutionary Image Generation and Generated Image Detection

UniGenDet is a unified generative-discriminative framework that co-evolves image generation and detection, achieving state-of-the-art performance.

2604.21904
Computer Vision

Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision

This paper introduces EgoPoint-Bench to benchmark and enhance MLLMs' understanding of pointing gestures in egocentric vision, addressing "Referential Hallucination."

2604.21461
Natural Language Processing

Ask Only When Needed: Proactive Retrieval from Memory and Skills for Experience-Driven Lifelong Agents

ProactAgent enables lifelong learning agents to proactively retrieve relevant experience and skills only when needed, improving performance and efficiency.

2604.20572
Robotics

ShapeGen: Robotic Data Generation for Category-Level Manipulation

ShapeGen generates diverse robotic manipulation data in 3D to improve category-level generalizability without requiring a simulator.

2604.15569
