Jun Li

4 papers · Latest: May 13, 2026

RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation

RealICU is a new benchmark for evaluating LLM agents on long-context ICU data, revealing recall-safety tradeoffs and anchoring biases in existing models.

2605.13542 · May 13, 2026

Computer Vision

Revisiting Uncertainty: On Evidential Learning for Partially Relevant Video Retrieval

Holmes is a hierarchical evidential learning framework that explicitly models and quantifies uncertainty in partially relevant video retrieval, outperforming state-of-the-art methods.

2605.06083 · May 7, 2026

Computer Vision

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

TICoE is a text-image collaborative framework that precisely erases undesirable concepts from generative models while preserving content fidelity.

2604.15829 · Apr 17, 2026

Cryptography & Security

Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling

SET detects backdoored inputs to text-to-image (T2I) diffusion models by scaling cross-attention, which exposes a divergence in response between benign and malicious inputs. It outperforms prior methods.

2604.12446 · Apr 14, 2026
