Jun Li
4 papers · Latest:
RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation
RealICU is a new benchmark for evaluating LLM agents on long-context ICU data, revealing recall-safety tradeoffs and anchoring biases in existing models.
Revisiting Uncertainty: On Evidential Learning for Partially Relevant Video Retrieval
Holmes introduces a hierarchical evidential learning framework that explicitly models and quantifies uncertainty in partially relevant video retrieval, outperforming state-of-the-art methods.
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration
TICoE is a text-image collaborative framework that precisely erases undesirable concepts from generative models while preserving content fidelity.
Scaling Exposes the Trigger: Input-Level Backdoor Detection in Text-to-Image Diffusion Models via Cross-Attention Scaling
SET detects backdoors in text-to-image (T2I) diffusion models by scaling cross-attention, which reveals a divergence in response between benign and malicious inputs, and it outperforms prior methods.