ArXiv TLDR

Hao Wu

10 papers · Latest:

Cryptography & Security

Usability as a Weapon: Attacking the Safety of LLM-Based Code Generation via Usability Requirements

This paper introduces UPAttack, demonstrating how usability requirements can force LLMs to generate insecure code, achieving up to 98.1% attack success.

2605.10133

Robotics

MemCompiler: Compile, Don't Inject -- State-Conditioned Memory for Embodied Agents

MemCompiler dynamically compiles state-conditioned memory for embodied agents, improving performance and efficiency over static memory injection.

2605.07594

Robotics

RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models

RoboAlign-R1 improves robot video world models by using reward-aligned post-training and stabilized long-horizon inference, boosting task consistency and realism.

2605.03821

Observation of attractor transitions in active magnon-polaritons under microwatt drives

Active magnon-polaritons enable low-power observation of attractor transitions, explosive bistability, and chaotic dynamics for new microwave applications.

2604.27668

Robotics

STARRY: Spatial-Temporal Action-Centric World Modeling for Robotic Manipulation

STARRY is a novel world model for robotic manipulation that aligns spatial-temporal prediction with action generation for improved task success.

2604.26848

Artificial Intelligence

AblateCell: A Reproduce-then-Ablate Agent for Virtual Cell Repositories

AblateCell is an AI agent that reproduces baselines and performs systematic ablations on virtual cell repositories to identify critical components.

2604.19606

Cryptography & Security

SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection

SAGE introduces Signal-Amplified Guided Embeddings to overcome "Signal Submersion" in LLM-based vulnerability detection, achieving state-of-the-art performance.

2604.19031

Natural Language Processing

C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts

C-ReD is a new Chinese benchmark for detecting AI-generated text, improving diversity and generalization over prior datasets.

2604.11796

Robotics

Stop Wandering: Efficient Vision-Language Navigation via Metacognitive Reasoning

MetaNav improves Vision-Language Navigation efficiency and robustness using metacognitive reasoning, reducing redundant exploration and VLM queries.

2604.02318

Natural Language Processing

Gemini: A Family of Highly Capable Multimodal Models

Gemini is a new family of multimodal AI models excelling in image, audio, video, and text understanding, achieving state-of-the-art results across numerous benchmarks including human-expert level on MMLU.

2312.11805
