Wenping Wang

2 papers · Latest: May 12, 2026

Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

QOED improves robot exploration by adaptively identifying and prioritizing observable parameter directions, suppressing nuisance effects for better learning.

2605.12084 · May 12, 2026

Statistical Machine Learning

DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO

DDO-RM, a new method for LLM preference optimization that uses reward-guided updates, shows improved performance over DPO on a minimal held-out benchmark.

2604.11119 · Apr 13, 2026
