Wenping Wang
2 papers · Latest:
Robotics
Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration
QOED improves robot exploration by adaptively identifying and prioritizing observable parameter directions, suppressing nuisance effects for better learning.
2605.12084
Statistical Machine Learning
DDO-RM for LLM Preference Optimization: A Minimal Held-Out Benchmark against DPO
DDO-RM, a new method for LLM preference optimization, shows improved performance over DPO on a minimal held-out benchmark using reward-guided updates.
2604.11119