Jose Blanchet
3 papers · Latest:
Machine Learning
Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback
This paper introduces Wasserstein Distributionally Robust Regret Optimization (DRRO) for RLHF to mitigate reward over-optimization, offering a less pessimistic approach.
2605.00155
Classical and Quantum Speedups for Non-Convex Optimization via Energy Conserving Descent
New stochastic and quantum Energy Conserving Descent algorithms achieve exponential speedups over gradient descent for non-convex optimization.
2604.13022
Partial Identification of Policy-Relevant Treatment Effects with Instrumental Variables via Optimal Transport
This paper uses optimal transport to derive sharper bounds for policy-relevant treatment effects, improving identification with instrumental variables.
2604.12263