Michal Valko
22 papers
Conditional outlier detection for clinical alerting
This paper presents a data-driven method for detecting anomalous patient-management actions in EHRs to alert for potential errors.
Bandits attack function optimization
This paper introduces Simultaneous Optimistic Optimization (SOO), a bandit-inspired algorithm for efficient function optimization under budget constraints.
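The core loop of SOO-style optimistic optimization can be sketched in a few lines. The following is a minimal 1-D illustration, not the paper's algorithm: it keeps a tree of cells, and in each round expands, at every depth, the best-valued cell whose value beats the best expanded so far. The function names, the ternary split, and the budget accounting are illustrative assumptions.

```python
import math

def soo_maximize(f, lo, hi, budget, K=3):
    """Minimal 1-D sketch of optimistic optimization in the spirit of SOO.

    Cells are intervals evaluated at their center; each round sweeps depths
    and expands the best cell per depth whose value exceeds the running max.
    """
    mid = (lo + hi) / 2.0
    leaves = [(0, lo, hi, f(mid))]  # (depth, left, right, value at center)
    evals = 1
    best_x, best_v = mid, leaves[0][3]
    while evals < budget:
        vmax = -math.inf
        max_depth = max(d for d, *_ in leaves)
        for h in range(max_depth + 1):
            at_h = [c for c in leaves if c[0] == h]
            if not at_h:
                continue
            cell = max(at_h, key=lambda c: c[3])
            if cell[3] > vmax:  # optimistic: only expand if it beats shallower cells
                vmax = cell[3]
                leaves.remove(cell)
                d, a, b, v = cell
                w = (b - a) / K
                for i in range(K):
                    ca, cb = a + i * w, a + (i + 1) * w
                    cm = (ca + cb) / 2.0
                    if i == K // 2:
                        cv = v  # middle child shares the parent's center: reuse
                    else:
                        cv = f(cm)
                        evals += 1
                    leaves.append((d + 1, ca, cb, cv))
                    if cv > best_v:
                        best_v, best_x = cv, cm
                if evals >= budget:
                    break
    return best_x, best_v

x_star, v_star = soo_maximize(lambda x: -(x - 0.7) ** 2, 0.0, 1.0, budget=100)
```

On this smooth unimodal example the expanded cells drill down toward the maximizer at 0.7, using only function evaluations within the budget.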
Adaptive graph-based algorithms for conditional anomaly detection and semi-supervised learning
Adaptive graph-based algorithms are introduced for semi-supervised learning and conditional anomaly detection, with an online approximation and clinical application.
Bandits on graphs and structures
This thesis explores graph and structured bandit problems, addressing practical challenges in sequential decision-making with large action spaces.
Middle-mile logistics through the lens of goal-conditioned reinforcement learning
This paper applies goal-conditioned reinforcement learning and graph neural networks to optimize parcel routing in middle-mile logistics networks.
Evolutionary feature selection for spiking neural network pattern classifiers
This paper extends evolutionary feature selection to JASTAP spiking neural networks, enabling smaller, more robust classifiers for noisy data.
Spectral bandits
This paper introduces "spectral bandits," an online learning framework for graph-based problems like recommendations, exploiting payoff smoothness on the graph and a notion of effective dimension.
Online learning with Erdős-Rényi side-observation graphs
This paper introduces two novel algorithms for multi-armed bandits with probabilistic side observations, achieving near-optimal regret bounds for unknown observation rates.
Pack only the essentials: Adaptive dictionary learning for kernel ridge regression
SQUEAK is a new algorithm for kernel ridge regression that uses adaptive dictionary learning to achieve efficient Nyström approximations with reduced space complexity.
Pliable rejection sampling
Pliable Rejection Sampling (PRS) learns proposals via kernel estimation, providing i.i.d. samples with high probability and guaranteed acceptance rates.
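PRS learns its proposal from data via kernel estimation; the accept/reject test it builds on is classical rejection sampling, sketched below with a fixed uniform proposal and a constant envelope. The target density and envelope constant here are illustrative assumptions, not the paper's construction.

```python
import math
import random

def rejection_sample(target_pdf, n, lo, hi, envelope, seed=0):
    """Draw n i.i.d. samples from target_pdf (up to normalization) on [lo, hi],
    using a uniform proposal and a constant envelope >= max of target_pdf."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        x = rng.uniform(lo, hi)          # propose uniformly on the support
        u = rng.uniform(0.0, envelope)   # uniform height under the envelope
        if u <= target_pdf(x):           # accept iff the point falls under the curve
            samples.append(x)
    return samples

# Example: standard normal shape restricted to [-3, 3]; the mode value is 1.
pdf = lambda x: math.exp(-x * x / 2.0)
xs = rejection_sample(pdf, 1000, -3.0, 3.0, envelope=1.0)
```

The acceptance rate is the ratio of the area under the target to the area under the envelope; PRS's contribution is learning a tighter proposal so that this rate is provably high.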
A single algorithm for both restless and rested rotting bandits
RAW-UCB is a novel algorithm that achieves near-optimal regret in both restless and rested rotting bandit settings, unifying previously distinct problems.
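For context on the index policies this line of work adapts, here is a minimal sketch of classical UCB1 on Bernoulli arms, which assumes stationary means; rotting-bandit algorithms such as RAW-UCB replace the full-history averages below with averages over recent pulls (details in the paper, not reproduced here). Arm means and horizon are illustrative.

```python
import math
import random

def ucb1(true_means, rounds, seed=0):
    """Classical UCB1: play each arm once, then always pull the arm with the
    highest empirical mean plus confidence bonus sqrt(2 log t / n_i)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            arm = t - 1  # initialization: one pull per arm
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

counts = ucb1([0.3, 0.5, 0.9], rounds=3000)
```

With stationary arms the index concentrates the pulls on the best arm; when means decay with pulls (the rotting setting), full-history averages mislead the index, which is the failure mode adaptive windows address.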
On two ways to use determinantal point processes for Monte Carlo integration
This paper explores and generalizes two determinantal point process (DPP) methods for Monte Carlo integration, offering improved variance rates.
Planning in entropy-regularized Markov decision processes and games
SmoothCruiser is a new planning algorithm for entropy-regularized MDPs and games, achieving $\widetilde{O}(1/\epsilon^4)$ sample complexity.
Budgeted Online Influence Maximization
This paper introduces a new budgeted framework for online influence maximization, measuring the budget by total campaign cost rather than by the number of selected influencers.
Adaptive multi-fidelity optimization with fast learning rates
Kometo is a new adaptive multi-fidelity optimization algorithm that achieves fast learning rates and improves upon prior guarantees without needing problem-specific knowledge.
Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model
This paper establishes sample complexity bounds for learning ε-optimal policies in Stochastic Shortest Path problems, revealing challenges when minimum costs are zero.
The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
New algorithms achieve optimal last-iterate convergence rates for uncoupled learning in zero-sum games with bandit feedback, where each player observes only its own realized payoffs.
Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier
This paper uses log-barrier regularization to achieve optimal $\widetilde{O}(t^{-1/4})$ last-iterate convergence in zero-sum matrix games.
Best of both worlds: Stochastic & adversarial best-arm identification
This paper introduces an algorithm for best-arm identification that performs optimally in stochastic bandit problems while remaining robust to adversarial rewards.
Online learning with noisy side observations
This paper introduces a new online learning model with noisy side observations and an efficient, parameter-free algorithm achieving $\widetilde{O}(\sqrt{\alpha^* T})$ regret.
Spectral Thompson sampling
SpectralTS efficiently solves graph bandit problems by leveraging an effective dimension, achieving comparable regret with improved computational performance.
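SpectralTS adapts Thompson sampling to the graph setting; the underlying posterior-sampling idea is easiest to see in the classical Bernoulli case, sketched below. This is the textbook algorithm with Beta(1, 1) priors, not the paper's spectral variant; arm means and horizon are illustrative.

```python
import random

def thompson_sampling(true_means, rounds, seed=0):
    """Bernoulli Thompson sampling: maintain a Beta posterior per arm,
    sample a mean estimate from each posterior, and play the argmax."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # posterior successes + 1
    beta = [1] * k   # posterior failures + 1
    pulls = [0] * k
    for _ in range(rounds):
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
```

Randomizing over the posterior balances exploration and exploitation automatically; the spectral version replaces the per-arm Beta posteriors with a posterior over smooth functions on the graph.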
The Llama 3 Herd of Models
Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.