ArXiv TLDR

Michal Valko

22 papers

Machine Learning

Conditional outlier detection for clinical alerting

This paper presents a data-driven method for detecting anomalous patient-management actions in EHRs to alert for potential errors.

2605.05124
Machine Learning

Bandits attack function optimization

This paper introduces Simultaneous Optimistic Optimization (SOO), a bandit-inspired algorithm for efficient function optimization under budget constraints.

2605.03496
Machine Learning

Adaptive graph-based algorithms for conditional anomaly detection and semi-supervised learning

Adaptive graph-based algorithms are introduced for semi-supervised learning and conditional anomaly detection, with an online approximation and clinical application.

2605.03495
Machine Learning

Bandits on graphs and structures

This thesis explores graph and structured bandit problems, addressing practical challenges in sequential decision-making with large action spaces.

2605.03493
Statistical Machine Learning

Middle-mile logistics through the lens of goal-conditioned reinforcement learning

This paper applies goal-conditioned reinforcement learning and graph neural networks to optimize parcel routing in middle-mile logistics networks.

2605.02461
Neural & Evolutionary Computing

Evolutionary feature selection for spiking neural network pattern classifiers

This paper extends evolutionary feature selection to JASTAP spiking neural networks, enabling smaller, more robust classifiers for noisy data.

2604.26654
Statistical Machine Learning

Spectral bandits

This paper introduces "spectral bandits," an online learning framework for graph-based problems like recommendations, using smooth payoffs and effective dimension.

2604.25272
Statistical Machine Learning

Online learning with Erdős-Rényi side-observation graphs

This paper introduces two novel algorithms for multi-armed bandits with probabilistic side observations, achieving near-optimal regret bounds for unknown observation rates.

2604.25271
Statistical Machine Learning

Pack only the essentials: Adaptive dictionary learning for kernel ridge regression

SQUEAK is a new algorithm for kernel ridge regression that uses adaptive dictionary learning to achieve efficient Nyström approximations with reduced space complexity.

2604.22386
Statistical Machine Learning

Pliable rejection sampling

Pliable Rejection Sampling (PRS) learns proposals via kernel estimation, providing i.i.d. samples with high probability and guaranteed acceptance rates.

2604.22385
Statistical Machine Learning

A single algorithm for both restless and rested rotting bandits

RAW-UCB is a novel algorithm that achieves near-optimal regret in both restless and rested rotting bandit settings, unifying previously distinct problems.

2604.21432
Machine Learning

On two ways to use determinantal point processes for Monte Carlo integration

This paper explores and generalizes two determinantal point process (DPP) methods for Monte Carlo integration, offering improved variance rates.

2604.19698
Machine Learning

Planning in entropy-regularized Markov decision processes and games

SmoothCruiser is a new planning algorithm for entropy-regularized MDPs and games, achieving $\widetilde{O}(1/\epsilon^4)$ sample complexity.

2604.19695
Machine Learning

Budgeted Online Influence Maximization

This paper introduces a new budgeted framework for online influence maximization that constrains the total campaign cost rather than the number of chosen influencers.

2604.19672
Statistical Machine Learning

Adaptive multi-fidelity optimization with fast learning rates

Kometo is a new adaptive multi-fidelity optimization algorithm that achieves fast learning rates and improves upon prior guarantees without needing problem-specific knowledge.

2604.16239
Machine Learning

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

This paper establishes sample complexity bounds for learning ε-optimal policies in Stochastic Shortest Path problems, revealing challenges when minimum costs are zero.

2604.16111
Machine Learning

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback

New algorithms achieve optimal last-iterate convergence rates for uncoupled learning in zero-sum games with bandit feedback, despite inherent challenges.

2604.16087
Machine Learning

Optimal last-iterate convergence in matrix games with bandit feedback using the log-barrier

This paper uses log-barrier regularization to achieve optimal $\widetilde{O}(t^{-1/4})$ last-iterate convergence in zero-sum matrix games.

2604.15242
Statistical Machine Learning

Best of both worlds: Stochastic & adversarial best-arm identification

This paper introduces an algorithm for best-arm identification that performs optimally in stochastic bandit problems while remaining robust to adversarial rewards.

2604.14860
Machine Learning

Online learning with noisy side observations

This paper introduces a new online learning model with noisy side observations and an efficient, parameter-free algorithm achieving $\widetilde{O}(\sqrt{\alpha^* T})$ regret.

2604.13740
Machine Learning

Spectral Thompson sampling

SpectralTS efficiently solves graph bandit problems by leveraging an effective dimension, achieving comparable regret with improved computational performance.

2604.13739
Artificial Intelligence

The Llama 3 Herd of Models

Llama 3 is a new family of large multilingual foundation models excelling in language, coding, reasoning, and multimodal tasks, rivaling GPT-4 in quality and offering extensive public releases.

2407.21783
