Robust Mutation Analysis of Quantum Programs Under Noise
Sophie Fortz, Eñaut Mendiluze Usandizaga, Shaukat Ali, Paolo Arcaini, Mohammad Reza Mousavi
TLDR
This paper empirically studies noise-aware mutation analysis for quantum programs, showing noise significantly impacts mutant detection.
Key contributions
- Empirically analyzes noise impact on mutant detection using 41 quantum programs on noisy simulators.
- Shows noise significantly alters the behavioral distance between programs and their mutants, making equivalent mutants harder to distinguish from real faults.
- Compares distance metrics and thresholding strategies: density-matrix metrics discriminate best but are not accessible on real hardware, while output-distribution metrics are the practical alternative.
- Demonstrates that noise-specific thresholds improve detection over noiseless thresholds, and that noise effects correlate more with algorithm and circuit characteristics than with mutation types.
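To make the idea of an output-distribution metric with a threshold concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not name the specific metrics or threshold values used, so total variation distance, the function names, the measurement counts, and both thresholds below are illustrative assumptions.

```python
# Illustrative sketch only: total variation distance (TVD) is one common
# output-distribution metric; the paper's actual metrics are not named here.

def normalize(counts):
    """Turn raw measurement counts into a probability distribution."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one shot")
    return {outcome: n / total for outcome, n in counts.items()}


def total_variation_distance(p_counts, q_counts):
    """Return the TVD between two measured output distributions (0 to 1)."""
    p = normalize(p_counts)
    q = normalize(q_counts)
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def is_killed(original_counts, mutant_counts, threshold):
    """Flag a mutant as detected when its distance exceeds the threshold."""
    return total_variation_distance(original_counts, mutant_counts) > threshold


# Hypothetical counts for a 2-qubit program and one mutant run on a noisy backend.
original = {"00": 500, "11": 500}
mutant = {"00": 480, "11": 470, "01": 30, "10": 20}

NOISELESS_THRESHOLD = 0.02  # hypothetical value calibrated without noise
NOISY_THRESHOLD = 0.08      # hypothetical value calibrated for a noisy device

print(is_killed(original, mutant, NOISELESS_THRESHOLD))
print(is_killed(original, mutant, NOISY_THRESHOLD))
```

In this made-up example the same pair of distributions is classified differently depending on which threshold is applied, which is the kind of sensitivity that motivates calibrating thresholds per noise profile.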
Why it matters
Existing mutation analysis for quantum programs assumes ideal, noiseless execution and so overlooks hardware noise. This paper shows that quantum program comparison and mutation analysis need to be adapted to the noise profiles of the target quantum devices, which makes mutant detection more reliable under realistic conditions.
Original Abstract
Mutation analysis has long been used in classical software testing and has recently been adopted for assessing the robustness of quantum software testing techniques. However, existing studies assume ideal, noiseless execution, overlooking the impact of quantum hardware noise. In this paper, we present an empirical study of noise-aware mutation analysis for quantum programs. We analyze how noise affects mutant detection using 41 quantum programs, executed on noiseless and noisy simulators emulating three IBM devices with different noise profiles. We compare several distance metrics and thresholding strategies to evaluate mutant detection under realistic noise. Our results show that noise significantly alters the behavioral distance between programs and mutants, making equivalent mutants harder to distinguish from real faults. Density-matrix metrics achieve the best discrimination, with misclassification rates up to 16.77%, but are not accessible on real hardware. Among practical alternatives, output-distribution metrics reach up to 73.03% accuracy and 74.89% F1-score. Noise-specific thresholds further improve detection compared to noiseless thresholds. We also find that noise effects correlate more with algorithm and circuit characteristics than with mutation types. Overall, our results highlight the need to adapt mutation analysis, and more generally quantum program comparison, to the noise profiles of target quantum devices.