On the Security of Research Artifacts
TLDR
This paper shows that many research artifacts contain security vulnerabilities and proposes SAFE, a framework to assess and mitigate these risks.
Key contributions
- Analyzed 509 research artifacts, finding widespread insecure code patterns and potential attack vectors.
- Proposed a taxonomy for context-aware security assessment of research artifacts.
- Introduced SAFE, an autonomous framework for security-aware artifact evaluation, achieving 84.8% accuracy.
- Showed that 41.6% of prevalent findings in artifacts pose security concerns under practical usage.
Why it matters
Research artifacts are widely shared for reproducibility, yet artifact evaluation rarely considers security, creating opportunities for misuse. This paper highlights these risks and offers SAFE, a framework that integrates security assessment into artifact evaluation, promoting safer and more responsible research sharing.
Original Abstract
Research artifacts are widely shared to support reproducibility, and artifact evaluation (AE) has become common at many leading conferences. However, AE mainly checks whether artifacts work as claimed and can be reproduced. It largely overlooks potential security risks. Since these artifacts are publicly released and reused, they may unintentionally create opportunities for misuse and raise concerns about safe and responsible sharing. We study 509 research artifacts from top-tier security venues and find that many contain insecure code patterns that may introduce potential attack vectors. We propose a taxonomy for context-aware security assessment to enable structured analysis of such risks. We perform static analysis and examine the resulting findings, filtering false positives and identifying real security risks. Our analysis shows that 41.60% of the prevalent findings may pose security concerns under practical usage. To support scalable analysis, we introduce SAFE (Security-Aware Framework for Artifact Evaluation), a first step toward an autonomous framework that analyzes tool-reported findings by considering code semantics, execution context, and practical exploitability. SAFE achieves 84.80% accuracy and 84.63% F1-score in distinguishing security and non-security risks. Overall, our results show that security is also important in AE for promoting safe and responsible research sharing. The source code is available at: https://github.com/nanda-rani/SAFE
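The pipeline described in the abstract (run static analysis, then classify tool-reported findings by code semantics, execution context, and practical exploitability) can be illustrated with a minimal sketch. This is not SAFE's actual implementation: the `Finding` fields, rule IDs, and the `triage` heuristic below are hypothetical stand-ins for the context signals the paper describes.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    rule_id: str       # hypothetical static-analyzer rule identifier
    file_path: str
    reachable: bool    # is the flagged code on a practically executed path?
    untrusted_input: bool  # does the flagged code handle external input?

def triage(finding: Finding) -> str:
    """Toy context-aware triage: label a tool-reported finding as a
    practical 'security' risk or a 'non-security' concern using two
    simple signals (reachability and exposure to untrusted input)."""
    if not finding.reachable:
        # e.g. demo scripts or test fixtures never run in deployment
        return "non-security"
    if finding.untrusted_input:
        # reachable AND attacker-influenced: a practical attack vector
        return "security"
    return "non-security"

findings = [
    Finding("shell-injection", "run_experiment.py",
            reachable=True, untrusted_input=True),
    Finding("hardcoded-password", "tests/fixtures.py",
            reachable=False, untrusted_input=False),
]
labels = [triage(f) for f in findings]
print(labels)  # ['security', 'non-security']
```

SAFE itself reasons over richer context (code semantics and exploitability, per the abstract); this sketch only shows why the same raw finding can be a real risk in one artifact and a false positive in another.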