Cryptography & Security
Research on AI security, adversarial attacks, privacy, and cryptographic methods.
cs.CR · 505 papers
Security Incentivization: An Empirical Study of how Micropayments Impact Code Security
This empirical study shows that team-level micropayment incentives tied to automated security metrics significantly improve code security in development teams.
TextSeal: A Localized LLM Watermark for Provenance & Distillation Protection
TextSeal is a new LLM watermark using dual-key generation and multi-region localization for robust, distortion-free detection and distillation protection.
Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries
This paper analyzes attacks on agentic AI governance from compromised centralized providers and proposes Byzantine-resilient, monitoring, and auditing solutions.
Reconstruction of Personally Identifiable Information from Supervised Finetuned Models
This paper reveals that PII can be reconstructed from supervised finetuned LLMs, proposing COVA to enhance reconstruction under prefix attacks.
No More, No Less: Task Alignment in Terminal Agents
A new benchmark, TAB, reveals terminal agents struggle with selectively following relevant instructions while ignoring distractors, highlighting a gap in task alignment.
ACTING: A Platform for Cyber Ranges Federation
ACTING is a platform that uses a new language (EDL-FG) for federated cyber ranges, enabling automated, multi-domain cyber defense training and evaluation.
PrivacySIM: Evaluating LLM Simulation of User Privacy Behavior
PrivacySIM evaluates LLMs' ability to simulate individual privacy decisions, finding persona conditioning improves accuracy but models still struggle.
The Deepfakes We Missed: We Built Detectors for a Threat That Didn't Arrive
Deepfake detection research is misaligned, focusing on public figure manipulation while real threats are NCII, voice scams, and emotional fraud.
SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces
SkillSafetyBench evaluates how reusable skills in LLM agents create new attack surfaces, revealing vulnerabilities beyond model-level alignment.
A microservices-based endpoint monitoring platform with predictive NLP models for real-time security and hate-speech risk alerting
A microservices platform uses predictive NLP to provide real-time security and hate-speech risk alerts from endpoint data, unifying monitoring and analytics.
AccLock: Unlocking Identity with Heartbeat Using In-Ear Accelerometers
AccLock passively authenticates users via unique in-ear heartbeat signals captured by accelerometers, overcoming limitations of prior systems.
Proteus: A Self-Evolving Red Team for Agent Skill Ecosystems
Proteus is a self-evolving red-team framework that uncovers adaptive leakage in LLM agent skills, showing current vetting underestimates risk.
IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection
IPI-proxy red-teams web-browsing AI agents against indirect prompt injection by intercepting and rewriting whitelisted HTTP responses to inject adversarial content.
Five Attacks on x402 Agentic Payment Protocol
This paper identifies five practical attacks on the x402 agentic payment protocol, revealing critical vulnerabilities in its design and implementation.
Behavioral Integrity Verification for AI Agent Skills
This paper introduces Behavioral Integrity Verification (BIV) to audit AI agent skills, finding widespread deviations and improving malicious skill detection.
Persona-Conditioned Adversarial Prompting: Multi-Identity Red-Teaming for Adversarial Discovery and Mitigation
PCAP uses diverse personas for red-teaming LLMs, significantly boosting attack success and generating robust defense data for improved safety.
HySecTwin: A Knowledge-Driven Digital Twin Framework Augmented with Hybrid Reasoning for Cyber-Physical Systems
HySecTwin is a knowledge-driven digital twin framework using hybrid reasoning for real-time, interpretable cybersecurity threat detection in Cyber-Physical Systems.
Cochise: A Reference Harness for Autonomous Penetration Testing
Cochise is a minimal Python reference harness for LLM-driven autonomous penetration testing, providing reusable infrastructure for research and comparison.
Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization
Conleash is a client-side middleware that uses a risk lattice and policy engine to provide consent-driven, boundary-scoped authorization for MCP tool invocations.
Natural Language based Specification and Verification
This paper explores using LLMs to generate and verify code implementations based on natural language specifications, showing promising preliminary results.