SkillDroid: Compile Once, Reuse Forever
Qijia Chen, Andrea Bellucci, Zhida Sun, Giulio Jacucci
TLDR
SkillDroid compiles LLM-guided GUI trajectories into reusable skills, replaying learned tasks without per-step LLM calls to boost success rates and efficiency.
Key contributions
- Compiles LLM-guided GUI trajectories into reusable, parameterized skill templates.
- Replays stored skills without LLM calls, using a matching cascade for instruction routing.
- Features a failure-learning layer that triggers recompilation when skill reliability degrades.
- Achieves an 85.3% success rate (23 percentage points above baseline) with 49% fewer LLM calls, and improves over time.
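The first two contributions can be illustrated with a minimal sketch of a compiled skill: a sequence of UI actions whose elements are found via weighted locators and whose inputs come from typed parameter slots, replayed with no LLM in the loop. All class and field names here are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Locator:
    # Several ways to find the same element, each with a reliability weight,
    # e.g. [("resource_id", "com.app:id/search", 0.9), ("text", "Search", 0.6)].
    strategies: list

    def resolve(self, screen: dict):
        # Try strategies in descending weight order; return the first match.
        for kind, value, _w in sorted(self.strategies, key=lambda s: -s[2]):
            for node in screen.get("nodes", []):
                if node.get(kind) == value:
                    return node
        return None

@dataclass
class Action:
    kind: str            # "tap" | "type" | "scroll"
    locator: Locator
    slot: str = None     # name of a typed parameter slot, e.g. "query"

@dataclass
class SkillTemplate:
    app: str
    pattern: str                  # regex that extracts slot values from the instruction
    actions: list = field(default_factory=list)

    def replay(self, slots: dict, screen_provider):
        # Execute each stored action, filling parameter slots; no LLM call.
        trace = []
        for a in self.actions:
            node = a.locator.resolve(screen_provider())
            if node is None:
                return False, trace  # signal failure to the learning layer
            value = slots.get(a.slot) if a.slot else None
            trace.append((a.kind, node.get("text") or node.get("resource_id"), value))
        return True, trace
```

Replaying a skill then reduces to binding slot values (extracted from the instruction) and walking the stored action sequence, which is why replay can run at a multiple of full-LLM speed.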
Why it matters
Current LLM agents are inefficient and stateless, re-deriving already-solved tasks from scratch on every invocation. SkillDroid addresses this by introducing a learning and reuse mechanism, significantly improving reliability, speed, and resource efficiency for mobile GUI automation and making LLM agents more practical for real-world use.
Original Abstract
LLM-based mobile GUI agents treat every task invocation as an independent reasoning episode, requiring a full LLM inference call at each action step. This per-step dependence makes them stateless: a task completed successfully yesterday is re-derived from scratch today, with no improvement in reliability or speed. We present SkillDroid, a three-layer skill agent that compiles successful LLM-guided GUI trajectories into parameterized skill templates (sequences of UI actions with weighted element locators and typed parameter slots) and replays them on future invocations without any LLM calls. A matching cascade (regex patterns, embedding similarity, and app filtering) routes incoming instructions to stored skills, while a failure-learning layer triggers recompilation when skill reliability degrades. Over a 150-round longitudinal evaluation with systematic instruction variation and controlled perturbations, SkillDroid achieves an 85.3% success rate (23 percentage points above a stateless LLM baseline) while using 49% fewer LLM calls. The skill replay mechanism achieves a perfect 100% success rate across 79 replay rounds at 2.4 times the speed of full LLM execution. Most critically, the system improves with use: its success rate converges upward from 87% to 91%, while the baseline degrades from 80% to 44%.
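The matching cascade described in the abstract can be sketched as a three-stage router: a cheap app filter, then exact regex matching (which also binds parameter slots), then a semantic-similarity fallback. The embedding step is stubbed here with a trivial bag-of-words cosine; function names, the threshold, and the stage ordering are assumptions for illustration.

```python
import re
from collections import Counter

def bow_cosine(a: str, b: str) -> float:
    # Toy stand-in for an embedding model: cosine over word counts.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (sum(v * v for v in ca.values()) ** 0.5) * (sum(v * v for v in cb.values()) ** 0.5)
    return dot / norm if norm else 0.0

def route(instruction: str, current_app: str, skills: list, sim_threshold: float = 0.6):
    # App filtering: only consider skills compiled for the current app.
    candidates = [s for s in skills if s["app"] == current_app]
    # Regex stage: an exact pattern match wins and binds parameter slots.
    for s in candidates:
        m = re.fullmatch(s["pattern"], instruction, re.IGNORECASE)
        if m:
            return s, m.groupdict()
    # Similarity stage: fall back to each skill's canonical instruction.
    best = max(candidates, key=lambda s: bow_cosine(instruction, s["canonical"]), default=None)
    if best and bow_cosine(instruction, best["canonical"]) >= sim_threshold:
        return best, {}
    return None, {}  # no skill matched: escalate to full LLM execution
```

Instructions that fall through every stage are handled by the stateless LLM path, whose successful trajectory can then be compiled into a new skill.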