Neural architectures for resolving references in program code
Gergő Szalay, Gergely Zsolt Kovács, Sándor Teleki, Balázs Pintér, Tibor Gregorics
TLDR
New neural sequence-to-sequence architectures resolve code references effectively, handling examples ten times longer than the best baseline and cutting the error rate in real-world switch-statement decompilation by 42%.
Key contributions
- Abstracts reference rewriting into direct/indirect indexing by permutation.
- Introduces novel sequence-to-sequence architectures for these indexing tasks.
- Outperforms baselines in robustness and scalability, handling 10x longer examples.
- Reduces error rate by 42% in real-world switch statement decompilation.
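To make the abstraction concrete, here is a minimal sketch of one plausible reading of direct and indirect indexing by permutation: direct indexing gathers values from the positions the permutation names, and indirect indexing scatters values to those positions (its inverse). The paper's exact formalization may differ; this is illustrative only.

```python
def direct_index(values, perm):
    # Direct indexing (gather): out[i] = values[perm[i]]
    return [values[p] for p in perm]

def indirect_index(values, perm):
    # Indirect indexing (scatter): out[perm[i]] = values[i],
    # the inverse operation of direct indexing.
    out = [None] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out

values = ["a", "b", "c", "d"]
perm = [2, 0, 3, 1]
print(direct_index(values, perm))    # ['c', 'a', 'd', 'b']
print(indirect_index(values, perm))  # ['b', 'd', 'a', 'c']
```

Note that composing the two with the same permutation recovers the original sequence, which is what makes them a clean pair of inverse benchmark tasks.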
Why it matters
This paper tackles a fundamental challenge in programming languages: resolving and rewriting references. Standard sequence-to-sequence models handle this poorly, while the proposed architectures deliver substantial gains in robustness and scalability. That matters directly for tools such as decompilers, which depend on accurate reference resolution.
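The decompilation connection comes from how compilers commonly lower a dense switch statement: the switch value indexes into a jump table of branch targets, and a decompiler must resolve those indexed references back into case labels. A hedged Python sketch of the dispatch pattern (real binaries store machine addresses, not function objects):

```python
# Illustrative jump-table dispatch: the indexing pattern a decompiler
# must recognize when reconstructing a switch statement.
def handle_zero():
    return "case 0"

def handle_one():
    return "case 1"

def handle_default():
    return "default"

# The table maps switch values to handlers, much as compiled code
# maps them to branch targets.
jump_table = [handle_zero, handle_one]

def dispatch(x):
    # Bounds check, then an indexed indirect jump.
    if 0 <= x < len(jump_table):
        return jump_table[x]()
    return handle_default()

print(dispatch(1))  # case 1
print(dispatch(7))  # default
```

Recovering which case label each table slot corresponds to is precisely the indexing subtask the paper's extended model improves on.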
Original Abstract
Resolving and rewriting references is fundamental in programming languages. Motivated by a real-world decompilation task, we abstract reference rewriting into the problems of direct and indirect indexing by permutation. We create synthetic benchmarks for these tasks and show that well-known sequence-to-sequence machine learning architectures are struggling on these benchmarks. We introduce new sequence-to-sequence architectures for both problems. Our measurements show that our architectures outperform the baselines in both robustness and scalability: our models can handle examples that are ten times longer compared to the best baseline. We measure the impact of our architecture in the real-world task of decompiling switch statements, which has an indexing subtask. According to our measurements, the extended model decreases the error rate by 42%. Multiple ablation studies show that all components of our architectures are essential.