ArXiv TLDR

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

2604.13725

Jia Feng, Zhanyue Qin, Cuiyun Gao, Ruiqi Wang, Chaozheng Wang + 2 more

cs.SE

TLDR

This paper empirically investigates context compression for repository-level code tasks, finding that it can both improve task performance and reduce inference cost and latency.

Key contributions

  • First systematic study of context compression for repository-level code intelligence.
  • Evaluates 8 methods across discrete, continuous, and visual token paradigms.
  • Continuous latent vector methods surpass full-context performance by up to 28.3% in BLEU at 4x compression, suggesting they filter noise rather than merely truncate.
  • All compression paradigms reduce inference cost; visual and text-based compression cut end-to-end latency by up to 50% at high compression ratios.

Why it matters

LLMs struggle with long, noisy code contexts. This paper empirically validates context compression as a way to improve both performance and efficiency on repository-level tasks, and offers practical guidance for choosing among compression paradigms.

Original Abstract

Repository-level code intelligence tasks require large language models (LLMs) to process long, multi-file contexts. Such inputs introduce three challenges: crucial context can be obscured by noise or truncated by limited windows, and inference latency increases. Context compression mitigates these risks by condensing inputs. While studied in NLP, its applicability to code tasks remains largely unexplored. We present the first systematic empirical study of context compression for repository-level code intelligence, organizing eight methods into three paradigms: discrete token sequences, continuous latent vectors, and visual tokens. We evaluate them on code completion and generation, measuring performance and efficiency. Results show context compression is effective: at 4x compression, continuous latent vector methods surpass full-context performance by up to 28.3% in BLEU score, indicating they filter noise rather than just truncating. On efficiency, all paradigms reduce inference cost. Both visual and text-based compression achieve up to 50% reduction in end-to-end latency at high ratios, approaching the cost of inference without repository context. These findings establish context compression as a viable approach and provide guidance for paradigm selection.
