JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models
Alexandra Dragomir, Ioana Pintilie, Antonio Barbalau, Marius Dragoi, Florin Brad, and 4 more authors
TLDR
JumpLoRA introduces sparse LoRA adapters gated by JumpReLU for continual learning in LLMs, mitigating catastrophic forgetting and outperforming the state-of-the-art method ELLA.
Key contributions
- Proposes JumpLoRA, a novel framework for sparse LoRA adapters in LLMs using JumpReLU gating (a minimal sketch of the gate follows this list).
- Achieves dynamic parameter isolation to effectively prevent catastrophic forgetting and task interference.
- Highly modular and compatible with existing LoRA-based continual learning methods such as IncLoRA.
- Significantly boosts IncLoRA's performance and outperforms the state-of-the-art continual learning method ELLA.
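JumpReLU itself is a thresholded activation: values above a threshold pass through unchanged and everything below it is zeroed, which is what induces sparsity in the gated adapter. Below is a minimal PyTorch sketch of the forward pass; the function name and tensor-valued threshold are illustrative assumptions, not the paper's implementation.

```python
import torch

def jump_relu(x: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """JumpReLU: x * Heaviside(x - theta).
    Entries of x strictly above the threshold theta pass through unchanged;
    all other entries are zeroed, giving sparse activations."""
    return x * (x > theta).to(x.dtype)
```

The hard threshold provides no gradient signal to theta, so learning the threshold in practice typically relies on a straight-through-style estimator; the sketch above only covers the forward pass.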
Why it matters
Continual learning in LLMs is crucial but suffers from catastrophic forgetting. JumpLoRA addresses this by dynamically isolating the parameters each task updates, strengthening existing LoRA-based approaches such as IncLoRA and outperforming the leading state-of-the-art method, ELLA.
Original Abstract
Adapter-based methods have become a cost-effective approach to continual learning (CL) for Large Language Models (LLMs), by sequentially learning a low-rank update matrix for each task. To mitigate catastrophic forgetting, state-of-the-art approaches impose constraints on new adapters with respect to the previous ones, by targeting either subspace or coordinate-wise interference. In this paper, we propose JumpLoRA, a novel framework to adaptively induce sparsity in the Low-Rank Adaptation (LoRA) blocks through the use of JumpReLU gating. The method achieves dynamic parameter isolation, which helps prevent task interference. We demonstrate that our method is highly modular and compatible with LoRA-based CL approaches. Specifically, it significantly boosts the performance of IncLoRA and outperforms the leading state-of-the-art CL method, ELLA.
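To make the mechanism concrete, here is a hedged sketch of how a JumpReLU gate could sit inside a LoRA block: the low-rank activations are thresholded before the up-projection, so only a subset of rank directions contributes to each update. The class name, gate placement, and hyperparameters are assumptions for illustration, not the JumpLoRA reference implementation.

```python
import torch
import torch.nn as nn

class JumpGatedLoRALinear(nn.Module):
    """Frozen base linear layer plus a low-rank (LoRA) update whose
    intermediate activations are gated by a JumpReLU-style threshold.
    Illustrative sketch only, not the paper's released code."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                   # pretrained weights stay frozen
        self.lora_A = nn.Linear(base.in_features, rank, bias=False)
        self.lora_B = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)            # adapter starts as a zero update
        self.theta = nn.Parameter(torch.zeros(rank))  # per-direction gate thresholds
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.lora_A(x)                            # project into the rank-r subspace
        h = h * (h > self.theta).to(h.dtype)          # JumpReLU gate: drop sub-threshold directions
        return self.base(x) + self.scaling * self.lora_B(h)
```

Because the gate zeroes different rank directions for different inputs, each task effectively writes to a sparse subset of the adapter, which is one plausible reading of the dynamic parameter isolation described in the abstract.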