Continual Alignment of Large Language Models with Entropic Gradient Refraction
Proposed Entropic Gradient Refraction (EGR), a continual alignment strategy that treats the parameter space as a heterogeneous medium and the optimization path as a light ray propagating through it, with the medium's density defined by the model's confidence on past tasks. Gradient updates are refracted away from directions encoding previously learned structural knowledge, balancing stability and plasticity during LLM continual learning. Improved performance over baselines by 170% (BLEU-4), 5.4% (ROUGE-L), and 8.2% (LLM-as-a-Judge).
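The core mechanism can be illustrated with a minimal sketch. The actual EGR formulation is not reproduced here, so the following is a hypothetical interpretation: predictive entropy on past tasks is mapped to a per-parameter "density" (low entropy, i.e. high confidence, means a dense medium), and new-task gradient components are attenuated in dense regions, loosely analogous to a ray slowing and bending as it enters a denser medium. The function names `entropy_density` and `refract_gradient` and the specific attenuation rule are illustrative assumptions, not the paper's method.

```python
import numpy as np

def entropy_density(probs, eps=1e-12):
    """Hypothetical density map: normalized predictive entropy on past
    tasks is inverted, so confident (low-entropy) parameters sit in a
    'dense' medium and receive high density values >= 1."""
    probs = np.clip(probs, eps, 1.0)
    h = -np.sum(probs * np.log(probs), axis=-1)   # Shannon entropy
    h_max = np.log(probs.shape[-1])               # max entropy (uniform)
    return 1.0 / np.maximum(h / h_max, eps)       # dense where confident

def refract_gradient(grad, density, base_density=1.0):
    """Hypothetical refraction rule: scale each gradient component by a
    Snell-like ratio of densities, so updates shrink where the medium is
    dense (i.e. where past-task knowledge is concentrated)."""
    ratio = base_density / np.maximum(density, base_density)
    return grad * ratio
```

In this sketch a uniform past-task distribution yields density 1 (no attenuation), while a near-deterministic one yields a large density that nearly freezes the corresponding parameter, giving the stability/plasticity trade-off the summary describes.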