Learn-to-learn on Arbitrary Textual Conditioning: A Hypernetwork-Driven Meta-Gated LLM

arXiv cs.CL / 5/5/2026


Key Points

  • The paper proposes a “meta-gated” LLM that uses a hypernetwork to dynamically control the SwiGLU nonlinearity parameter (β) based on arbitrary textual conditions (formalized in the sketch after this list).
  • It aims to address limitations of conventional LLM conditioning, where corpus heterogeneity and small condition shifts can degrade performance.
  • By incorporating meta-learning signals into SwiGLU blocks, the method seeks adaptive behavior without the complexity and scalability issues often associated with standard meta-learning applied to LLMs.
  • Experiments across multiple condition types (task, domain, persona, style) show the approach outperforms fine-tuning and meta-learning baselines.
  • The authors report reasonable generalization to unseen tasks, condition types, or instructions and provide code at the linked GitHub repository.
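
For context, SwiGLU gates the transformer FFN with a Swish nonlinearity whose shape is set by a scalar β, which is fixed (typically 1) in standard architectures. The formulation below reconstructs the meta-gating idea from the abstract alone; the projection names W_g, W_v, W_o and the hypernetwork symbol h_φ are assumed notation, not taken from the paper.

```latex
% Standard SwiGLU FFN with a fixed Swish gate (beta is a constant):
\mathrm{Swish}_{\beta}(z) = z \cdot \sigma(\beta z), \qquad
\mathrm{SwiGLU}(x) = \left(\mathrm{Swish}_{\beta}(x W_g) \otimes x W_v\right) W_o

% Meta-gating as described in the abstract: a hypernetwork h_phi maps
% the condition text c to beta, so the FFN nonlinearity adapts per condition:
\beta(c) = h_{\phi}(c), \qquad
\mathrm{SwiGLU}_{c}(x) = \left(\mathrm{Swish}_{\beta(c)}(x W_g) \otimes x W_v\right) W_o
```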

Abstract

Conventional LLMs can suffer from corpus heterogeneity and subtle condition shifts. Fine-tuning risks catastrophic forgetting, while applying meta-learning to LLMs is limited by its complexity and poor scalability. In this paper, we activate the meta-signal of β within the SwiGLU blocks, yielding a meta-gating mechanism that adaptively adjusts the nonlinearity of the FFN. A hypernetwork dynamically produces β from textual conditions, providing meta-controllability over LLMs. Tested on condition types such as task, domain, persona, and style, our method outperforms fine-tuning and meta-learning baselines and generalizes reasonably to unseen tasks, condition types, and instructions. Our code is available at https://github.com/AaronJi/MeGan.
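
Below is a minimal PyTorch-style sketch of the mechanism the abstract describes: a hypernetwork maps a condition embedding to a per-block β that reshapes the Swish gate inside a SwiGLU FFN. The class name, the pooled condition embedding, the scalar-β choice, and the softplus parameterization are all illustrative assumptions, not drawn from the paper or its repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MetaGatedSwiGLU(nn.Module):
    """Sketch of a SwiGLU FFN whose Swish beta is produced by a
    hypernetwork from a condition embedding (assumed design)."""

    def __init__(self, d_model: int, d_ff: int, d_cond: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_ff, bias=False)  # gate branch
        self.w_val = nn.Linear(d_model, d_ff, bias=False)   # value branch
        self.w_out = nn.Linear(d_ff, d_model, bias=False)   # output projection
        # Hypernetwork: condition embedding -> scalar beta for this block.
        self.hyper = nn.Sequential(
            nn.Linear(d_cond, d_cond), nn.ReLU(), nn.Linear(d_cond, 1)
        )

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); cond: (batch, d_cond), e.g. a pooled
        # encoding of the condition text (task/domain/persona/style).
        # softplus keeps beta positive; beta = 1 recovers the standard
        # SiLU gate, so conditioning deforms the nonlinearity smoothly.
        beta = F.softplus(self.hyper(cond)).unsqueeze(1)    # (batch, 1, 1)
        g = self.w_gate(x)
        swish = g * torch.sigmoid(beta * g)                 # Swish_beta gate
        return self.w_out(swish * self.w_val(x))

# Usage sketch: one condition embedding modulates every token's FFN gate.
ffn = MetaGatedSwiGLU(d_model=512, d_ff=2048, d_cond=256)
x = torch.randn(2, 16, 512)     # (batch, seq, d_model)
cond = torch.randn(2, 256)      # pooled condition-text embedding
y = ffn(x, cond)                # (2, 16, 512)
```

Because only β is generated per condition, the hypernetwork's output is tiny compared with generating full weight matrices, which is consistent with the paper's claim of avoiding the scalability problems of standard meta-learning on LLMs.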