Efficient Compositional Multi-tasking for On-device Large Language Models
arXiv cs.CL / 3/13/2026
Key Points
- The paper investigates on-device LLMs handling compositional multi-tasking, where a single input requires simultaneous execution of multiple tasks (e.g., translation plus summarization).
- It introduces a four-task compositional benchmark to evaluate on-device multi-tasking performance.
- It presents Learnable Calibration, a resource-efficient method designed to deliver strong compositional performance on low-compute devices (a rough sketch of the idea follows this list).
- The work establishes a foundation for expanding LLM capabilities in real-world, resource-constrained, multi-task scenarios.
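The digest does not explain how Learnable Calibration works internally, but a minimal sketch can convey the general pattern it belongs to: keep the base model and the per-task adapters frozen, and train only a handful of calibration parameters for the composed task. The PyTorch code below is an illustrative assumption, not the paper's implementation; the class `CalibratedAdapterLinear`, the per-adapter `scale`, and the per-channel `shift` are hypothetical names and the merging rule is a guess at the kind of lightweight scheme the paper targets.

```python
# Hypothetical sketch: two frozen task-specific low-rank adapters (e.g. one for
# summarization, one for translation) combined via a tiny learnable calibration
# term. Only the calibration parameters are trained for the composed task.
import torch
import torch.nn as nn


class CalibratedAdapterLinear(nn.Module):
    """Linear layer whose output adds two frozen low-rank adapters plus a
    small learnable calibration (per-adapter scale and per-channel shift)."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        d_out, d_in = base.weight.shape
        self.base = base  # frozen pretrained projection
        # Two frozen low-rank adapters, one per single task.
        self.adapters = nn.ModuleList(
            nn.Sequential(nn.Linear(d_in, rank, bias=False),
                          nn.Linear(rank, d_out, bias=False))
            for _ in range(2)
        )
        for p in list(self.base.parameters()) + list(self.adapters.parameters()):
            p.requires_grad = False
        # Learnable calibration: only (2 + d_out) trainable parameters here.
        self.scale = nn.Parameter(torch.ones(2))
        self.shift = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.base(x)
        for s, adapter in zip(self.scale, self.adapters):
            y = y + s * adapter(x)
        return y + self.shift


if __name__ == "__main__":
    layer = CalibratedAdapterLinear(nn.Linear(64, 64))
    out = layer(torch.randn(4, 64))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(out.shape, "trainable params:", trainable)  # -> torch.Size([4, 64]) 66
```

The point of such a scheme on-device is the parameter count: the frozen adapters can be shipped once per task, while composing tasks requires storing and training only the few calibration parameters per layer.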
