Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL
arXiv cs.CL / 4/23/2026
Key Points
- The paper proposes “zero-shot cross-programming-language transfer for code RL,” aiming to leverage the universality of coding skills across different programming languages when training data is limited for lower-resource languages.
- It finds that, for Llama-3.1, RL training on code generation in a source language can fail to improve, and sometimes even degrades, performance on other target languages.
- To enable more effective RL transfer, the authors hypothesize that RL requires a more generalizable SFT (supervised fine-tuning) initialization.
- They introduce “Parallel-SFT,” which mixes functionally equivalent code implementations written in multiple programming languages into the SFT data, and show that this improves subsequent RL generalization to unseen languages (a data-construction sketch follows this list).
- Internal representation analysis suggests Parallel-SFT produces a more functionality-centric latent space, clustering semantically equivalent programs across languages and thereby boosting transferability (see the analysis sketch below).
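
To make the data-mixing idea concrete, here is a minimal sketch of assembling a Parallel-SFT training stream. The record schema (`prompt`/`solutions` fields), the prompt template, and uniform shuffling are assumptions for illustration; the paper's exact recipe may differ.

```python
# Assemble an SFT mixture that pairs each problem with functionally
# equivalent solutions in several languages (illustrative schema, not the
# paper's exact format).
import random

def build_parallel_sft(problems, languages, seed=0):
    """Flatten problems into SFT pairs, covering every available language."""
    rng = random.Random(seed)
    examples = []
    for problem in problems:
        # Keep every language variant of the same problem so the model sees
        # functionally equivalent programs during SFT.
        for lang in languages:
            if lang in problem["solutions"]:
                examples.append({
                    "prompt": f"{problem['prompt']}\n\nWrite the solution in {lang}.",
                    "completion": problem["solutions"][lang],
                })
    rng.shuffle(examples)  # mix languages uniformly across the SFT stream
    return examples

problems = [{
    "prompt": "Return the sum of a list of integers.",
    "solutions": {
        "Python": "def solve(xs):\n    return sum(xs)",
        "Rust": "fn solve(xs: &[i64]) -> i64 { xs.iter().sum() }",
    },
}]
sft_data = build_parallel_sft(problems, ["Python", "Rust"])
```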
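
The clustering analysis in the last bullet can be sketched as follows, assuming program embeddings are taken from model hidden states (e.g., mean-pooled final-layer activations). The random stand-in vectors, identifiers, and the within-minus-across similarity score are illustrative, not the paper's exact metric.

```python
# Compare average cosine similarity within cross-language equivalence groups
# against similarity across unrelated programs. A positive gap indicates a
# functionality-centric latent space.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def clustering_score(embeddings, groups):
    """embeddings: {program_id: vector}; groups: list of id-lists, one per
    set of functionally equivalent programs in different languages."""
    member = {pid: gi for gi, g in enumerate(groups) for pid in g}
    within, across = [], []
    ids = list(embeddings)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            sim = cosine(embeddings[a], embeddings[b])
            (within if member[a] == member[b] else across).append(sim)
    return np.mean(within) - np.mean(across)  # > 0: semantic clustering

# Random vectors as placeholders for real hidden-state embeddings.
rng = np.random.default_rng(0)
embs = {pid: rng.normal(size=64) for pid in ["py_sum", "rs_sum", "py_max", "rs_max"]}
print(clustering_score(embs, [["py_sum", "rs_sum"], ["py_max", "rs_max"]]))
```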