Protein Counterfactuals via Diffusion-Guided Latent Optimization
arXiv cs.LG / 3/12/2026
Key Points
- The paper introduces MCCOP, a framework that computes minimal, biologically plausible sequence edits to flip a protein model's prediction to a desired target state.
- It operates in a continuous joint sequence-structure latent space and uses a pretrained diffusion model as a manifold prior to balance validity, proximity, and plausibility.
- MCCOP is evaluated on GFP fluorescence rescue, thermodynamic stability enhancement, and E3 ligase activity recovery, producing sparser and more plausible counterfactuals than discrete or continuous baselines.
- The recovered mutations align with known biophysical mechanisms, which supports the framework's use for interpretability and hypothesis-driven protein design. The code is publicly available on GitHub.
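
To make the trade-off between validity, proximity, and plausibility in the key points above concrete, here is a minimal toy sketch of counterfactual search in a continuous latent space. It is not MCCOP's implementation: a logistic predictor stands in for the protein model, a unit Gaussian centered at `prior_mean` stands in for the pretrained diffusion prior, and every name and parameter (`counterfactual_search`, `lam_prox`, `lam_prior`, and so on) is a hypothetical chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(z, w, b):
    """Toy stand-in for the protein property predictor: P(target state | z)."""
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)


def counterfactual_search(z0, w, b, prior_mean,
                          lam_prox=0.1, lam_prior=0.1, lr=0.5, steps=200):
    """Gradient descent on a three-term objective (illustrative assumption):

    validity     : -log P(target | z)          push the prediction to the target
    proximity    : lam_prox * ||z - z0||^2     stay close to the original latent
    plausibility : lam_prior * 0.5 * ||z - prior_mean||^2
                   (Gaussian stand-in for a learned diffusion prior)
    """
    z = list(z0)
    for _ in range(steps):
        p = predict(z, w, b)
        z = [
            zi - lr * (
                -(1.0 - p) * wi                 # gradient of -log sigmoid(w.z + b)
                + lam_prox * 2.0 * (zi - z0i)   # gradient of the proximity term
                + lam_prior * (zi - mi)         # gradient of the Gaussian prior term
            )
            for zi, wi, z0i, mi in zip(z, w, z0, prior_mean)
        ]
    return z


# Usage: only the first two latent coordinates influence the toy predictor.
z0 = [-1.0, -1.0, 0.0, 0.0]
w = [1.0, 1.0, 0.0, 0.0]
z_cf = counterfactual_search(z0, w, b=0.0, prior_mean=[0.0] * 4)
print(predict(z0, w, 0.0), predict(z_cf, w, 0.0))
```

In this toy setting the search moves only the coordinates the predictor depends on and leaves the others untouched, which loosely mirrors the goal of sparse edits. In a real system the latent would be decoded back to a sequence, and the plausibility gradient would come from the diffusion model rather than a fixed Gaussian.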




