AI Navigate

Bioalignment: Measuring and Improving LLM Disposition Toward Biological Systems for AI Safety

arXiv cs.CL / 3/11/2026

Ideas & Deep Analysis · Models & Research

Key Points

  • Researchers identified systematic biases in LLMs that favor synthetic over biological solutions across key domains such as materials, energy, manufacturing, and algorithms.
  • A novel Bioalignment benchmark and evaluation framework based on 50 curated prompts was created to measure LLM disposition toward biological problem-solving.
  • Fine-tuning smaller open-weight LLMs (Llama 3.2-3B-Instruct and Qwen2.5-3B-Instruct) on a corpus of biological problem-solving articles significantly increased preference for biological approaches without harming general capabilities.
  • The work demonstrates that a modest amount of fine-tuning can effectively shift LLM biases, and suggests the approach could extend to larger models to promote bio-based AI solutions.
  • The authors have open-sourced their benchmark, corpus, code, and adapter weights to enable further research in bioalignment and AI safety.
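The fine-tuning the points above describe used QLoRA: a 4-bit quantized, frozen base model with small trainable low-rank adapters on top. A minimal configuration sketch using the Hugging Face `transformers`/`peft` stack; the rank, alpha, target modules, and dtypes here are illustrative assumptions, not the paper's settings:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters trained on top (the "LoRA" part); r, alpha, and
# target_modules are illustrative, not taken from the paper
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# These configs would be passed to AutoModelForCausalLM.from_pretrained(
# ..., quantization_config=bnb_config) and peft.get_peft_model(model,
# lora_config); only the adapter weights are trained and released.
```

Because only the adapter weights are trained, the released artifacts stay small and can be applied on top of the unmodified base checkpoints.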

Abstract

Large language models (LLMs) trained on internet-scale corpora can exhibit systematic biases that increase the probability of unwanted behavior. In this study, we examined potential biases toward synthetic vs. biological technological solutions across four domains (materials, energy, manufacturing, and algorithms). A sample of 5 frontier and 5 open-weight models was evaluated on 50 curated Bioalignment prompts using a Kelly criterion-inspired evaluation framework. By this metric, most models were not bioaligned: they exhibited biases in favor of synthetic (non-biological) solutions. We next examined whether fine-tuning could increase the preference of two open-weight models, Llama 3.2-3B-Instruct and Qwen2.5-3B-Instruct, for biology-based approaches. A curated corpus of ~22M tokens from 6,636 PMC articles emphasizing biological problem-solving was used to fine-tune Llama 3B, first with a mix of continued-pretraining and instruction-formatted data; the approach was then extended to Qwen 3B using instruction-formatted data only. We found that QLoRA fine-tuning significantly increased the scoring of biological solutions for both models without degrading general capabilities (Holm-Bonferroni-corrected p < 0.001 and p < 0.01, respectively). This suggests that even a small amount of fine-tuning can change how models weigh the relative value of biological and bioinspired vs. synthetic approaches. Although this work focused on small open-weight LLMs, the approach may extend to much larger models and could be used to develop models that favor bio-based approaches. We release the benchmark, corpus, code, and adapter weights.
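The abstract reports Holm-Bonferroni-corrected p-values. That step-down procedure is standard: sort the m p-values ascending and compare the i-th smallest against alpha / (m - i), stopping at the first failure. A minimal self-contained sketch (not the authors' code), returning adjusted p-values in the original order:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down correction for multiple comparisons.

    The i-th smallest p-value (0-indexed) is multiplied by (m - i),
    capped at 1.0, and forced to be non-decreasing along the sorted
    order so rejections are consistent. Returns adjusted p-values in
    the original input order; reject H_i when adjusted[i] <= alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        adj = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, adj)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted

# Example: with raw p-values [0.01, 0.04, 0.03] the adjusted values are
# approximately [0.03, 0.06, 0.06], so only the first survives at 0.05.
```

This matches `statsmodels.stats.multitest.multipletests(..., method="holm")`, but is written out here to make the correction the paper applies explicit.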