Causal Drawbridges: Characterizing Gradient Blocking of Syntactic Islands in Transformer LMs
arXiv cs.CL / 4/16/2026
Opinion · Ideas & Deep Analysis · Models & Research
Key Points
- Using causal interventions in Transformer language models, the paper studies how syntactic “islands” block extraction, showing that models reproduce graded human acceptability judgments that vary with lexical content (a minimal activation-patching sketch follows this list).
- It finds that extraction from coordination islands recruits the same filler-gap mechanisms as canonical wh-dependencies, but that these mechanisms are selectively inhibited to different degrees.
- By isolating functionally relevant subspaces across Transformer blocks, attention modules, and MLPs, the authors provide mechanistic evidence linking representational structure to syntactic constraints.
- Projecting a large corpus of unrelated text onto the causal subspaces yields a new hypothesis: the conjunction “and” is encoded differently in extractable versus non-extractable constructions (relational vs. purely conjunctive uses); see the subspace-projection sketch below.
- Overall, the work demonstrates how mechanistic interpretability techniques can generate testable linguistic hypotheses about representation and processing in Transformer LMs.
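The paper's exact interventions are not recoverable from this summary, but the general recipe behind the first two points is standard activation patching: run the model on a "source" sentence, cache a block's activations, and overwrite the corresponding activations during a "base" run. Below is a minimal sketch assuming a HuggingFace GPT-2 model; the sentence pair, layer index, and patched token position are illustrative placeholders, not the paper's stimuli or method details.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER = 6  # hypothetical choice of Transformer block to patch

def hidden_states(text):
    """Run the model once and return its inputs and per-layer hidden states."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return ids, out.hidden_states

# "Source" run: a licit filler-gap dependency; "base" run: an island fragment.
_, src_h = hidden_states("What did you say that she bought?")
base_text = "What did you buy apples and"

def patch_hook(module, inputs, output):
    # Overwrite the block's output at the final token position with the
    # corresponding activation cached from the source run.
    h = output[0]
    h[:, -1, :] = src_h[LAYER + 1][:, -1, :]
    return (h,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    patched = model(**tok(base_text, return_tensors="pt"))
handle.remove()

# Comparing patched vs. unpatched next-token distributions estimates the
# causal contribution of that block to the island judgment.
print(patched.logits[0, -1].softmax(-1).topk(5))
```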
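The corpus-projection step in the fourth point can be sketched similarly. The code below assumes an orthonormal basis Q for a causal subspace has already been found by some intervention-based search (the paper's own procedure is not reproduced here); the rank, the corpus activations, and the norm-based scoring rule are all placeholder assumptions.

```python
import torch

hidden_dim, k = 768, 8  # model width; assumed subspace rank
# Stand-in orthonormal basis; in practice Q would come from the search step.
Q = torch.linalg.qr(torch.randn(hidden_dim, k)).Q

def subspace_norm(acts: torch.Tensor) -> torch.Tensor:
    """Score each token by the norm of its activation inside span(Q).

    acts: (num_tokens, hidden_dim) residual-stream activations.
    High scores flag tokens, e.g. occurrences of "and", whose
    representations load heavily on the causal subspace.
    """
    coords = acts @ Q  # (num_tokens, k) subspace coordinates
    return coords.norm(dim=-1)

# Placeholder corpus activations; a real run would cache these from the LM.
acts = torch.randn(1000, hidden_dim)
scores = subspace_norm(acts)
top_tokens = scores.topk(10).indices
```

Ranking occurrences of "and" by this score and reading off the top hits is one inexpensive way a relational-vs-conjunctive hypothesis like the paper's could surface.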