mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code
arXiv cs.LG / 4/24/2026
Key Points
- SemEval-2026 Task 13 focuses on detecting machine-generated code across multiple programming languages, including both binary detection and source attribution.
- The task includes specialized subtasks such as detecting which generator LLM family produced the code, identifying code co-generated by humans and machines, and spotting adversarial edits meant to hide provenance.
- The authors adapted the existing mdok approach, originally built for machine-generated text detection, to machine-generated code, testing multiple base models better suited to code understanding.
- Their submitted systems were competitive on all three subtasks, though the gap to the top-ranked teams leaves clear room for improvement.
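
To make the task framing above concrete, here is a minimal sketch of how the three subtasks could each be treated as a classification problem with its own label set, which is the form a fine-tuned classifier would consume. The label names, the `Example` type, and the helper functions are all hypothetical placeholders for illustration; the official task labels and the authors' actual data pipeline are not given in this summary and may well differ. The fine-tuning step itself is not shown.

```python
from dataclasses import dataclass

# Hypothetical label sets, one per subtask. The real SemEval-2026 Task 13
# labels are not stated in this summary; these names are placeholders.
SUBTASK_LABELS = {
    "binary": ["human", "machine"],
    "attribution": ["human", "family_a", "family_b"],
    "hybrid": ["human", "machine", "co_generated", "adversarial"],
}


@dataclass(frozen=True)
class Example:
    """One code snippet with its programming language and gold label."""
    code: str
    language: str
    label: str


def num_labels(subtask: str) -> int:
    """Size of the classification head a model would need for this subtask."""
    return len(SUBTASK_LABELS[subtask])


def label_to_id(subtask: str) -> dict[str, int]:
    """Map each label name of a subtask to an integer class id."""
    return {name: i for i, name in enumerate(SUBTASK_LABELS[subtask])}


def encode(example: Example, subtask: str) -> tuple[str, int]:
    """Turn an example into the (text, class id) pair a classifier trains on."""
    mapping = label_to_id(subtask)
    if example.label not in mapping:
        raise ValueError(f"label {example.label!r} is not valid for {subtask!r}")
    return example.code, mapping[example.label]
```

Under this framing, one base model can be fine-tuned separately per subtask, with only the number of output classes changing between runs.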