As AI tools become deeply embedded in how we write, code, and think, universities are grappling with a deceptively hard question: how do you regulate something that's invisible by design?
The Institute of Statistics at LMU Munich recently published guidelines for AI tool usage in academic work that I think strike a remarkably pragmatic balance. Rather than banning AI or pretending it doesn't exist, they treat it as what it is — a tool that needs the same transparency we already expect for other tools and sources. I want to walk through the key ideas here, because I believe they're relevant well beyond academia.
The Core Philosophy: Responsibility and Transparency
The guidelines rest on two pillars that are hard to argue with.
Responsibility. Students bear full responsibility for every word they submit, regardless of which tools helped produce it. If you can't explain it, you shouldn't submit it. This applies to prose and program code equally.
Transparency. AI tool usage must be documented. This isn't some new bureaucratic burden — it's the natural extension of existing academic practice. We already cite sources, disclose co-author contributions, and list our tools. AI assistance is just the next entry in that list.
What Documentation Actually Looks Like
Every academic work must include a dedicated "AI Tools" section of roughly 0.25–1 page. Think of it as the "author contributions" statement you'd find in a multi-author dissertation, but for your AI interactions. The section should cover three things:
- Which AI tools were used and how they were generally applied
- Which sections of the work involved AI assistance
- How extensively AI-generated content was revised
Additionally, when an AI tool contributed an essential line of thought — not just phrasing, but actual reasoning — that contribution should be flagged in a footnote, much as you would cite a human source.
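The three required elements map naturally onto a structured record. As a minimal sketch — all class, field, and function names here are my own invention, not anything the guidelines prescribe — you could capture each documented use and render the "AI Tools" section from it:

```python
from dataclasses import dataclass

@dataclass
class AIToolUse:
    """One documented use of an AI tool (field names are illustrative)."""
    tool: str        # which AI tool was used
    purpose: str     # how it was generally applied
    sections: list   # which parts of the work involved it
    revision: str    # how extensively the output was revised

def render_ai_tools_section(uses):
    """Render a plain-text 'AI Tools' section from the documented uses."""
    lines = ["AI Tools", ""]
    for u in uses:
        lines.append(
            f"- {u.tool}: used for {u.purpose} in {', '.join(u.sections)}; "
            f"output was {u.revision}."
        )
    return "\n".join(lines)

uses = [
    AIToolUse(
        tool="ChatGPT",
        purpose="stylistic polishing",
        sections=["introduction"],
        revision="manually reviewed and reformulated, not adopted verbatim",
    )
]
print(render_ai_tools_section(uses))
```

The point of the exercise is that each entry forces you to answer all three questions at once — a vague statement like "I used AI" simply doesn't fit the schema.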
The Appropriate vs. Inappropriate Divide
This is where it gets interesting. The guidelines don't draw a binary "AI allowed / AI forbidden" line. Instead, they define a spectrum.
Generally appropriate:
- Linguistic and stylistic corrections
- Translation assistance
- Topic exploration and literature discovery (AI as a tutor)
- Structuring support
- Verifying algebraic transformations or integral solutions
- Programming support
- Creating graphics and diagrams
Generally inappropriate:
- Direct adoption of AI-generated text or code without genuine understanding — especially for content-central parts like core theoretical sections, literature reviews, or key algorithm implementations
- Using AI for the main data analysis or interpretation
- Any undocumented AI usage
The nuance matters here. Using an LLM to scaffold your code and then refactoring it with understanding? Fine. Pasting in a prompt, copying out the result, and submitting it without being able to explain what it does? That's the line.
And there's a deliberate escape valve: students are encouraged to discuss the boundaries with their supervisors, because what counts as "appropriate" depends heavily on context.
Assessment: Quality Over Origin
One of the more refreshing aspects of these guidelines is the explicit statement that AI usage documented in accordance with the rules does not lead to a grade penalty. The quality of the work remains the primary criterion.
That said, there's a balancing mechanism: the oral defense or examination carries increased weight. You need to be able to explain and defend every aspect of your work in detail. This elegantly solves the verification problem — if you can't walk through your own code or reasoning in person, the documentation won't save you.
If AI usage goes undocumented, consequences range from grade reductions to formal deception charges.
Practical Documentation Examples
The guidelines include several example statements that are worth reading for the tone they set. They all follow a consistent pattern: "I wrote all parts of this work myself. Additionally, I used [tool] for [purpose]. The output was [reviewed/adapted/not adopted verbatim]."
A few paraphrased examples of what good documentation looks like:
- "I used ChatGPT to refine individual sentences in the introduction. Suggestions were manually reviewed, adapted, and reformulated — not adopted verbatim."
- "I used GitHub Copilot to check and optimize my programming scripts for XY. Suggestions for improving runtime and memory usage were comprehended, tested, and adopted where appropriate. All adoptions are documented in code comments."
- "I used Claude Sonnet 4 with web search to discover relevant sources and have theoretical concepts explained. Based on this, I consulted the primary literature directly and created my own summaries."
Notice the pattern: specific tool, specific use case, specific description of how the output was handled. No vague hand-waving.
A Taxonomy Worth Stealing
The guidelines also include detailed taxonomies for different categories of AI usage. I think these are useful reference material for anyone documenting AI-assisted work, not just students.
In writing: grammar checking, citation management, plagiarism detection, formatting, style improvement, paraphrasing, translation, literature review drafting, source summarization, content expansion, section composition, and simulated peer review.
In programming: inline completion, prompt-to-code generation, project scaffolding, code explanation, documentation generation, debugging, test generation, refactoring, performance optimization, API usage examples, dependency management, AI pair programming, and simulated code review.
In mathematics: symbolic manipulation, step-by-step tutorials, visualization, formula translation, conjecture generation, proof sketch drafting, formal proof synthesis, counterexample search, proof verification, and auto-formalization of natural language into formal logic.
These categories aren't exhaustive, and they're explicitly not all automatically "appropriate" — they're a vocabulary for describing what you did.
Why This Matters Beyond Academia
I spend a lot of my time building evaluation and observability infrastructure for LLM workflows. One thing that's become clear is that the documentation problem isn't unique to universities. Any team using LLMs in production faces the same question: how do we track what the model contributed vs. what a human decided?
The LMU approach — structured documentation, clear responsibility, quality-first assessment, and verification through explanation — maps surprisingly well onto engineering practices. Swap "oral defense" for "code review where you explain your PR," and "AI Tools section" for "commit messages and PR descriptions that disclose AI assistance," and you're most of the way there.
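To make the engineering analogy concrete, one lightweight option is a git-trailer convention for AI disclosure — the same mechanism as the standard `Co-authored-by:` trailer. The `AI-Assisted:` trailer name below is my own hypothetical convention, not an established standard; a sketch of a lint check for it:

```python
def has_ai_disclosure(commit_message: str) -> bool:
    """Return True if the commit message carries an AI-disclosure trailer.

    'AI-Assisted:' is a hypothetical trailer name, following the same
    git trailer convention as 'Co-authored-by:'.
    """
    for line in commit_message.splitlines():
        if line.startswith("AI-Assisted:"):
            return True
    return False

msg = (
    "Refactor retry logic in the ingestion worker\n"
    "\n"
    "AI-Assisted: GitHub Copilot suggested the backoff helper; "
    "reviewed and tested manually.\n"
)
print(has_ai_disclosure(msg))  # → True
```

A check like this could run in CI the way commit-message linters already do, turning "any undocumented AI usage is inappropriate" from a policy statement into an enforceable default.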
The key insight is that transparency isn't about restricting AI usage. It's about maintaining accountability. And that's a principle that scales from a bachelor's thesis to a production ML pipeline.
The full original guidelines (in English) are available from the Institute of Statistics at LMU Munich. If your institution is working on similar policies, they're worth reading in full — they're one of the more thoughtful takes I've seen on this topic.
