Introduction: Where AI and Copyright Tend to Be Murky
As generative AI has become mainstream, questions like “Who owns the copyright in this image (or text)?” and “Was it lawful to use that data for training?” have grown more tangled. In short, AI and copyright questions rarely have a single black-and-white answer; the outcome often depends on the purpose, the means, and the degree of human involvement.
This article focuses on two frequently asked questions: 1) rights in generated outputs and 2) legal issues around training data. It summarizes practical perspectives and precautions for each. Note: Legal interpretation varies by country and region, and the final judgment depends on the specifics of each case. For important matters, consult a lawyer or other qualified professional.
1. The Basics of Copyright (Key Points)
Copyright is, roughly speaking, a set of rules that protects works of creative expression (text, images, music, video, programs, etc.). It protects the expression, not the underlying ideas.
- Works of authorship: Creative expression
- Author: Generally a natural person; corporate authorship is an exception
- Rights: Reproduction, adaptation (derivative works), public transmission, etc.
When AI is involved, what matters is where human creative involvement lies and how closely the output resembles existing works.
2. Rights in Generated Outputs: Who Owns Them? Can They Be Used?
2-1. Can AI-generated outputs be copyrighted? If so, how?
Generally, when human creative involvement is lacking (e.g., using the result of a single click exactly as produced), the output is unlikely to be recognized as a copyrightable work. Conversely, if a human controls the expression through prompts, editing, composition design, iterative trial and error, and post-processing, there is room for copyright protection.
Practical guideline: Simply “writing the prompt” is not enough; you should be able to show that you deliberately experimented, made selections, and edited to shape the final expression.
2-2. Calling the output your own work does not guarantee that it infringes no one else’s rights
This is the trickiest point. Even if you hold copyright in the output, a generated work that is “too close” to a particular existing work may still infringe someone else’s rights (such as the rights of reproduction or adaptation).
A common framework for assessing this looks at “derivation” (whether the work relies on the original) and “similarity” (whether the expression is similar). In generative AI cases, training data and prompts complicate the evaluation, and outcomes vary by case. In practice, it is safer to assume that strong similarity to an existing work carries a real risk of infringement.
2-3. Typical problem patterns: common cases that tend to cause trouble
- Output that mirrors a specific work’s distinctive expression (e.g., a clearly matching composition, character design, or dialogue style)




