
AI and Copyright: Seeking a Fair Balance for Creators and AI Models

The rise of generative AI raises fundamental questions about privacy, ethics, and, perhaps most importantly for the cultural sector, copyright. While some embrace the technology, others see it as an existential threat. The global debate is narrowing to two fronts: employment and the training of models on creators' work.

In this article, we focus on the latter: the training of AI. Is it the ‘greatest art heist of the century,’ or simply legally permissible?

6 minutes · 1 Apr '26

Training or Stealing?

In a recent opinion piece in NRC, Birgit Donker called the training of AI models on others' work a ‘great art heist.’ She argues that the Dutch government encourages AI use but turns a blind eye to copyright infringements by tech companies.

Legally, however, the situation is more nuanced. In the Netherlands and the EU, training AI on copyrighted material is not, in principle, considered theft. Under the EU copyright rules on text and data mining and the new European AI Act, training is allowed, provided that:

  1. The sources are lawfully accessible, and
  2. No opt-out has been applied.

Only when companies use illegal sources — as Meta did with the illegal book site LibGen — does it legally constitute theft.

The ‘Value Gap’: Who Pays the Price?

But just because something is legal doesn’t mean it feels fair. A so-called value gap arises: AI companies earn billions thanks to creators’ data, while those creators receive no compensation. How do we solve this? Currently, we see three movements: lawsuits, opt-outs, and licenses.

  1. Lawsuits: The Global Map of AI Law

    American intellectual property law professor Edward Lee recently mapped the proliferation of lawsuits. In the U.S., there are already nearly a hundred lawsuits underway, and the number is also increasing in Europe. Since final rulings take a long time, it is currently difficult to predict how the judiciary will rule on infringement during AI training.

  2. Opt-out: A Machine-Readable ‘No’

    Creators who do not want their work to end up in an AI model can set a rights reservation. This must be done in a machine-readable way (e.g., via the robots.txt protocol or the terms and conditions on a website). In addition to individual actions, collectives are emerging, such as the Dutch Opt Out Collective (part of the Federation of Visual Rights). It has already collected 100,000 opt-outs from creators and organizations like Pictoright.

  3. Licenses: The New Standard?

    We are seeing a cautious shift toward licensing deals. OpenAI has already made deals with news giants like The Associated Press, The Guardian, and Le Monde, and recently even with Disney.

    Also notable are the deals between Universal Music Group (UMG) and AI platforms like Udio and Nvidia AI. UMG initially sued Udio for infringement but has now opted for a ‘strategic partnership.’ The stated aim is to create an ecosystem where control and transparency are central.

    Experts see licensing deals as an opportunity for new frameworks, but there are drawbacks:

    - Power imbalance: these are often deals between tech giants and media conglomerates. Individual creators and smaller players are left out. 

    - Lack of transparency: the contents of the deals remain secret. How much the actual creator benefits from this is unknown.
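To make the machine-readable ‘no’ from the opt-out route more concrete, here is a minimal sketch of how a crawler could check a robots.txt file before collecting a page. It uses Python's standard `urllib.robotparser` module. The crawler name `ExampleAIBot`, the second crawler name, and the example.org URL are made up for illustration, and the sketch says nothing about whether a given robots.txt entry is a legally sufficient rights reservation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one (made-up) AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.org/portfolio/image.jpg"  # illustrative URL
    for agent in ("ExampleAIBot", "SomeSearchBot"):
        verdict = "allowed" if may_fetch(ROBOTS_TXT, agent, page) else "opted out"
        print(f"{agent}: {verdict}")
```

A reservation like this only works if the crawler actually reads and respects the file, which is one reason collectives and licensing deals are emerging alongside individual opt-outs.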

Alternative Models: Levies and APIs

Fairer systems are being explored. Professor Martin Senftleben from the University of Amsterdam advocates for a compensation model. In this model, AI companies pay a levy that is distributed directly to creators via collective management organizations (such as Pictoright, Lira, or Buma/Stemra). This could be through a statutory license during the training phase or a fee once the AI system hits the market.

Another model comes from the Wikimedia Foundation. They have established API access deals with companies like Microsoft, Meta, and Amazon. These are not licenses for the content but paid gateways to the data. This way, Wikipedia’s CC licenses remain intact, while tech companies contribute to maintaining the free information source.

The Dutch Path: GPT-NL

Finally, responsible alternatives are being developed. With GPT-NL, the Netherlands is getting its own language model that is reliable, transparent, and sovereign. The big difference? GPT-NL is trained only on data for which a valid license has been issued or for which no license is required. A model built on respect for the creator, rather than on the ‘art heist.’
