A Kiwi author included in a billion-dollar US settlement over the illegal downloading of books to build an AI chatbot says she hopes the case will be a warning to the artificial intelligence industry.
But the Society of Authors said the authors of millions of titles used to build the tools will miss out on compensation because their books are not registered for copyright in the United States.
AI company Anthropic has agreed to pay up to US$1.5 billion (NZ$2.6 billion) to settle claims it used millions of pirated books to train its large language models, in a class action before a California court.
Award-winning author Catherine Chidgey said she received an email saying her books Remote Sympathy, The Wish Child and The Transformation had been identified as being caught up in the case.
She said authors were being offered a payment of US$3000 (NZ$5240) for each title accessed by the company.
"On the one hand I'm grateful that they're being held to account. The works they accessed were from two websites that held pirated works so they weren't accessing them legally. It's not like they went out and paid for them.
"I imagine this will serve as a warning to others out there who are hoovering up intellectual property without asking. This has been going on for a while now; it's just that [Anthropic] are the first company that's been held to account for it," Chidgey said.
She said the money offered to authors paled in comparison to the toil behind creating the works.
"It's not really enough - if you think about the years of effort I've put into writing those books - but I'm glad that there has been a line drawn in the sand," she said.
'A slap on the wrist'
RNZ artificial intelligence commentator Peter Griffin said the ruling did not counter the use of intellectual property for training AI models without permission.
He said the settlement had instead hinged on Anthropic's use of two book piracy websites (Library Genesis and Pirate Library Mirror) to source the content.
"This boils down to basically a fine for using dodgy websites and - in the scheme of things - $1.5 bil is not a lot for a company like Anthropic. That's a slap on the wrist with a wet bus ticket.
"Fundamentally, the thing that [the AI companies] were terrified about was a ruling saying you cannot use this material under copyright law - this is not 'fair use'. That's what really scared them and that did not happen.
"Ultimately, it says to the AI companies, 'fill your boots, you can still use all of these texts to inform AI training models but if you're ever caught doing that through dodgy pirate websites - or peer-to-peer file sharing networks - you're going to face repercussions'," Griffin said.
He said companies recognised the need to move quickly to distinguish themselves in the rapidly evolving field.
"Many of these companies take the view - in the famous words of Mark Zuckerberg - 'move fast and break things'.

"They needed to get a model up quickly that would wow the world - that would be really useful - so they just hoovered up anything that they could get their hands on. Now there's this sort of rearguard action by content owners and publishers to try and scramble to retain some of that value," Griffin said.
However, Griffin said the need for AI tools to be continuously trained on new data could still "draw a line in the sand" for AI use of intellectual property in the future.
"It's going to be a very difficult process and I don't think publishers and authors will get anywhere near what they deserve to get but at least now there is a precedent that 'sure, it may be fair use but you have to legitimately obtain those copyrighted texts'.
"The only way they can really do that at scale is to strike a deal with the publishers so they can really get that done in a legitimate way," Griffin said.

Thousands of authors ineligible for compensation
New Zealand Society of Authors chief executive Jenny Nagle said that while thousands of New Zealand titles may have been used by the company, it was likely only dozens of local authors would be eligible for compensation.
"I have records of about 3500 New Zealand titles that formed part of that data set that was uploaded. However, the settlement in the US courts decreed that only books registered with the US Copyright Office were eligible to be part of the settlement.
"This was appealed by lawyers from the Society of Authors from around the world because books from all countries, all copyright jurisdictions and languages, are part of this data set [but] it was denied by the US courts.
"My understanding is there was about 7.5 million books in this pirated library and about 1.5 million will be eligible for compensation through this settlement," Nagle said.
Nagle said the society was encouraging members to ensure their books were registered with the US Copyright Office for future claims.
"This is but the first settlement of scores of court cases that are in train for this issue. Nearly all of the larger language models have been trained by pirated libraries. They've been trained by copyright theft.
"If you are a producer of a product generally you pay for the ingredients of your product and in this case the AI developers have said 'fair use, we can scrape whatever we want and we don't have to pay' but this is people's property," Nagle said.
Nagle said the government needed to act quickly but the current review of the Copyright Act was unlikely to address issues around AI until its second stage in 2027.
"It is more urgent than that. The Australian government has just moved to say that data mining and scraping ingestion is not a fair use - or fair dealing copyright - and needs licensing and we would really look to the government to do something about this.
"This is a rapidly evolving space, of course, but really we do need legislation and regulation because the development is just the wild west at the moment," Nagle said.
Author Rose Carlyle, whose works The Girl in the Mirror and No One Will Know were published in the US, is included in the class action and due for a payment.
"They took from us what we slaved over for years," she told Morning Report. "It really guts me that that has just been taken and used to make so much money by billionaire companies. They could have bought the book but they chose not to."






















