At the end of last year the Senate Select Committee on Adopting Artificial Intelligence handed down a set of recommendations that, to the delight of Australian creative workers, state in no uncertain terms the need to uphold those workers’ rights against encroachment by the generative AI industry.
At present, generative AI – a powerful new technology that can automatically generate text, images, music, videos and more – operates in a haze of legal ambiguity. These systems produce original material, but in order to do so they learn everything – structure, style, meaning, composition, harmony – from our very own canon of human creative effort.
On the basis that they don’t literally hand over this copyrighted training data to their users, many AI companies claim they are not infringing the copyright of the people who made it. It is fair game, they say, to train models on any data they can find: the systems are merely doing what people do when they listen to music, read literature or view visual art, and then come up with new creative works inspired by these experiences.
Copyright holders, and some competing AI companies that have taken pains to source licensed or copyright-free training data, have been quick to reject this perspective, calling it “theft on an unprecedented scale”.
Read: Beacon of hope as Senate releases final report to stop AI theft and remunerate creatives
Unfortunately, it’s proven tricky to say exactly what law is being broken. In some cases, such as The New York Times v OpenAI, it’s been possible to show that generated outputs are “substantially similar” to copyrighted assets. The New York Times was able to get ChatGPT to output lengthy sections of text identical to NYT articles. Even then, OpenAI argued that these were contrived examples, unlikely to occur in everyday use.
In other cases, such as Sarah Andersen v Stability AI, and several voice and image cloning cases, claimants argue that generative outputs falsely advertise, infringe trademarks or otherwise directly and knowingly divert revenue from artists. Such cases are ongoing, with varying degrees of success and plenty of frustration. But, ultimately, they get nowhere near the heart of the problem, which is less about specific references and more about the wholesale repackaging of cultural knowledge in an industrialised fire sale.
Much more slippery is the claim that copyrighted data shouldn’t be fed into AI algorithms at all, irrespective of what happens to it next. After all, AI is used for many different things. Should a company like Shazam, whose software helps people identify the music they’re hearing in public places, and thus drives attention to those artists, be denied fair use access to training data in the same way as a company like Stability AI or Australia’s $300 million Leonardo.ai?
US fair use rules are seemingly quite clear on this: where the use harms the market, taking business away from the copyright holder, it’s not fair. But with Trump 2.0’s shock-and-awe attack on US institutions and embrace of a tech oligarchy, such reassurances seem even less secure than before. Given the real tension governments, even corrupt ones, face between protecting creators, serving users and keeping up in the AI innovation race, there will be devils all over this detail.
Implementation details notwithstanding, if the recommendations of the Senate Select Committee are adopted by Parliament, Australian creators, publishers, artist bodies and their kin may breathe a small sigh of relief, notch another success in the roller-coaster ride of copyright, and perhaps even embrace the generative AI revolution. For now they can certainly revel in the clarity with which the report condemns as “farcical” and “hypocritical” the submissions by Google, Amazon and Meta claiming that, in the report’s words, “the theft of Australian content is actually for the greater good” because it ensures such content is “represented” in their AI engines.
But the drama is far from over. Legal protections are one thing; emerging technological, cultural and economic realities are another. As with almost every area of life, AI has the potential to both exacerbate and lay bare existing inequalities, sometimes both at once, and the inequality between the mighty and minor players in the creative industries will be as prominent as ever. A challenge for AI companies paying for training data is that there is no definitive way to determine how valuable each data point is. The data all get fed into the generative model, and an impenetrable black box is forged: what it generates cannot be traced back to its sources in the way we can trace plays of tracks on a streaming service.
If we think of data like water or electricity, AI companies could buy it by the megabyte, but given the amount of data needed, the cost per unit in a flat model would have to be beyond minuscule. This is important to understand: even the most happy-go-lucky creative amateur would see no value in being paid a handful of cents for the canon of their creative work to be used in an AI model for eternity, the result being the corporate modelling of their style and skill with no credit.
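To see the scale problem concretely, here is a back-of-the-envelope sketch in Python. Every figure – the licensing pool, the size of the training set, the size of an artist’s catalogue – is an assumption chosen for illustration, not a real quote:

```python
# Back-of-the-envelope arithmetic for a flat per-megabyte licensing model.
# Every figure here is an assumption for illustration, not a real quote.

licensing_pool_dollars = 50_000_000      # assumed annual pool set aside by an AI firm
training_set_megabytes = 2_000_000_000   # assumed training set of roughly 2 petabytes

rate_per_mb = licensing_pool_dollars / training_set_megabytes
print(f"Flat rate: ${rate_per_mb:.3f} per MB")  # $0.025 per MB

# An artist's entire catalogue: assume 100 songs at ~5 MB of compressed audio each.
catalogue_mb = 100 * 5
payout = catalogue_mb * rate_per_mb
print(f"One-off payout for a lifetime's work: ${payout:.2f}")  # $12.50
```

On those assumptions, a lifetime’s catalogue earns a one-off payment of about $12.50 – hence the appeal of the alternatives below.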
The alternative is to create differentials in value. An AI company can train a base model on free data, such as out-of-copyright material or material it has created in-house, then fine-tune that model with premium data for which it can afford to pay a better price. Depending on how the new protections play out, such companies could also conceivably find workarounds or nuanced fair use provisions allowing them to use copyrighted data only insofar as training doesn’t capture artist-specific traits.
In other words, we can think of all the world’s creative data as comprising the creative labour of artists – in music, say, creative work across songwriting, performance, production and so on – but also generic things like rules of harmony, meter, the sounds of instruments, room acoustics, cable hiss and vinyl crackle. With a bit of legal creativity, companies may yet find ways to train some aspects of large models without paying.
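As a schematic of that two-tier strategy, here is a minimal Python sketch. The model class, corpus names and licence figure are all hypothetical placeholders, not any vendor’s real pipeline:

```python
# Minimal sketch of the two-tier training strategy described above.
# The class, corpus names and licence figure are hypothetical placeholders.

class SketchModel:
    """Stand-in for a generative model that records data provenance."""

    def __init__(self):
        self.provenance = []

    def train(self, corpus_name, licence_cost):
        # A real system would run gradient updates here; we only record
        # which data shaped the model and what it cost to license.
        self.provenance.append((corpus_name, licence_cost))

model = SketchModel()

# Stage 1: a large base corpus that is free to use.
model.train("public_domain_scores", licence_cost=0)
model.train("in_house_session_recordings", licence_cost=0)

# Stage 2: a small premium corpus, licensed at a price the firm can justify
# because fine-tuning data is scarce and high-leverage.
model.train("licensed_artist_catalogue", licence_cost=250_000)  # assumed figure

total_spend = sum(cost for _, cost in model.provenance)
print(f"Licence spend, concentrated in fine-tuning: ${total_spend:,}")
```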
But a third path to differentiating value is to attempt to build an AI attribution machine – a system, either built into the generative AI technology itself or applied retroactively to AI-generated works, that identifies which sources in the system’s training set could be deemed most influential on the specific style and content of any given output.
This is not fantasy technology. Industry giants have made it clear this is desirable. In the words of Geoff Taylor, Head of AI at Sony Music, “We are going to need to solve the question of … [what role certain inputs have] played in the development and generation of particular outputs so that we can compensate creators for the contribution they have made.”
And it’s happening. The company ProRata.ai, for example, has announced Gist.ai, a tool that reports, for any given generative output, a list of ‘contributors’ retrieved from the training set, each assigned a percentage contribution to the output.
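ProRata has not published how those percentages are computed. Purely as an illustration of the genre, here is a naive Python sketch that turns embedding similarity into contribution percentages – the random vectors stand in for embeddings a real system would obtain from a trained encoder, and nothing here reflects ProRata’s actual method:

```python
# Naive illustration of attribution-by-similarity: NOT ProRata's actual
# method (which is unpublished). Random vectors stand in for embeddings
# that a real system would obtain from a trained encoder.

import numpy as np

rng = np.random.default_rng(0)

training_works = {
    "artist_a_song": rng.normal(size=128),
    "artist_b_song": rng.normal(size=128),
    "artist_c_song": rng.normal(size=128),
}
generated_output = rng.normal(size=128)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Score each source, clip negatives, normalise into percentage 'contributions'.
scores = {name: max(cosine(vec, generated_output), 0.0)
          for name, vec in training_works.items()}
total = sum(scores.values()) or 1.0
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {100 * score / total:.1f}%")
```

Notice how many arbitrary choices are already baked in – the embedding space, the clipping of negative scores, the normalisation. That arbitrariness is precisely the problem.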
This AI solution to an AI problem is seductive. As an academic fascinated by what AI can tell us about human creativity, I can see that such technology could further our understanding of culture, taste, talent and perception. But perhaps even more than generative AI, it needs to be carefully watched and potentially regulated. While such systems could reveal new ideas about how cultural products are copied and influence each other, it would be disastrous simply to believe what they tell us. Their criteria – entangled dimensions in a black box – will contain arbitrary interpretations just as ours do.
Worse, any such system would become the immediate focus of an arms race. Major actors would be free to run the numbers on AI-generated creative outputs, stack training sets and adjust training algorithms to suit financially preferable outcomes. They would not only be able to game the numbers; in the long term they could game the cultural dynamics themselves, with the corporate modelling of culture acting swiftly to capture and monetise artistic styles as they evolve – such as youth music movements that arise spontaneously in underprivileged communities and serve essential communal cultural expression.
In other words, there is more at stake than money: the economics of art always has political ramifications. It is also a reminder that the creative industries, despite the recent unity shown in fighting AI theft, are not all in it together.
AI attribution algorithms would join recommender systems – the products that feed you the next song to listen to or the next book to buy – in a class of cultural algorithms that are becoming increasingly influential in differentiating and directing cultural value. And we can imagine this algorithmic family spawning more ominous offspring: personalised generative media that predicts our tastes, or overdrives the emotional impact of nefarious memes.
But a positive take on such developments is that, so long as we’re paying attention, they reveal how processes of value differentiation have always been present, and have always been potentially inequitable. In the history of creative work, contractual minutiae can underlie huge shifts in who gets rewarded what. We can see this, for example, in how songwriting and sound recording royalties are divided in large agreements between major labels and streaming companies.
In fact, perhaps the most profound thing about such algorithms is that they would apply not only to AI-generated outputs but potentially to any dispute over the authorship of creative works. The famous ‘Blurred Lines’ ruling, in which the estate of Marvin Gaye successfully sued Robin Thicke and Pharrell Williams for $7 million for imitating the ‘vibe’ of Gaye’s song, set a precedent, albeit perhaps an outlier, in copyright law. There’s a real potential that the widespread adoption of AI attribution algorithms, even if not enshrined in law, could spawn cultural norms in which creators are forced constantly to measure their difference from other works, navigating a dense minefield of cultural freedoms.
These are speculative outlines of a not-too-distant future that will be pored over for years. So what is most urgent? The above risks demand a better and more widespread awareness of how machine learning represents knowledge and makes decisions about culture, style, difference and influence. Above all, we should remember that just because an AI system reports one thing as an influence on another does not make it so; and we should recognise that, in the real feedback loops of culture interacting with technology, we can allow a new political-economic reality to be constructed without realising it.
AI’s evaluations are as arbitrary as ours – necessarily so, as there is no Platonic realm we can consult to check how cultural objects should relate to each other. Again, this awareness should centre on the truth that our own perceptions of difference, influence and similarity are not only just as arbitrary, but also embedded in our lived politics. In short, we need to embrace this technology with a wisdom that has so far evaded the adoption of generative AI.
Independent groups in Australia could play a significant role in stewarding fair and responsible attribution, assembling teams as innovatively interdisciplinary as the mixes of technologists, designers and creative industries experts being forged in the start-up sector. But, above all, these groups should be founded on an appreciation of what culture is, how it works and what makes it valuable.
Read: AI erasure – how AI could reshape our understanding of history and identity
With appropriate support, such bodies could be technologically pioneering, not merely clinging to the coat-tails of AI innovators. Australia could examine sovereign creative foundation models, rented to the generative AI industry under mandated licences, in which fair attribution is determined outside of big stakeholder deals. Counter to the perception that public bodies are overly bureaucratised, they could also bring clarity to a baroque copyright system once described, in a line credited to Rob Glaser, chairman of MusicNet, as if it were designed by Franz Kafka and built by Rube Goldberg.
The last problem, however, is a doozy. Everything depends on the decisions of jurisdictions much bigger than Australia. For now, the Select Committee’s response positions Australia as a champion of creators’ rights, unbowed by angst about our fortunes in the global AI innovation race. It was swiftly praised by artists’ bodies such as APRA AMCOS, which published its own report last August on the potential revenue hit from AI music.
The problem is, we have a relatively small AI industry, made up predominantly of SMEs, and we’re not really positioned to be an AI leader. The US, Europe, the UK and China are. Where this leaves us in the long term is as hazy as the rest of the geopolitical questions 2025 has to offer.