How Artists Are Using Data Poisoning Tools To Curb Data Scraping By AI

Presently, copyright issues pertaining to AI-generated content and training data remain in a grey area due to a lack of legal and regulatory development.
Naik Naik & Co - Romika Chawla, Amartya Mody, Parinika Krishnan

In a recent development in the notorious dogfight between artificial intelligence (AI) and artists in the realm of art and technology, artists have decided to fight fire with fire and have landed a critical hit on AI. By utilising data poisoning tools, artists have started poisoning the training data on which AI relies, and the result is a confused and frantic AI that produces truly bizarre responses to prompts, wishing it had an immunity to poison akin to that of Mithridates VI Eupator (ruler of the Kingdom of Pontus), also known as the Poison King.

Historically speaking, this shady tactic would put honourable warriors to shame; however, when the fight is against super-intelligent, ever-adapting AI tools, it is advisable to set aside all moral and ethical qualms and fight dirty. The way it works is this: the data poisoning tool manipulates training data, introducing unexpected behaviours into machine learning models at training time. Imagine this: you are a gifted artist, flaunting your latest masterpiece on the internet. Suddenly, some AI algorithm jumps in and replicates your work inch by inch, pixel by pixel, as if to tell you that it can do your job better than you. Naturally, insecurities creep in, as is quite common among artists, and this gave birth to data poisoning tools. By sprinkling just enough mischief (poison) into the artwork, the artist lays a trap for the AI, which swoops in like a predator, little knowing that it is the one being hunted. The result? Instead of a faithful reproduction, the AI spits out a psychedelic mess that would make even Salvador Dalí do a double-take.

A team of computer scientists from the University of Chicago, led by Ben Zhao, created a data poisoning tool called ‘Nightshade’ with the intent of tipping the balance in favour of artists and disproving the hypothesis that state-of-the-art text-to-image diffusion models like DALL-E are immune to poisoning attacks. Essentially, the tool makes image data useless to AI models that scrape it for training: it makes invisible changes to the pixels of the digital art, which confuse the AI into producing inaccurate results. Alas, at what cost! We have reached a point where artists are defiling their own work in the hope of escaping the claws of AI and preserving their economic right to monetise their creations. This begs the question: are artists violating their own moral rights by undertaking such mutilation? A very absurd thought, one might say! Au contraire, if tomorrow the governments or courts give AI companies a free pass to scrape data under the guise of ‘fair use’, will artists, by virtue of using data poisoning tools to restrict access to their copyrighted work, be violating the rights of such companies? A bone-chilling thought indeed.
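For readers curious about the mechanics, the core idea of an imperceptible pixel-level change can be illustrated with a toy sketch. To be clear, this is not Nightshade's actual algorithm: real poisoning tools carefully optimise the perturbation so that, to the model, the image's features resemble a different concept entirely. The sketch below merely adds small, bounded random noise as a stand-in; the function name and the `epsilon` bound are illustrative assumptions.

```python
import numpy as np

def poison_image(image, epsilon=2.0, seed=0):
    """Toy illustration of an imperceptible pixel perturbation.

    Real tools such as Nightshade optimise the perturbation against a
    model's feature space; here we only add small bounded noise to show
    that an image can change at the pixel level while looking identical
    to a human viewer.
    """
    rng = np.random.default_rng(seed)
    perturbation = rng.uniform(-epsilon, epsilon, size=image.shape)
    poisoned = np.clip(image.astype(float) + perturbation, 0, 255)
    return poisoned.astype(np.uint8)

# A flat grey 64x64 RGB "artwork" as a placeholder
art = np.full((64, 64, 3), 128, dtype=np.uint8)
poisoned = poison_image(art)

# The pixel values have changed, but by at most a couple of intensity
# levels out of 255 - far below what the eye can notice.
max_change = np.abs(poisoned.astype(int) - art.astype(int)).max()
print(max_change)
```

The gap between this sketch and the real thing is the whole point of the research: noise this naive would be averaged away during training, whereas an optimised perturbation steers the model towards a wrong concept.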

Let’s keep the legal jargon aside for a minute and take a moment to appreciate the development and popularity of data poisoning tools, which are being considered the equivalent of the Manhattan Project, since they are technically copyright tools that allow artists to protect their copyright by nuking the AI to smithereens. To deploy this nuclear bomb, all an acrimonious artist needs to do is run their work through Nightshade so that the work ingests the poison pill. Once the pill has been ingested, all one must do is wait for an AI on a feeding frenzy to scrape the data, and then you are all set! For instance, if Vincent Van Gogh (God rest his soul) were to run his works through a data poisoning tool one day, it would convert his post-impressionist style into street art, or maybe anime! Funnily enough, Ben Zhao has remarked in his paper ‘Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models’ that the use of the tool should be considered a last resort. As one might agree, data poisoning tools are indeed a last resort for artists; however, if an artist does commit to poisoning their brainchild, what ramifications should they contemplate for such activities? With great power comes great responsibility, and in this context, the responsibility is towards not running afoul of copyright law, including moral rights and the fair use doctrine.

Presently, copyright issues pertaining to AI-generated content and training data remain in a grey area due to a lack of legal and regulatory development. However, courts in the US, UK, China and the EU have already started hearing and assessing lawsuits filed by artists, the outcome of which is awaited with bated breath. Until then, artists will continue to be forced to take matters into their own hands, compelled to defile their own creations in the hope of sparing their works from the glaring eyes and voracious appetite of text-to-image generative models such as Stable Diffusion, Midjourney and DALL-E.

About the authors: Romika Chawla is an Associate Partner, and Amartya Mody & Parinika Krishnan are Trainee Associates at Naik Naik & Co.

Bar and Bench - Indian Legal news