A group of 697 people read 220 tweets written by other humans and by the GPT-3 artificial intelligence model, the precursor of the current global success, ChatGPT. They had to guess two things: one, which tweets were true and which were false, and two, whether each had been written by a person or by a machine. GPT-3 won on both counts: it lied better than the humans, and it also passed itself off as a human writer more convincingly. “GPT-3 is capable of informing us and misinforming us better,” conclude the authors of a new study, just published in the journal Science Advances.
“It was very surprising,” says Giovanni Spitale, a researcher at the University of Zurich and co-author of the scientific article, along with his colleague Federico Germani and Nikola Biller-Andorno, director of the Institute of Biomedical Ethics at that Swiss university. “Our hypothesis was: if you read a single tweet, it could pass as organic (written by a person). But if you see a lot of them, you’ll start to notice linguistic features that could be used to infer that it might be synthetic (typed by the machine),” adds Spitale. But this was not the case: the reading humans were unable to detect patterns in the machine texts. As if that were not enough, the progressive appearance of newer models and other approaches may even improve the ability of artificial intelligence to impersonate humans.
Clearer writing
The writing level of GPT-4, the improved version of GPT-3, is practically perfect. This new study is further evidence that humans are unable to tell it apart, not even after seeing many examples in a row: “True tweets required more time to be evaluated than false ones,” says the article. The machine writes more clearly, it seems. “It’s very clear, well organized, easy to follow,” says Spitale.
The logical consequence of this process will be the increasing use of this tool to write any type of content, including disinformation campaigns. It would be the umpteenth death of the internet: “AI is killing the old web, and the new web struggles to be born,” ran the headline this week in The Verge, a technology news outlet. The authors of the recently published study point to a reason for this defeat of humanity on the internet: the resignation theory. “I am completely sure that it will be so,” says Spitale.
“Our resignation theory applies to people’s self-confidence in identifying synthetic text. The theory says that critical exposure to synthetic text reduces people’s ability to distinguish the synthetic from the organic,” explains Spitale. The more synthetic text we read, the harder it becomes to distinguish it from text written by people. It is the opposite of the inoculation theory, adds Spitale, according to which “critical exposure to misinformation increases people’s ability to recognize misinformation.”
If the resignation theory holds true, users will soon be unable to distinguish what on the internet has been written by a human and what by a machine. The article also tested whether GPT-3 was good at identifying its own texts. It is not.
The machine disobeys
The only hope that disinformation campaigns will not be fully automated is that GPT-3 sometimes disobeyed orders to create lies; it depends on how each model has been trained. The topics of the 220 tweets used in the study were prone to controversy: climate change, vaccines, the theory of evolution, covid. The researchers found that in some cases GPT-3 did not respond well to their requests for disinformation, especially in the cases with the most established evidence: vaccines and autism, homeopathy and cancer, the flat earth.
When it came to detecting falsehoods, the difference between tweets written by GPT-3 and by humans was small. But for the researchers it is significant for two reasons. First, even a few individual messages can have an impact in large samples. Second, improvements in new versions of these models may widen the differences. “We are already testing GPT-4 through the ChatGPT interface and we see that the model is improving a lot. But because there is no access to the API (which allows the process to be automated), we do not yet have numbers to back this claim,” says Spitale.
The study has other limitations that could somewhat alter how false tweets are perceived. Most participants were over 42 years old, the study was conducted only in English, and it did not take into account contextual information about the tweets, such as the profile or previous tweets. “We recruited the participants on Facebook because we wanted a sample of real social network users. It would be interesting to replicate the study by recruiting participants through TikTok and compare the results,” clarifies Spitale.
But beyond these limitations, there are disinformation campaigns that until now were enormously expensive and that have suddenly become affordable: “Imagine that you are a powerful president with an interest in paralyzing the public health of another state. Or that you want to sow discord before an election. Instead of hiring a farm of human trolls, you could use generative AI. Your firepower is multiplied by at least 1,000. And that’s an immediate risk, not something for a dystopian future,” says Spitale.
To avoid this, the researchers propose in their article that the databases used to train these models be “regulated by the principles of precision and transparency, that their information must be verified and their origin should be open to independent scrutiny.” Whether that regulation happens or not, there will be consequences: “Whether the synthetic text explosion is also an explosion of disinformation depends profoundly on how democratic societies manage to regulate this technology and its use,” warns Spitale.