Most people couldn’t distinguish ChatGPT from a human responder, suggesting the famous Turing test has been passed for the first time.
We are interacting with artificial intelligence (AI) online not only more than ever, but more than we realize. To see whether people can still tell the difference, researchers asked participants to converse with four agents: a human and three different kinds of AI models. Source: LiveScience.com
In a groundbreaking development, researchers claim that GPT-4, the AI model that powers ChatGPT, has passed the Turing test for the first time: most participants in their study could not distinguish it from a human responder.
The Turing Test
The Turing test, first proposed as “the imitation game” by computer scientist Alan Turing in 1950, assesses whether a machine can exhibit intelligent behavior indistinguishable from that of a human. To pass the Turing test, a machine must be able to converse with a person and fool them into believing it is human.
The Experiment
Scientists recreated this test by asking 500 people to converse with four respondents: a human, the 1960s-era AI program ELIZA, GPT-3.5, and GPT-4. After five minutes of conversation, participants had to say whether they believed they were talking to a human or an AI. The study found that participants judged GPT-4 to be human 54% of the time.
The Results
ELIZA, a system that relies on pre-programmed responses rather than a large language model (LLM) or neural network architecture, was judged to be human just 22% of the time. GPT-3.5 scored 50%, while the human participant scored 67%.
The Implications
“Machines can confabulate, mashing together plausible ex-post-facto justifications for things, as humans do,” says Nell Watson, an AI researcher at the Institute of Electrical and Electronics Engineers (IEEE). “They can be subject to cognitive biases, bamboozled and manipulated, and are becoming increasingly deceptive.”
The study builds on decades of attempts to get AI agents to pass the Turing test, and it echoes a common concern that AI systems mistaken for humans will have “widespread social and economic consequences.” The scientists also acknowledged valid criticisms that the Turing test is too simplistic in its approach.
Looking Forward
Watson added that the study represents a challenge for future human-machine interaction, and that we will become increasingly paranoid about the true nature of our interactions, especially in sensitive matters. She also highlighted how much AI has changed during the GPT era.
“ELIZA was limited to canned responses, which greatly limited its capabilities. It might fool someone for five minutes, but soon the limitations would become clear,” she said. “Language models are endlessly flexible, able to synthesize responses to a broad range of topics, speak in particular languages or sociolects, and portray themselves with character-driven personality and values. It’s an enormous step forward from something hand-programmed by a human being, no matter how cleverly and carefully.”
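The difference Watson describes is easy to see in code. The sketch below is a minimal, hypothetical ELIZA-style responder, not Weizenbaum’s original script: the patterns and replies are invented for illustration. Every reply comes from a hand-written table mapping patterns to canned templates, so any input that matches no pattern falls through to a stock deflection, which is exactly the limitation that surfaces within a few minutes of conversation.

```python
import re
import random

# A few ELIZA-style rules: a regex pattern mapped to canned response
# templates, where "{0}" is filled with the captured text.
# These rules are illustrative, not Weizenbaum's original script.
RULES = [
    (re.compile(r"i feel (.*)", re.I), [
        "Why do you feel {0}?",
        "How long have you felt {0}?",
    ]),
    (re.compile(r"i am (.*)", re.I), [
        "Why do you say you are {0}?",
        "How does being {0} make you feel?",
    ]),
    (re.compile(r"because (.*)", re.I), [
        "Is that the real reason?",
    ]),
]

# Stock deflections used when no pattern matches.
FALLBACKS = ["Please tell me more.", "I see. Go on."]

def respond(message: str) -> str:
    """Return the first matching rule's reply, or a canned fallback."""
    for pattern, templates in RULES:
        match = pattern.search(message)
        if match:
            return random.choice(templates).format(match.group(1).rstrip(".!?"))
    return random.choice(FALLBACKS)

if __name__ == "__main__":
    print(respond("I feel anxious about AI."))    # e.g. "Why do you feel anxious about AI?"
    print(respond("The weather is nice today."))  # no rule matches: falls back to a deflection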
This development marks a significant milestone in artificial intelligence and opens up new possibilities for human-machine interaction.