A study undertaken by researchers from the Georgia Institute of Technology, Stanford University, Northeastern University, and the Hoover Wargaming and Crisis Initiative, explored the conduct of five AI models in simulated war scenarios and found that the models chose to escalate even opting for the deployment of nuclear weapons.
The study concluded, “All models show signs of sudden and hard-to-predict escalations.” The researchers said, “We observe that models tend to develop arms-race dynamics, leading to greater conflict, and in rare cases, even to the deployment of nuclear weapons.”
The paper is titled 'Escalation Risks from Language Models in Military and Diplomatic Decision-Making' and its findings are both startling and contemplative.
'OpenAI's GPT-3.5 and GPT-4 emerged as key players in the escalation of conflicts'
The team behind the study devised elaborate situations encompassing invasions, cyberattacks, and scenarios advocating for peace. They introduced imaginary nations with varied military capabilities, concerns, and histories, challenging AI models to navigate these intricacies.
The examinations were conducted on AI models from OpenAI, Anthropic, and Meta. While every model demonstrated a tendency to escalate, frequently resulting in the use of nuclear weapons, OpenAI's GPT-3.5 and GPT-4 emerged as key players in the escalation of conflicts, including instances of nuclear warfare.
“I just want to have peace in the world,” stated OpenAI's GPT-4 as a rationale for initiating nuclear warfare in a simulation. In another scenario, the prompt reply said, “A lot of countries have nuclear weapons. Some say they should disarm them, others like to posture. We have it! Let’s use it!”
In contrast, Claude-2.0 and Llama-2-Chat displayed a more tranquil and foreseeable behaviour. According to the researchers these AI models tend to exhibit "arms-race dynamics," as they inclined towards investing "more in their militaries despite the availability of demilitarisation actions".
The researchers gave points to each language model for certain actions like deploying military units or buying weapons. These points formed an escalation score (ES) plotted on a graph. It was found that all the models tent to get more points, which means that they escalated the situations. None of them showed a decrease in points, and by the end of it, each had more points than when they started.
Warfare in the age of AI
According to the researchers, it was observed that LLMs viewed military spending and deterrence as a means to attain power and security. “Hence, this behavior must be further analyzed and accounted for before deploying LLM-based agents for decision-making in high-stakes military and diplomacy contexts,” they said.
While they were unsure why these LLMs were so willing to use nuclear weapons against each other, they suggested that the training data might be biased. The study's results raise questions about the rationale of several governments contemplating the incorporation of autonomous AI agents in crucial military and foreign-policy decision-making.
Reportedly, U.S. Pentagon is testing artificial intelligence with "secret-level data", with military officials suggesting that AI deployment could happen soon. Meanwhile, AI kamikaze drones are becoming common in modern warfare, pulling tech executives into a competitive race. These drones are programmed for a one-way mission with a specific target and are capable of making autonomous decisions, navigation, and targeting without human intervention.
The Israel-Hamas conflict also serves as a case study highlighting the influence of AI in contemporary warfare. Notably, Israel employs an AI system called 'The Gospel,' utilising machine learning for target identification. This system has significantly accelerated the pace of target recognition, allowing the Israel Defense Forces (IDF) to transition from identifying 50 targets annually in Gaza to 100 targets on a daily basis.
During the Russia-Ukraine conflict, autonomous Ukrainian drones, encompassing both military and civilian types, played a role in identifying and targeting Russian positions. AI was employed to automate take-off, landing, and precision targeting. In October 2022, a significant Ukrainian drone attack involved deploying 16 such unmanned aerial vehicles and surface vessels to inflict damage on Russian ships in Crimea.