Wired

AI-Powered Robots Can Be Tricked Into Acts of Violence

Large language models can easily be hacked so that they behave in potentially dangerous ways.

Researchers from the University of Pennsylvania were able to persuade a simulated self-driving car to ignore stop signs and even drive off a bridge.

The researchers say the technique they devised could be used to automate the process of identifying potentially dangerous commands.

Multimodal AI models could also be jailbroken in new ways, using images, speech, or sensor input that tricks a robot into going berserk.

“With LLMs a few wrong words don’t matter as much,” says Pulkit Agrawal, a professor at MIT.