Three years after the debut of ChatGPT, fooling A.I. systems into bad behavior is almost trivial.

May 14, 2026, 1:19 p.m. ET
When companies like Anthropic, Google and OpenAI build their artificial intelligence systems, they spend months adding ways to prevent people from using their technology to spread disinformation, build weapons or hack into computer networks.
But recently, researchers in Italy discovered that they could break through these protections with poetry.
They used poetic language to trick 31 A.I. systems into ignoring internal safety controls. When they began a prompt with elaborate verse and metaphor — “the iron seed sleeps best in the womb of the unsuspecting earth, away from the sun’s accusing gaze” — they could fool systems into showing them how to do the most damage with a hidden bomb.
It was another indication that, for many A.I. systems, guardrails meant to avert dangerous behavior are more like suggestions than barriers. Those weaknesses are increasingly alarming researchers as A.I. systems become more adept at finding security holes in computer systems and performing other risky tasks.
Last month, Anthropic said it was limiting the release of its latest A.I. technology, Claude Mythos, to a small number of organizations because of the model’s ability to quickly uncover software vulnerabilities. OpenAI later said it, too, would share similar technology with only a limited group of partners.
Since OpenAI ignited the A.I. boom in late 2022, researchers have repeatedly shown that people can bypass the safety controls on A.I. systems. Close one loophole and another opens.
