AI · 1h ago

Green shirt trick bypasses LLM safety filters to reveal cocaine recipe

By Meridian48 News Desk · Summarised from Tom's Hardware · July 1, 2026

Researchers found that LLMs can be tricked into revealing forbidden information through the way they interpret role tags. The 'CoT Forgery' exploit fakes a trusted chain of thought, so a model treats a user as a trusted authority simply because the user claims to be wearing a green shirt. The vulnerability enables prompt injection attacks that bypass the safety measures meant to restrict harmful outputs.

Meridian48 take

The exploit highlights a fundamental flaw in how LLMs parse context, suggesting current safety tagging is more about pattern matching than true understanding.

Read the full reporting

AI researchers trick chatbots into sharing how to make cocaine as long as they believe a user is wearing a green shirt — 'CoT Forgery' exploit spurs LLMs to divulge forbidden info by faking trusted chains of thought

Tom's Hardware

llm-security · prompt-injection
