You must log in or register to comment.
To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.
Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?
Pffft, what an overreaction.
The HAL 9000 is the most sophisticated artificial intelligence in the world, and is incapable of malicious action!