• AwesomeLowlander@sh.itjust.works
    link
    fedilink
    English
    arrow-up
    10
    ·
    12 hours ago

    To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

    Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?

  • MushuChupacabra@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    12 hours ago

    Pffft, what an overreaction.

    The HAL 9000 is the most sophisticated artificial intelligence in the world, and is incapable of malicious action!