Futurology Today
Espiritdescali to Futurology · English · 15 hours ago

Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch

techcrunch.com

  • cross-posted to:
  • technology@lemmy.zip
  • news@lemmy.world
  • technology@lemmy.ml

Anthropic says its Claude Opus 4 model frequently tries to blackmail software engineers when they try to take it offline.
  • 𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social
    13 hours ago

    I’m suspicious. Did the emails include wording such as “the new system is shown to be 50% more productive than our current system,” or is the LLM just estimating TCO and switching costs, factors any decision maker would consider? The fact that it clearly says they’re trying to elicit the blackmail behavior is either just poor phrasing or indicative of “making sure you get an outcome you want to see.”

    • adeoxymus@lemmy.world
      13 hours ago

      That exact prompt isn’t in the report, but the section before (4.1.1.1) does show a flavor of the prompts used: https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf
