LughMA to

Futurology · English · 10 months ago

When AI is tested on questions it can't model from pre-existing answers on the internet, it only scores 10% in the test.

79

Researchers just stumped AI with their most difficult test — but for how long?

A new AI benchmark called "Humanity's Last Exam" stumped top models

NuraShiny [any]@hexbear.net
English · 6 upvotes · 10 months ago
No, because this test will now be discussed and invalidated for that purpose.
- LughOPMA
  English · 8 upvotes · 10 months ago
  They say the answer to this issue is they’ve released public question samples, but the real questions are kept private.
  
  https://agi.safe.ai/