Forget about ChatGPT acing your homework – AI researchers are now crowdsourcing ‘Humanity’s Last Exam’ to create the ultimate challenge for next-gen tech! As AI systems crush existing benchmarks like college exams and math competitions, experts are asking: ‘What questions can’t machines answer yet?’
The project, led by the Center for AI Safety and Scale AI, aims to create a future-proof test measuring true expert-level intelligence. ‘Common benchmarks are becoming obsolete faster than TikTok trends,’ said organizer Dan Hendrycks, who helped create earlier AI tests that today’s models dominate: Anthropic’s Claude went from scoring 77% to 89% on one of them in a single year!
Recent Stanford research reveals AI still struggles with complex planning and visual puzzles. OpenAI’s new ‘o1’ model, for instance, scored just 21% on one pattern-recognition test! The new exam will focus on abstract reasoning questions that are kept secret so they cannot leak into training data, with $5,000 prizes for the toughest submissions.
‘We need tests that don’t become irrelevant when AI evolves tomorrow,’ said Scale AI CEO Alexandr Wang. Pro tip for contributors: No weapons-related questions allowed – even super-smart AI shouldn’t study those!
Reference(s):
AI experts ready 'Humanity's Last Exam' to stump powerful tech
cgtn.com