Anthropic has used the classic Game Boy game Pokémon Red to test its latest AI model, Claude 3.7 Sonnet. Unlike the earlier Claude 3.0 Sonnet, which struggled to leave the starting area, the updated model successfully battled three gym leaders. Equipped with basic memory, screen pixel input, and function calls for controlling the game, Claude 3.7 Sonnet used "extended thinking" to perform 35,000 actions over the course of its run. The company said that within hours the AI defeated Brock, the first gym leader, and subsequently beat Misty, the second. Pokémon Red joins a range of games now used to assess AI performance.