Category: pokemon

Not even Pokémon is safe from AI benchmarking controversy. Last week, a post on X went viral, claiming that Google’s latest Gemini model surpassed Anthropic’s flagship Claude model in the original Pokémon video game trilogy. Reportedly, Gemini had reached Lavender Town in a developer’s Twitch stream; Claude was stuck at Mount Moon as of late…

Anthropic used Pokémon to benchmark its newest AI model

February 24, 2025

Anthropic used Pokémon to benchmark its newest AI model. Yes, really. In a blog post published Monday, Anthropic said that it tested its latest model, Claude 3.7 Sonnet, on the Game Boy classic Pokémon Red. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and navigate around the…

AI, benchmarks, Global Security News, pokemon

Debates over AI benchmarking have reached Pokémon

April 14, 2025

AI, Anthropic, Benchmark, claude 3.7 sonnet, Gaming, Global IT News, Global Security News, pokemon, pokemon red
