A discrepancy between first- and third-party benchmark results for OpenAI’s o3 AI model is raising questions about the company’s transparency and model testing practices. When OpenAI unveiled o3 in December, the company claimed the model could answer just over a fourth of questions on FrontierMath, a challenging set of math problems. That score blew the…
Category: o3
AI, ai benchmarks, benchmarking, controversy, epoch ai, frontiermath, generative ai, Global IT News, Global Security News, o3, openai
AI benchmarking organization criticized for waiting to disclose funding from OpenAI
An organization developing math benchmarks for AI didn’t disclose that it had received funding from OpenAI until relatively recently, drawing allegations of impropriety from some in the AI community. Epoch AI, a nonprofit primarily funded by Open Philanthropy, a research and grantmaking foundation, revealed on December 20 that OpenAI had supported the creation of FrontierMath.…
AI, ai models, AI reasoning models, ChatGPT, Global IT News, Global Security News, o3, openai, Startups, TC
OpenAI’s o3 suggests AI models are scaling in new ways — but so are the costs
Last month, AI founders and investors told TechCrunch that we’re now in the “second era of scaling laws,” noting that established methods of improving AI models were showing diminishing returns. One promising new method they suggested could sustain those gains was “test-time scaling,” which seems to be what’s behind the performance of OpenAI’s o3 model –…