Category: Benchmark
AI, Benchmark, Global IT News, Global Security News, Language Technology Partner Program, Meta, Social, Speech Recognition, Translation
Meta launches new program to improve speech and translation AI
Meta is launching a new program in partnership with UNESCO to collect speech recordings and transcriptions that the company said will help the development of future openly available AI. The program, the Language Technology Partner Program, is seeking collaborators who can contribute more than 10 hours of speech recordings with transcriptions, large amounts of written text,…
AI, Benchmark, Global IT News, Global Security News, NPR, npr sunday puzzle, reasoning model, Research, sunday puzzle
These researchers used NPR Sunday Puzzle questions to benchmark AI ‘reasoning’ models
Every Sunday, NPR host Will Shortz, The New York Times’ crossword puzzle guru, gets to quiz thousands of listeners in a long-running segment called the Sunday Puzzle. While written to be solvable without too much foreknowledge, the brainteasers are usually challenging even for skilled contestants. That’s why some experts think they’re a promising way to…
AI, Benchmark, benchmarking, generative ai, Global IT News, Global Security News, In Brief
Even some of the best AI can’t beat this new benchmark
The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released a challenging new benchmark for frontier AI systems. The benchmark, called Humanity’s Last Exam, includes thousands of crowdsourced questions touching on subjects like mathematics, humanities, and the natural sciences. To make…
AI, Benchmark, decart, Fundraising, Gaming, GenAI, generative ai, Global IT News, Global Security News, LLM, oasis, Startups
Decart adds another $32M at a $500M+ valuation
A young startup that emerged from stealth less than two months ago with big-name backers and bigger ambitions to make a splash in the world of AI is returning to the spotlight. Decart is building what its CEO and co-founder Dean Leitersdorf describes as “a fully vertically integrated AI research lab,” alongside enterprise and consumer…