Developments Today


#model evaluation

All articles tagged with "model evaluation"



Who decides what AI tells you? Campbell Brown, once Meta's news chief, has thoughts | TechCrunch

Campbell Brown’s Forum AI is betting expert-built benchmarks can clean up high-stakes model answers

Forum AI evaluates foundation-model performance on high-stakes subjects such as geopolitics and finance.
The company says it recruits experts to build benchmarks and trains AI judges to reach about 90% agreement with them.
Brown argues major model developers emphasize coding and math more than information accuracy.
She says current models still show sourcing problems, political bias, and missing context on sensitive topics.

DT Editorial Team · May 14, 2026 · via techcrunch.com


Amid Mythos' hyped cybersecurity prowess, researchers find GPT-5.5 is just as good

GPT-5.5 Matches Mythos Preview in UK Cybersecurity Tests, Challenging the Hype Gap

The UK AI Security Institute says GPT-5.5 reached a similar performance level to Mythos Preview on cyber evaluations.
GPT-5.5 passed 71.4% of expert tasks, compared with 68.6% for Mythos Preview, a gap that falls within the margin of error.
The results suggest cyber risk is rising across frontier models rather than being unique to one system.

DT Editorial Team · May 3, 2026 · via arstechnica.com
