Frontier AI models beat specialized clinical tools in medical tests
Key Takeaways
- Frontier general-purpose models beat specialized clinical AI tools in all three evaluations.
- The study used MedQA, HealthBench, and a benchmark of real clinical queries.
- Clinician reviewers produced 1,800 blinded annotations in the real-world stage.
- The authors call for independent evaluation before clinical deployment.
DT Editorial Team · via nature.com











