AI is starting to outperform doctors at diagnosing disease
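A new study shows that OpenAI's large language model (LLM) outperformed physicians at diagnosing complex conditions such as decreased blood flow to the heart, particularly in the early stages of emergency care, when information is limited.
In early emergency room cases, the model identified the correct or a very close diagnosis in about 67% of cases, compared with roughly 50% to 55% for physicians.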
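The researchers tested the model on real patient cases and found that the LLM performed significantly better than physicians in both diagnosis and clinical reasoning.
Even so, experts stress that broad adoption of AI in health care will depend on knowing how reliable it is. The current research focuses mainly on short-term patient data and written case information, so its effectiveness over longer time frames and in a wider range of settings still needs to be validated.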
Opening of the original article (English, first 3 paragraphs only)
If you walk into an emergency room (ER) in 10 years, you’ll encounter a new type of caregiver: an artificial intelligence (AI) system designed to get you a diagnosis faster and help your care team make more informed decisions. While you sit in the waiting room, you’ll be hooked up to a blood pressure cuff that’s constantly and autonomously monitored. All the while, an AI agent will be listening in while you and your doctor talk about your symptoms, ready to flag any mistakes your physician makes or suggest next steps.
This vision of AI-assisted emergency health care may soon be reality. In a new study, researchers show that a type of AI known as a large language model (LLM) often outperformed physicians at diagnosing complex and potentially life-threatening conditions, including decreased blood flow to the heart, even in the fast-moving stages of real ER care when information is limited, they report today in Science. In early ER cases, the model identified the correct or a very close diagnosis in about 67% of cases, compared with roughly 50% to 55% for physicians. And the technology is only getting better.
“Evaluating AI in medicine demands both depth and breadth across different clinical tasks and settings,” and these authors were able to incorporate both in this study, says Shreya Johri, a computer scientist at the Dana-Farber Cancer Institute who was uninvolved with the new research. Still, she notes, wide adoption of these AI systems in health care will hinge on knowing the contexts in which they’re most reliable.
※ For copyright reasons, only the first 3 paragraphs are quoted. Please read the original article for the full text.