Doctors Skeptical of Their Peers Who Use Generative AI

Turning to AI for verification rather than as a primary decision-making tool lessened the poor impression, but only partially.

Though clinicians see value in generative artificial intelligence (AI) tools for supporting medical decision-making, they also perceive their colleagues who use the technology as less skilled, a study shows.

In the experiment, physicians who used generative AI as a primary decision-making tool garnered significantly lower ratings for clinical skills, competence, and overall healthcare experience delivered compared with physicians who didn’t use AI at all, lead author Haiyang Yang, PhD (Johns Hopkins University, Baltimore, MD), and colleagues report in a paper published online recently in npj Digital Medicine.

Framing AI only as a verification aid improved the ratings somewhat, but they were still worse than the ratings for physicians who didn’t turn to the tools for help.

“As AI tools become more commonly used in healthcare and in medicine, I think this really just demonstrates that there are going to be challenges, some barriers to adoption and increasing use,” senior author Risa Wolf, MD (Johns Hopkins University, Baltimore, MD), told TCTMD.

“But it also highlights that there need to be thoughtful approaches to its implementation,” she added. “People need to understand the specific AI tool that we’re using—what it does, how it can help us—making sure that it’s equitable in its use, that it’s going to be helpful, and that it’s not going to exacerbate issues or disparities that might exist.”

Since the introduction of ChatGPT in November 2022, use of generative AI has increased dramatically around the world, including within the medical community. As of early 2024, according to the researchers, more than 70% of healthcare organizations were either moving toward adoption of AI or had already integrated it into their workflows.

Older computerized tools designed to aid clinical decision-making faced various barriers, but generative AI “marks a major shift, with its ability to process free-form, unstructured data, produce human-like responses, and provide rapid insights, offering a more flexible and accessible tool for decision support,” Yang et al write.

AI applications in medicine remain somewhat limited, however, with one potential barrier being concerns among physicians that their reputations will take a hit if they use the tools, the researchers suggest.

People need to understand the specific AI tool that we’re using—what it does, how it can help us. Risa Wolf

Yang, Wolf, and colleagues explored that possibility in a randomized experiment that involved 276 practicing clinicians—including 178 physicians, 28 fellows/residents, 60 advanced practice providers, and 10 individuals in other clinical roles—through the Johns Hopkins Medicine system. The participants were presented with a vignette in which a physician assessed a patient with diabetes and recommended a new antihyperglycemic drug and were randomized to see it within three different contexts:

  • No use of generative AI (control)
  • Generative AI used as a primary decision-making tool
  • Generative AI used to verify the physician's clinical assessment

On a scale of 1 to 7, physicians who used AI for primary decision-making received lower ratings of clinical skill compared with those who didn’t use AI (mean 3.79 vs 5.93; P < 0.001). Physicians who used AI for verification fell in between (mean 4.99).

This pattern also was seen for ratings of competence and overall healthcare experience delivered, although these were mediated by the assessments of clinical skills—ie, negative assessments of clinical skills also dragged down ratings in these domains.

Weakness or Strength?

The findings, the researchers say, are consistent “with a broader body of advice-taking literature, which shows that reliance on external input can be perceived as a weakness rather than a strength.”

That’s despite the fact that the clinicians participating in the study rated generative AI as useful for ensuring the accuracy of a physician’s clinical assessment, both overall (mean 4.30) and when AI tools are customized for their own institution (mean 4.96).

“These findings suggest that while clinicians see GenAI as helpful, its use can negatively impact peer evaluations,” Yang et al write.

And that’s not necessarily fair, Wolf said. “With AI and GenAI becoming more ubiquitous, I think that more people are going to be using it. And I think that our perspective and our approach to this will have to change.”

For a primary care physician in a rural area without many subspecialists nearby, for instance, generative AI could prove to be a very useful resource in certain cases, Wolf said.

There is, however, still a big gap between development of AI tools and clinical implementation, she noted. “But I think that’s going to change over the coming years,” she said. “The way we perceive this and the way that we implement this has to be very thoughtful. But we also have to start to really engage with AI and also train the next generation of physicians on how to safely and effectively use AI, because I think it’s here to stay.”

Todd Neale is the Associate News Editor for TCTMD and a Senior Medical Journalist.
Disclosures
  • The study was supported by a Johns Hopkins Discovery Award.
  • Wolf reports research support from Novo Nordisk, Lilly Diabetes, and Sanofi unrelated to the current work.