Physicians Call Duplicative Training and Trivia Exams Unsustainable
Earlier coverage of AI oversight and its implications for CME providers.
Clinicians drew a clearer boundary this week: AI excels at summarization and pattern recognition, yet CME must deliberately preserve human teaching, empathy, and judgment.
This week's clinician conversation drew a sharper line between tools that surface information and education that changes judgment. AI can summarize patterns, but the examples below show that building confidence, reflection, and wisdom still requires deliberate human design in CME.
The most useful AI conversation this week was not about whether clinicians should use the tools. It was about what the tools leave untouched.
One clinician put the boundary plainly: “So far AI is great at pattern recognition and summarizing the known, but fails beyond knowledge acquisition which is part of the first two years of med school” (source). The same thread pointed to skills acquisition, motivation, confirming comprehension, emotional intelligence, and wisdom as areas where AI does not substitute for human teaching.
A JAMA podcast on AI in radiology made a similar distinction from a different angle: AI may help with detection, access, and simple imaging questions, but the human work shifts toward communication, interpretation, prognosis, and next steps. Separately, a hematology-oncology thread on LLMs and HSCT framed the tool as a way to decode a specialized field for non-experts, patients, and caregivers—not as a replacement for specialized judgment.
For CME providers, this changes the AI question. The issue is not simply whether AI-generated summaries are accurate enough to use. It is whether activities are preserving the moments where a clinician has to explain uncertainty, recognize a learner’s misconception, respond to emotion, or integrate context that is not reducible to a prompt.
The implication: use AI for compression and preparation, but do not let it remove the facilitated discussion, case debrief, peer comparison, or reflective prompt where professional judgment is actually formed.
The assessment signal came from a single ABFM/JCEHP podcast source, so it should not be treated as broad clinician consensus. Still, it is directly relevant to CME design because it shows how the same questions can produce different learning behavior depending on stakes and confidence prompts.
In the JCEHP Emerging Best Practices in CPD episode, the discussion centered on high- and low-stakes longitudinal assessments that asked physicians not only to answer questions, but to rate confidence before seeing feedback. The key design point is the “confidently wrong” learner: someone whose score alone does not reveal the risk, because the educational problem is misplaced certainty.
That matters because many CME experiences still treat assessment as a post-test artifact. The ABFM/JCEHP discussion suggests a more useful role: low-stakes confidence checks can expose blind spots, while higher-stakes conditions may increase attention, time on task, and resource use. The two formats do different work.
We saw a related pattern in an earlier brief on longitudinal assessments: certification-linked learning can reshape what clinicians expect from CME. This week’s added point is that confidence data may be as useful as correctness data when the goal is to guide learning, personalize feedback, or measure whether an activity changed more than recall.
The implication: if an activity only asks whether the learner got the item right, it may miss the learner who most needs intervention—the one who is wrong and certain.
This week’s signal is not that AI threatens CME or that every activity needs a new assessment engine. It is that automation and assessment both expose the same weakness in many learning products: they can move information quickly without revealing whether judgment changed. The better question for CME teams is simple: where are we using technology to make learning faster, when the harder work is making the learner stop, explain, reconsider, and know when they might be confidently wrong?
Independent clinician thread emphasizing AI's inability to provide motivational teaching or emotional support during learning encounters.
"There is so much more art to medicine that rote memorization fails to be a good clinician. So far AI is great at pattern recognition and summarizing the known, but fails beyond knowledge acquisition which is part of the first two years of med school."
Journal podcast episode contrasting AI pattern recognition strengths with the need for human oversight in wisdom and identity development.
Second clinician thread highlighting gaps in skills-acquisition support that AI cannot close.
"Extending our prior work on #LLMs in HemOnc ti #HSCT: collab with our @sitcancer NCI Immunotherapy Fellow @Elix_XYZ and with @GMIannantuonoMD @DBrackenClarke Hyoyoung Choo-Wosoba and @gulleyj1! @NCICCR_MOS"
ABFM/JCEHP podcast presents performance and confidence data showing high-stakes formats increase resource use and reflection while low-stakes formats reveal hidden gaps via confidence questioning.