What Should Be the Response for “Doctors Who Code” and Humanity’s Next Medical Exam

In their NEJM AI editorial, Gallifant and Bitterman remind us that only 40% of a physician’s shift is spent in direct patient contact. The rest — the invisible 60% — disappears into bureaucracy, fragmented data, coordination loops, and the friction of misaligned incentives. For those of us building or coding in medicine, that statistic should stop us cold.

It exposes a truth that many clinicians feel but rarely quantify: our burnout isn’t born from medicine itself, but from the administrative scaffolding built around it. And now artificial intelligence is moving into the gaps in that scaffolding, not only to answer clinical questions faster but to reimagine what counts as care.


The Hidden Exam Already in Progress

The authors call for Humanity’s Next Medical Exam, a framework to test not only what AI knows but how wisely it acts. How to measure that wisdom is a question that should haunt every doctor who codes.

For years, we’ve been training algorithms to pass medical multiple-choice tests, the same way our own education once measured competence. But medicine isn’t a quiz; it’s a dynamic conversation among uncertainty, empathy, and evidence. When AI models begin to outperform trainees, the exam itself must evolve, not toward higher scores, but toward deeper alignment with human values.

As the article puts it, the challenge has shifted from whether machines can know as much as or more than humans, to how we measure and govern their ability to act wisely and in alignment with human values.
That’s not a technical problem alone — it’s an ethical one, a systems one, and, increasingly, a personal one.


Coding for the 60%

For Doctors Who Code, this is the mission space: the 60% of the physician’s life that no stethoscope or standard EHR can heal.
We code not to replace bedside judgment, but to restore it — by reclaiming time, attention, and moral focus.

If AI can reason through a breast cancer case but can’t reason through the chaos of a patient’s chart, then it’s still a child prodigy lost in paperwork.
Our responsibility is to build systems that understand the human workflow, not just the human anatomy.

This means:

  • Creating interoperable data systems that align incentives around the patient, not the payer.
  • Designing ambient tools that listen first, summarizing and synthesizing without drowning clinicians in alerts.
  • Developing evaluation metrics that reward contextual wisdom — the ability of AI to handle ambiguity, ethics, and competing goals, not just clinical accuracy.
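
For those of us who think in code, the last point might be sketched as a simple weighted rubric. This is only an illustration under stated assumptions: the dimension names, the weights, and the `contextual_score` function are hypothetical, not taken from the editorial, and real ratings would have to come from clinician reviewers using a validated instrument.

```python
# Hypothetical rubric: dimension names and weights are illustrative
# assumptions, not a published or validated evaluation standard.
WEIGHTS = {
    "clinical_accuracy": 0.4,
    "handles_ambiguity": 0.2,
    "ethical_reasoning": 0.2,
    "balances_competing_goals": 0.2,
}


def contextual_score(ratings: dict[str, float]) -> float:
    """Return the weighted mean of per-dimension ratings, each in [0, 1]."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= ratings[name] <= 1.0:
            raise ValueError(f"rating for {name!r} must be between 0 and 1")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# A response that is factually right but ignores context entirely.
accurate_only = {
    "clinical_accuracy": 1.0,
    "handles_ambiguity": 0.0,
    "ethical_reasoning": 0.0,
    "balances_competing_goals": 0.0,
}

# A response that is slightly less precise but sound on every dimension.
balanced = {name: 0.8 for name in WEIGHTS}

print(contextual_score(accurate_only))
print(contextual_score(balanced))
```

Under these assumed weights, the accuracy-only response scores 0.4 while the balanced one scores roughly 0.8. A metric built this way stops rewarding a model for acing the quiz while ignoring the patient in front of it.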

Humanity’s Next Exam Is Ours Too

The future exam won’t be taken by machines alone — it will be taken by us.
Can we, as clinicians and coders, align technology with compassion?
Can we resist the pull toward efficiency at the cost of empathy?
Can we build systems that act wisely when no one is watching?

In the end, this “Next Medical Exam” is about trust — not blind trust in algorithms, but earned trust through transparency, reproducibility, and shared human oversight.
Every doctor who codes must study for this test — not by memorizing API calls, but by mastering what it means to keep machines accountable to humanity.

Because if machines are learning to act wisely, we must learn to build wisely too.


Written by Chukwuma Onyeije, MD, FACOG
Founder, CodeCraftMD & Doctors Who Code