Medicine’s AI Knowledge War Heats Up
The Battlegrounds May Surprise You
Many of today’s leading AI use cases – in areas such as clinical documentation, billing, and call centers – are designed to reduce administrative friction. As important as these uses are, the real potential for AI to transform healthcare lies in building effective decision support – helping doctors, as well as patients, make decisions that result in higher-quality, safer, and less expensive care without driving everybody crazy.
This means that, while many of today’s business skirmishes are between companies like Abridge, Ambience, and Nabla (AI scribe companies now broadening their offerings in the face of growing commoditization), the real war will be fought on a different battlefield: the one to become the AI decision support tool of choice.
The competition intensified last week with UpToDate’s announcement of a new AI-enabled feature. With this announcement, UpToDate demonstrated that it was not about to play dead and allow an upstart called OpenEvidence to dominate the field of AI clinical decision support. The battle will be fought over features, no doubt, but it will fundamentally be about a more intriguing question: What is the optimal source of knowledge for the practice of medicine? Here’s why.
A Little Background
The breakneck speed with which OpenEvidence supplanted UpToDate over the past two years to become the go-to resource for clinical decision support gave me an odd sense of déjà vu. I’ve seen this before, I thought. But where?
And then I remembered – it was about 25 years ago, when UpToDate did precisely the same thing to the textbooks that I grew up with in medical school.
Before the mid-1990s, everybody had their favorite textbook – every doctor was either a Harrison’s or a Cecil person. Then, seemingly overnight, the tomes (including one I edited, Hospital Medicine) began gathering dust on shelves everywhere, as a new kind of tool easily demonstrated its superiority in point-of-care medical knowledge retrieval.
Today, UpToDate remains an extraordinary resource, created by over 7,500 human experts charged with culling and interpreting the medical literature and guidelines to produce chapters on every conceivable clinical topic – and keeping the chapters, as the name says, up to date. From the time it emerged in the late 1990s until about 2023, it deservedly had what felt like an unshakeable position as the dominant point-of-care tool for health systems and clinicians.
And then, before you could say Clayton Christensen, OpenEvidence displaced UpToDate – particularly among doctors-in-training, often the vanguard of tech-driven change – because it could perform a trick that UpToDate couldn’t: take an entire clinical case, in all its staggering complexity, and produce an AI-generated “curbside consult” that was impressively accurate, context-specific, and, yes, up to date. Suddenly, UpToDate’s approach, which had seemed revolutionary a generation earlier, seemed stale.
Two Very Different Approaches
Particularly for those of you who aren’t clinicians, let me demonstrate what I mean. Here’s a relatively complex case I saw a couple of years ago when I was a visiting professor at Yale. I’ll present it briefly, as I might if I had run into my favorite hepatologist in the hospital cafeteria and asked her for a curbside consult:
“I’m caring for a 75-year-old man with Waldenstrom’s macroglobulinemia who came in with fever, hypoxia, and pulmonary infiltrates. We started him on Zosyn, and he rapidly developed liver dysfunction, with transaminases in the thousands but normal alk phos and bili. What do you think is going on?”
If that was Greek to you, here’s a lay explanation: the case involves an elderly gentleman with a rare, chronic blood disorder, an acute lung problem that is probably (but not definitely) a pneumonia, and now an unhappy liver with a distinctive pattern of blood tests.
“All happy families are alike; each unhappy family is unhappy in its own way,” reads Tolstoy’s first sentence in Anna Karenina. As with many patients, particularly those who are acutely ill, this one’s body was unhappy in its very own way. Accordingly, the answer won’t be found in a single textbook chapter.
When I entered my long prompt in UpToDate, it produced links to four of its chapters:
· Approach to the immunocompromised patient with fever and pulmonary infiltrates
· Epidemiology, pathogenesis, clinical manifestations, and diagnosis of Waldenstrom macroglobulinemia
· Sepsis in children: definition, clinical manifestations, and diagnosis
· Evaluation and management of fever in children and adults with sickle cell disease
The first two are off point – my question was about why this patient has a failing liver, not about what’s going on in his lungs or what his rare blood disorder is about. The last two links, to pediatric chapters, are bizarre, since my query clearly concerned a 75-year-old man.
Recognizing that I was asking UpToDate to do something it isn’t built to do (grapple with a complex case, rather than a narrowly framed problem), I tried deconstructing my query into a single focused question: “Can Zosyn cause liver injury?” This time, UpToDate took me to its chapter on Zosyn (an intravenous antibiotic), which was fine but didn’t answer my question. Several more links followed, most of them clinically irrelevant. The most on-point link was seventh on the list: a chapter on acute liver injury.
In contrast, when I put my original prompt into OpenEvidence, it “thought” for about 10 seconds and then gave me an impressive answer that considered the entire case presentation. First, it reviewed the most likely diagnoses, which it considered to be ischemic hepatitis (“shock liver”), drug-induced hepatocellular injury, and sepsis-associated liver injury. It then offered several other “Most Important Not to Miss Diagnoses,” including Tylenol overdose, Budd-Chiari syndrome (hepatic vein thrombosis), and fulminant viral hepatitis. These two lists mirror the way experienced clinicians approach clinical cases – we often create lists of both the most likely diagnoses and those that might be fatal if missed.
Unsurprisingly, the difference between OpenEvidence’s results and UpToDate’s is reminiscent of the difference between the links delivered by a traditional Google search and the results of a query on GPT, Gemini, Claude, or any modern general-purpose large language model. The new tools permit an input that reflects real-world complexity, and their output is a human-like synthesis, not a series of links. As with the iPhone, before you saw it, you didn’t know you needed it; afterwards, you don’t know how you operated without it.
Disruption: What Goes Around…
It’s always fascinating to see legacy businesses try to adapt to an upstart that threatens to upend their business models. In the late 1990s and early 2000s, as UpToDate was gaining momentum, the publishers of medical textbooks didn’t sit still. Within a couple of years, most of them created digital platforms that blended all their textbooks into a single searchable database. Using that platform approach (Elsevier’s version was called MD Consult, later rebranded into ClinicalKey; McGraw-Hill’s was AccessMedicine), the publishers remained viable, although their legacy textbooks are now used more for reference and teaching than for answering point-of-care clinical questions. That latter category has, until recently, been dominated by UpToDate.
The huge Dutch publishing house Wolters Kluwer took a different approach than Elsevier’s and McGraw-Hill’s, and probably the wiser one: in 2008, the conglomerate purchased UpToDate for an undisclosed sum rumored to be in the hundreds of millions of dollars. If you can’t beat them, buy them, I guess.
Thus my sense of déjà vu, as OpenEvidence is currently doing to UpToDate what UpToDate did to the textbook publishers. Based on its investments to date, OpenEvidence has been valued at $6 billion; one wonders whether that valuation was based partly on the assumption that UpToDate would sit back and allow its lunch to be eaten by its upstart rival.
If so, investors in OpenEvidence may be in for a rude awakening. In last week’s announcement, UpToDate said it would soon roll out its own AI-based tool, called UpToDate Expert AI. Note the careful branding, designed to highlight the fact that the UpToDate tool won’t scour the entire medical literature or the unfiltered internet for insights. Instead, UpToDate’s AI will draw its wisdom exclusively from its thousands of continuously updated chapters, written by experts.
What is the Optimal Source of Truth?
This raises a question that I can’t answer, at least not yet: Now that UpToDate is adding genAI capabilities, which tool will provide better results? Will it be tools like OpenEvidence, whose AI reviews the world’s medical literature via searches of journals and society guidelines, and then applies genAI to create answers and references? (Note that OpenEvidence’s process isn’t entirely devoid of human touch. As Daniel Nadler, founder and CEO of OpenEvidence, told me, OpenEvidence’s content is honed through “Reinforcement Learning from Human Feedback,” RLHF, a process in which humans – both hired staff and clinician end-users of the tool – assess and tune the AI-drafted answers, ultimately teaching the AI to give better responses.)
Or will UpToDate’s more human-crafted approach – in which the AI limits its source of truth to the chapters in the UpToDate dataset – produce better results? In announcing the new AI tool, Wolters Kluwer Chief Medical Officer Peter Bonis said of UpToDate’s content experts, “They understand the intersection of evidence, real-world patient care, the fact that there isn’t a randomized study for everything, and they have judgment.” He didn’t mention OpenEvidence, but he didn’t need to.
Finally, there’ll be another competitor, one that will approach decision support from a very different angle. Epic, the largest electronic health record (EHR) vendor, also announced a new set of AI tools last month. One of them, named “Art” for clinicians, is designed to review individual patients’ records as well as Epic’s database (“Cosmos”) of deidentified records on millions of patients cared for in health systems that use the company’s EHR.
Over time, one assumes Epic will be able to mine two sources of data unavailable to UpToDate, OpenEvidence, or any of the general large language models: a) the diagnostic and treatment strategies of tens of thousands of clinicians who use Epic’s EHR, and b) the clinical courses of millions of patients in Cosmos. Before long, Epic’s AI may be able to tell a clinician that “patients like yours did better on drug A than drug B” and “92 percent of doctors like you prescribed drug X in this situation.” Personally, I’m not confident these approaches will be supremely helpful, but I’m open to being surprised.
The Promise of EHR Integration
The AI tools being adopted in healthcare today – such as digital scribes, AI chart review, and prior authorization assistance – are helpful, but the real impact and savings will be in this clinical decision support, particularly once we figure out how to tightly integrate it into the EHR.
This last point is crucial. Currently, although UpToDate is a click away on most clinicians’ Epic, Athena, or Oracle desktops (increasingly, OpenEvidence is as well), these tools aren’t integrated into EHRs in any meaningful way. The clinician interested in the best treatment for high blood pressure or the proper tests to rule out lymphoma needs to leave their EHR workflow and conduct a search of UpToDate or OpenEvidence for the answer. And that search won’t know anything about the patient at hand beyond what the clinician types into the search box.
Imagine a world in which the clinician doesn’t need to enter, “This is a 75-year-old man with Waldenstrom’s who presents with fever and pulmonary infiltrates and develops elevated liver tests,” but rather one in which AI is “reading” the patient’s chart automatically, so that it “knows” all that contextual information. The AI tool might suggest – and perhaps even “tee up” – recommended diagnostic tests or treatments based on its review of the patient’s actual data.
It might even allow the clinician to ask very specific questions (“Review the blood pressure trends for this patient and tell me if shock liver is a possible explanation for the abnormal liver tests” or “Given when the patient received the first dose of Zosyn, what are the chances this is drug-induced liver injury?”) based on the patient’s data. Assuming the AI produces trustworthy insights and recommendations, this kind of EHR integration has the capacity to transform the practice of medicine. Of course, the source of the data it uses for the search will ultimately determine whether the results are, in fact, trustworthy.
In the chart below, I’ve summarized the approaches of different companies and the databases their AI systems will utilize to develop their answers.
Why Does This Matter?
The stakes couldn’t be higher – clinicians drive roughly 80% of healthcare spending, about $4 trillion annually in the US alone, while directly determining the quality and safety of care for every patient. The company that becomes the “Intel Inside” of clinical decision-making will wield enormous influence over both patient outcomes and healthcare economics at a massive scale.
OpenEvidence may have sprinted ahead early in the generative AI race, but UpToDate’s and Epic’s recent strategic moves signal that this competition will be fierce, high-stakes and, I believe, ultimately beneficial for patients, clinicians, and the broader healthcare system. It will be exciting to see which of these tools – or perhaps one we’ve not yet heard of – produces the best results.





Dr. Wachter - Thanks for sharing this analysis. As a non-clinical researcher, I appreciated seeing this from a provider’s perspective. I’m a huge fan of yours and incredibly grateful for all of the advice you shared about your own practices to avoid COVID during the pandemic. I work for the Christensen Institute, so I was thrilled to see your reference to Clay in your article. I also recently wrote a piece on OpenEvidence’s disruptive potential, as compared to UpToDate. I wrote it before UpToDate released its Expert AI tool, so that’s not included. However, I’d love your thoughts: https://open.substack.com/pub/annsomershogg/p/health-cares-next-game-changer-isnt
While I propose that OE might win for a different reason than being a true Disruptive Innovation, I agree with your assessment that it may follow a similar trajectory to UpToDate displacing textbooks. And that’s not necessarily because it’s disruptive, but instead because it nails the Job to Be Done.
There's no discussion given here to the possibility that foundation models will ever get good enough to do this job out of the box. I'm not so sure. They a) are getting better faster and b) have a lot of incentive to develop guardrails for "compliance"-heavy businesses like finance and healthcare. In Clayton Christensen terms, the risk to OE might be from "beneath," not "above."
But engaging on the question as posed:
It's extremely unlikely that UpToDate-based AI will be as good as OpenEvidence, as long as OE avoids the platform decay (aka "enshittification") that is the cardinal sin of two-sided attention marketplaces.
Both models will be generally trained LLMs with post-training/ RAG/ system prompts/ guardrails/ inference on top, with one drawing most strength from human-curated summaries, the other more directly from the literature/ trials/ guidelines.
[In other words, I'm willing to bet this will not be true: "UpToDate tool won’t scour the entire medical literature or the unfiltered internet for insights. Instead, UpToDate’s AI will draw its wisdom exclusively from its thousands of continuously updated chapters, written by experts."]
There is no way that those Paul Bunyan humans will be able to compete on scale and speed, and the most likely outcome is that what they paste into their updates will increasingly be OpenEvidence generated anyway.
The observationally trained model (Epic corpus) will also start with general knowledge (much as medical doctors start as adult humans with bachelor's degrees), but the interesting question will be whether the experience of real-world "good doctors" (not the average or "doctors like you") can be harnessed for greater acceptability/ feasibility realism, especially if it deviates from the evidence base to a significant degree ("I know the CHF GDMT calls for 3 meds, but go slow in elderly patients").
IMO the real question is not whether UpToDate or Epic can leverage their *content* but whether they will be able to leverage their *distribution* to pose a challenge to OE; how expensive / difficult it will be for them to replicate the exact model-building, guardrails, RAG, prompting approach that OE has taken; and how tainted OE will become from a pharma-ad driven business model.