
Artificial Intelligence and Why the Internet Always Says You Have Cancer

David Craig, MD
June 22, 2018
October 26, 2023

It’s become a running joke that online symptom checkers will always make a somber consideration of your health issues and then eventually tell you, no matter what, that you have cancer. Or maybe a heart attack or possibly a ruptured appendix or something equally grievous, but it’s definitely going to kill you. That is, if it’s not just a cold or a viral sinus infection or some gas pain.

So why are these self-evaluation tools always all over the place, and why do they keep whispering about cancer? Will advances in artificial intelligence ever improve the situation, or will the computers keep telling all of us forever that our stomachaches are actually second-trimester pregnancies? As a doctor who is also constantly immersed in health technology at Spruce, I’ve got a few thoughts.

And I think I can tell you why the internet keeps saying that you have cancer.

First, a doctor’s perspective on patients and symptom checkers

Before I describe why our current symptom checkers often fail, let me take a moment to say that I don’t actually dislike these tools, either in concept or in their present flawed form.

I hear about symptom checkers fairly regularly from my patients in the emergency department, and I generally like those visits. The subject typically comes up when I ask someone if there’s anything specific they’re worried about, and they hesitantly say that they used a tool on the web that told them to ask their doctor whether they might have colon cancer or something similar.

I say that I often like such visits, and that’s because they usually go one of two ways:

The concern is legitimate, and we really should look into it. I’ve had multiple savvy patients diagnose themselves correctly with appendicitis, for instance, and online symptom checkers have often helped them do so. That’s great!
The concern is extremely improbable, and I can easily reassure and educate the patient, sometimes with a basic supporting test or two for added peace of mind. Also a great outcome!

In my experience, very few people continue to cling to their internet findings if you take the time to explain the full medical picture to them, so I don’t always share the view that symptom checkers create work and annoyance for doctors. Furthermore, patients who are doing independent research on their problems are likely to be generally invested in their own health and well-being, and I love having patients who are motivated.

So, symptom checkers aren’t the devil, but they could probably be a little better. Let’s dive in and explore three major reasons that symptom checkers today fail so often.

1) Symptom checkers are scaredy-cat know-it-alls

Online symptom checkers sometimes remind me of second-year medical students. By the end of that year of med school, you’ve heard of most diseases and conditions, even the fantastically rare ones, and you have a fresh, encyclopedic knowledge of all of them. You’ve also learned how dangerous some of them can be, but what you don’t have is any real appreciation of just how uncommon most of them are.

The truth is that the vast majority of symptoms are very common, while the overwhelming bulk of diagnoses are, instead, quite rare. I hear about symptoms like headache, chest pain, nausea, and dizziness on every single shift, for example, but it would be a very strange day where each one of those ended up being the result of, respectively, brain hemorrhage, heart attack, bowel obstruction, and stroke.


Almost never is any given symptom so specific that it clearly identifies one particular disease; not everything can be as straightforward as silver-colored poop.¹ In fact, even a collection of symptoms isn’t usually enough. Is your constellation of cough, fever, and fatigue a cold or is it bacterial pneumonia? I can’t tell just based on that list, and neither can any online symptom checker.

We can sum up the crux of this issue in a few logical realizations:

Most symptoms are common, while bad diagnoses are rare (e.g., yes, your dizziness could be cancer…but it’s not)
Online symptom checkers have a deep, brainiac knowledge of all the bad possible diagnoses for any symptom (and cancer can cause basically any symptom you can think of)
The symptom checkers don’t want to miss warning you of bad possibilities, even if they’re rare (like…cancer)

So there you go: if you list a symptom, any symptom, I can say “cancer” and not be wrong, especially if I’m risk-averse and don’t want to miss any dangerous possibilities. That might not be useful information, though, if what you want to know is what’s actually going on and what you should do about it.

When symptom checkers have been studied in the academic literature, they show this same cautious behavior, over-advising immediate and emergency care even when serious disease seems unlikely.² Despite this, the same research also found that online checkers still regularly fail to identify many truly serious conditions, likely due to the phenomenally long list of possible diagnoses for symptoms like “cough” or “headache.”

Symptom checkers today have more knowledge than they have wisdom, and more fear than either of those. And that’s a major reason they always say that you have cancer:

Oh, uh, so blood when you wipe is either cancer or a virus that mostly affects babies. Got it. To a doctor, this kind of result isn’t much different than seeing something like this:

2) Symptom checkers don’t ask enough questions or make good use of the information that they do have

I fully understand that this is the internet, and people are looking for answers, not an inquisition, but symptom checkers just don’t ask enough questions or do enough analysis to be effective.

Doctors work their way to a diagnosis by starting with a long list of possibilities and then using additional information to give that list meaningful order. That extra information can come from lab tests, physical exam signs, or any number of other sources, but it most often comes from two very basic things:

Who the patient is and what they say is happening
General disease probabilities

The first item includes demographic data, such as age, along with past history and a detailed description of the current problem. The second item comprises known disease probabilities, against which specific patient data can be mapped. For the 41-year-old male above, who was told by the symptom checker to consider colon cancer first, this might include knowing that the yearly incidence of such cancer in American men aged 20–49 is only ~0.01%.³ Symptomatic hemorrhoids, on the other hand, are present in ~4.4% of the population, so maybe cancer isn’t actually the most likely diagnosis.⁴
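To make the base-rate argument concrete, here is a minimal Python sketch of how those two numbers could be combined with Bayes' rule. The two base rates come from the figures quoted above (a yearly incidence and a prevalence, treated here as rough priors even though the two measures are not strictly comparable). The likelihood values, the function name `rank_diagnoses`, and the restriction to only two candidate diagnoses are all assumptions made for illustration, not clinical data.

```python
def rank_diagnoses(priors, likelihoods):
    """Rank candidate diagnoses by posterior probability using Bayes' rule.

    priors: dict of diagnosis -> baseline probability in this kind of patient
    likelihoods: dict of diagnosis -> P(symptom | diagnosis)
    The candidates are treated as the only possibilities (a simplification).
    """
    weights = {dx: priors[dx] * likelihoods[dx] for dx in priors}
    total = sum(weights.values())
    posteriors = {dx: weight / total for dx, weight in weights.items()}
    return sorted(posteriors.items(), key=lambda item: item[1], reverse=True)


# Base rates quoted in the article, used as rough priors.
priors = {"colon cancer": 0.0001, "symptomatic hemorrhoids": 0.044}

# HYPOTHETICAL likelihoods of rectal bleeding given each condition,
# deliberately tilted toward cancer to show that the base rate still dominates.
likelihoods = {"colon cancer": 0.9, "symptomatic hemorrhoids": 0.3}

ranked = rank_diagnoses(priors, likelihoods)
for diagnosis, probability in ranked:
    print(f"{diagnosis}: {probability:.1%}")
```

Even with the likelihoods tilted heavily toward cancer, hemorrhoids come out at roughly 99% in this toy comparison, because the base rate differs by a factor of several hundred. A real tool would need many more candidate diagnoses and properly sourced probabilities.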

This type of extra information and contextual assessment is how a doctor can take a complaint of “cough” and turn it into “cough in a 65-year-old man with a history of heart attacks, complicated by shortness of breath that is much worse when lying flat.” A description like that makes you consider serious possibilities, such as heart failure, and there’s just no way to get there without asking for something beyond “cough.”

Symptom checkers, however, don’t typically ask more than a few basic questions. There is also evidence that they don’t make good use of the information that they do elicit, as their performance doesn’t necessarily improve even when they gather information as objectively useful as demographic data.²

Importantly, doctors can sometimes get by in person without asking a large number of questions, but that’s because we’re being sneaky. In reality, we gather a huge amount of information without asking for it directly, and this often lets us skip a lot of the interrogation. Everything is important, from your past medical records to the way you look sitting in the exam room (you walked in, you’re breathing easily, your general strength is good, etc.), and all of it helps us navigate our initial list of diagnostic possibilities.

Symptom checkers don’t have the luxury of such in-person sleight of hand. They can’t know anything that you don’t tell them explicitly, so when they skip the questions, they’re guaranteed to fail.

3) Symptom checkers don’t ask questions the right way

I’m hitting this point third, but it may be the most important. Learning how to ask questions correctly in medicine is actually a very deep skill that takes a lot of practice, and symptom checkers currently do a terrible job of it. Behold:

Why is this symptom checker putting brain hemorrhage and stroke at the top of the possibility list for a young man with a headache? I can tell you exactly why. After the man clicked “headache,” the symptom checker provided a few checkboxes to further describe the symptom, and one of them was “sudden and severe.” While headaches that are truly abrupt and horrific can indeed suggest something like a brain hemorrhage, many people with garden-variety headaches will also describe them this way if you give them the direct option of doing so. The problem, then, is not the assessment of the information provided; the problem is that the information itself is incorrect because of how it was gathered.

There’s a principle in computing called “garbage in, garbage out,” which asserts that no algorithm can give useful results if you provide it only “garbage” input data. This is equally true in medicine because diagnosis is an algorithmic process, no matter whether it’s being done by a literal symptom-checker algorithm or by the fuzzier and more complex brain of a human doctor. In order for symptom checkers to have a fighting chance, they must collect correct data, and that can only happen if they ask questions the right way.

As an example, every medical student learns to ask whether a patient’s headache is “the worst of their life,” and about 90% of the time they will hear back that it is. It’s not that patients are lying; it’s that there is a predictable human tendency to answer questions in the affirmative, especially if the question is leading. A simple rephrase to “When was the last time you had a headache this bad?” poses the inquiry in a more productive way, now prompting people to search for prior episodes that were similar or worse. When a headache really is notably the worst of a patient’s life, they’ll still tell you, but in the meantime, you can avoid a huge number of wild goose chases full of head CTs and spinal taps.
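The cost of a leading question can be put in rough numbers with the standard positive predictive value formula. The sketch below is only an illustration: it reads the "about 90%" figure above as the share of patients with ordinary headaches who answer yes, and the prevalence, the sensitivity, and the rephrased question's performance are hypothetical values chosen to show the effect.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a 'yes' answer reflects a truly dangerous cause."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


PREVALENCE = 0.01   # HYPOTHETICAL: 1% of headache patients have a dangerous cause
SENSITIVITY = 0.95  # HYPOTHETICAL: most truly dangerous headaches get a "yes"

# Leading question: if ~90% of benign headaches also get a "yes",
# only ~10% of them are correctly screened out.
leading = positive_predictive_value(SENSITIVITY, 0.10, PREVALENCE)

# Rephrased question: HYPOTHETICAL 90% of benign headaches screened out.
rephrased = positive_predictive_value(SENSITIVITY, 0.90, PREVALENCE)

print(f"Leading question PPV:   {leading:.1%}")
print(f"Rephrased question PPV: {rephrased:.1%}")
```

Under these assumptions, a "yes" to the leading question means a dangerous cause about 1% of the time, which is barely above the starting prevalence, while a "yes" to the rephrased question means one closer to 9% of the time. The exact figures are invented, but the direction of the effect is the point: a question almost everyone answers "yes" to carries almost no information.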

There are many such subtle lessons to learn in medical interviewing. When you ask whether a person smokes, for instance, you should also ask whether they ever did in the past. Otherwise, people will tell you (honestly) that they don’t smoke, even if they just quit a month ago after going through two packs a day for the 30 years before that. It’s a learning process for every clinician, and the only way to get better is to practice and improve your questions based on the results that they get.

Interacting with symptom checkers, it’s clear that they are not taking a thoughtful approach to medical questioning, and this is undoubtedly limiting their accuracy. If you only collect garbage information from patients, you can only deliver them garbage diagnoses, no matter how good your algorithm is.

Artificial intelligence and the possible way forward

Despite the prior sections, I do actually like symptom checkers, and I believe that current technologies will make them much better in the near future. Building on the discussion above, here’s how I think that could happen.

Gather more information

Symptom checkers need access to more information. They can do this the old-fashioned way by asking more questions, or maybe the future will have symptom checkers that can seamlessly tie in with your known medical history and demographics so you don’t have to retype them constantly. Connected peripherals could also be useful in supplying physical exam data, such as temperature, heart rate, blood oxygen saturation, and other key aspects of most medical analyses.

This would all, of course, need strict privacy controls, but expanding the information available to diagnostic tools would be an easy way to make them more accurate.

Improve the quality of the information

There’s a huge opportunity for modern technologies to improve the quality of information that symptom checkers gather. I explained above that good medical interviewing takes practice, and machines are now capable of “practicing.” If a next-generation symptom checker could compare its question sets and diagnoses against actual results and information elicited by a human doctor, it could test different versions of its questions to see which ones yield the most accurate results.
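One simple way such "practice" might look is sketched below: score each wording of a question by how often the answer it elicits matches what a doctor later established for the same patient, then prefer the wording that agrees most often. The function `agreement_by_variant`, the variant names, and every row of data are hypothetical, and a real system would need far more data plus proper statistical testing.

```python
from collections import defaultdict


def agreement_by_variant(records):
    """Fraction of answers per question wording that match the doctor's finding.

    records: iterable of (variant_name, checker_answer, doctor_finding) tuples.
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for variant, checker_answer, doctor_finding in records:
        totals[variant] += 1
        if checker_answer == doctor_finding:
            matches[variant] += 1
    return {variant: matches[variant] / totals[variant] for variant in totals}


# HYPOTHETICAL data: each row pairs what the patient told the symptom checker
# ("is this an unusually severe headache?") with what a doctor later established.
records = [
    ("worst_of_life", True, False),
    ("worst_of_life", True, True),
    ("worst_of_life", True, False),
    ("worst_of_life", True, False),
    ("last_time_this_bad", False, False),
    ("last_time_this_bad", True, True),
    ("last_time_this_bad", False, False),
    ("last_time_this_bad", True, False),
]

scores = agreement_by_variant(records)
best_variant = max(scores, key=scores.get)
print(scores, best_variant)
```

In this made-up sample the leading wording agrees with the doctor a quarter of the time and the rephrased wording three quarters of the time, so the rephrased version would be kept for the next round of comparison.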

With machine learning and big data (sorry for the buzzwords), symptom checkers could eventually become better than humans at translating a patient’s narrative into actionable medical data for a diagnostic algorithm to process.

The real key to artificial intelligence in medicine is not perfecting the algorithms that turn symptom lists into diagnoses; we can do a pretty good job of that already if you have a doctor filtering the input information for the machine. The secret, instead, will be figuring out how to get a computer to turn a patient’s raw story into clean data. That’s the hard part of diagnosis for human physicians, and it’s likely to be the hard part for their silicon counterparts as well.

Use the information more effectively

Traditional medical studies looked for links between disease and basic factors like age and gender, while modern research has progressed to search for more unexpected risk factors, making use of large datasets and statistical analysis to ferret out subtle relationships.

Symptom checkers are in a great place to benefit from and contribute to such advances in information processing, as they collect and act on so much medical data. There may be unforeseen connections between past conditions, social factors, medications, geography, and any number of other things that symptom checkers have a front-row view of.

Future symptom checkers should make full use of the risk factor information and disease associations found in the medical literature, and they should employ data mining techniques on their own results to explore potential new linkages.

Report the results more usefully

Current symptom checkers have varied ways of communicating their results, and it’s worth considering which approaches might be the most productive. Seeing a list of possible diagnoses is interesting, but it could be more useful to have clear and correct triage advice that lets you know what you need to do (e.g., go to the emergency room) and how quickly you need to do it.

More nuanced insights into the actual possibility and potential severity of listed diagnoses would likely also be useful context. Maybe the evergreen “cancer” diagnosis would be less of a joke if it came with information explaining the low likelihood but possibly high danger.
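As a rough illustration of that kind of reporting, the Python sketch below pairs each possible diagnosis with both its estimated likelihood and its severity, and produces a single triage recommendation. Everything in it is an assumption made for the example: the `Finding` structure, the three-level severity scale, the 1% urgency cutoff, the wording of the advice, and the probabilities themselves.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    diagnosis: str
    probability: float  # estimated chance this is the cause (0 to 1)
    severity: int       # HYPOTHETICAL scale: 1 = minor, 3 = dangerous


def triage_advice(findings, urgent_risk=0.01):
    """Turn a list of findings into one plain-language recommendation.

    urgent_risk is a HYPOTHETICAL cutoff: if the combined probability of
    dangerous causes reaches it, advise emergency care.
    """
    dangerous = sum(f.probability for f in findings if f.severity == 3)
    if dangerous >= urgent_risk:
        return "Go to the emergency room now."
    return "See your regular doctor within a few days."


def describe(finding):
    """Report likelihood and danger together instead of a bare diagnosis."""
    danger = {1: "minor", 2: "moderate", 3: "dangerous"}[finding.severity]
    return (f"{finding.diagnosis}: about {finding.probability:.1%} likely, "
            f"{danger} if present")


# HYPOTHETICAL results for a single patient.
findings = [
    Finding("symptomatic hemorrhoids", 0.993, 1),
    Finding("colon cancer", 0.007, 3),
]

print(triage_advice(findings))
for finding in findings:
    print(describe(finding))
```

Here the cancer entry still appears, but as "about 0.7% likely, dangerous if present" alongside one clear next step, rather than as an alarming item at the top of an unexplained list.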

It’s important to note that today’s symptom checkers, at least in part, avoid reporting anything in a way that could be seen as providing medical advice, so that they don’t run afoul of laws governing the practice of medicine or regulations covering medical devices. However, it’s equally important to remember that delivering confusing or mealy-mouthed statements can do more harm than good, even if they might be legal and not factually incorrect.

The internet will eventually stop saying that everything is cancer

It may take some time, but modern approaches to artificial intelligence will eventually lead to symptom checkers that function more like doctors and less like your hypochondriac friend who just read a scary article online. Large result sets, structured electronic medical records, data mining, machine learning, and other techniques clearly have the potential to get us there; it’s just a matter of time.

The good news for now, though, is that it’s pretty easy to understand why current symptom checkers always say it’s cancer, and also why they’re almost always wrong.

References: