Is ChatGPT smarter than your Doctor?


Written by: Pippa Thackeray


This article grew out of a discussion within the Healf team, after a relative had used ChatGPT to analyse their blood results before seeing the doctor. The topic sparked some debate: is this the way forward?


Surely AI must have some limitations in terms of knowing what is best for our health? What about the security of sensitive medical data? As we accelerate towards AI dominance in the wellbeing world, it is pertinent to consider all sides of the debate and to be prudent about what comes next: is this technological leap going to be our saviour, or are we walking a very thin line?

It’s all about context


Do a quick search online about ChatGPT in healthcare and many expert opinions will come up. Broadly, these opinions centre on the context in which ChatGPT is used and who should have access to it, especially when a person’s health, or even their life, is at stake.

“Holy cow! This is gonna change medicine…”

Dr. Harvey Castro, Author and Medical Media Expert

Dr. Harvey was working in an emergency room when he observed nurses taking far too long to manually access information that could save a patient’s life. Of course, these nurses were doing the best they could with the resources they had, and this was no criticism of their professionalism or training. Yet in a pressurised, time-sensitive situation like this, he realised AI could be the way forward in mitigating the risk of human error and unnecessary delays: it could ultimately save lives.

The limitations of ChatGPT for public use

Once well acquainted with ChatGPT, Dr. Harvey set about researching its potential uses in realistic terms. Although he marvels at the possibilities of ChatGPT, he says its big problem is that when it doesn’t know the answer, it will hallucinate and make one up that is not grounded in truth or evidence.


Dr. Harvey also points out another downside of individuals using ChatGPT as a medical reference outside a professional context: ChatGPT was trained on the internet and on books aimed at the general public, not on specialist medical knowledge or peer-reviewed publications such as medical journals. His case for using AI in a professional context, however, differs in that the AI he has developed has been trained on reputable sources, further reinforced by the opinions of practising doctors, adding a human element to reduce the risk of unhelpful outcomes.


He says the scope for developing new language models for various medical uses is vast, and he illustrates this with the example of a language model built for the emergency room (ER): developed with real doctors and ER data, it would be superior to generic ChatGPT.

Language models currently used in healthcare

Creating better conversations with your doctor: Dr. Harvey highlights the value of chat functions here; you can use ChatGPT personally, before your appointment, to work out which questions to ask the practitioner, for example when going over your hypertension results. The argument for uses like this includes the very short amount of time patients get with their doctor: in the UK the average GP visit lasts under 10 minutes, and in the US, under 13 minutes.


Keeping medical doctors in the loop: The medical world is extremely fast-paced, so it is a real challenge to ensure that every doctor is on the same page. With medical knowledge now growing at an incredible pace (doubling every 73 days), clinicians are finding it harder than ever to keep up with the flood of new information being published.

Reducing the administrative burden on medical staff: AI tools have the potential to strip away layers of tedious paperwork, letting healthcare professionals focus their attention on other important aspects of patient care. Many practices across the UK are already adopting ‘chatbot’ systems to triage patients and alleviate some of the stress placed on staff. On a more global scale, AI is being used to reduce no-show rates by addressing logistical issues on a very practical level, such as offering transportation or ‘telehealth options’.

The ethical debate of GPT use in healthcare

Building upon Dr. Harvey’s perspectives, a panel discussion from the UCSF Department of Medicine, titled “ChatGPT: Will It Transform the World of Health Care?”, unearths the realities of using advanced language models and their implications not only for our health, but for our societal expectations of AI.


Explaining slow AI adoption in healthcare


Before moving on to ChatGPT specifically, the discussion opens with the question of why AI adoption in healthcare has been so slow. Panellists attribute this to early attempts being too complex, and hence too high-risk, to run smoothly in clinical settings. As a result, AI models at present are mainly confined to repetitive tasks. One example of complex AI falling short is sepsis detection, which has proven challenging and is perhaps the “wrong” area for current AI technology to serve.


AI model limitations and risks of hallucination:


“Hallucination in large language models means exactly what it sounds like. So the model essentially fabricates information while remaining quite persuasive.” – Sara Murray, Associate Chief Medical Information Officer, UCSF Health


AI can confidently generate false information. As covered earlier in Dr. Harvey’s take, hallucination presents a very big problem, particularly if the language model is being used outside a professional body, such as an established medical organisation that has adapted the technology to optimise its use and is fully regulated on issues surrounding data.


This tendency to fabricate information was illustrated by the panel with a specific example of a response recommending inappropriate medication for insomnia. The takeaways were to always ensure supervised use and, of course, to refine the language model in question. For some, this raises the question: are applications such as ChatGPT ready to hold such responsibility over patient outcomes, and are they really saving resources if we must use them under near-total supervision?


Emerging tools and accuracy checks:

 

In response to concerns around such inaccuracies, tools are being piloted that auto-generate notes from conversations with patients, potentially serving as AI scribes in future. But it is worth remembering that these tools are still in their infancy, with issues around cost, inflexible “specialty-specific performance” and variable accuracy.

Ethical implications and bias in AI model usage

To tackle this issue of bias, institutions like UCSF could work with their diverse data to build fairer models that more accurately reflect the needs of varied patient groups. Still, keeping an eye on bias as these models are put into practice is absolutely essential.


On future AI usage more broadly, the panel concluded that there must be careful discernment over who this advanced technology is made available to. Panellists such as Atul Butte voiced their concerns about how medicine could still be practised with the best and purest intentions, with all its core values coming first:


“We shouldn’t just hand this responsibility off to companies… We also have to help these new learners overcome regulatory barriers. Our responsibility has to go into teaching ChatGPT 10 or whatever comes next about how to properly practise medicine” – Atul Butte, Director of the UCSF Bakar Computational Health Sciences Institute

“I’d give AI two scores, of 1 and 10 (out of 10). As I think it is simultaneously both the most exciting and terrifying thing to come into existence.” – Aaron Neinstein, Senior Director, UCSF Center for Digital Health Innovation


Is ChatGPT intelligent? Treating your wellbeing with the highest intelligence

At Healf, we aspire to always bring you the best the wellbeing industry has to offer, priding ourselves on providing only the very best brands and products. So you can be fully confident we’re educating and empowering you to make the best wellbeing decisions.

Introducing Healf Zone

We’re delivering the future of personalised wellbeing, with blood diagnostics, wearable tech, expert insights, and advanced AI technology all working in sync to give you a complete picture of your wellbeing that you can trust. 


•••

This article is for informational purposes only, even if and regardless of whether it features the advice of physicians and medical practitioners. This article is not, nor is it intended to be, a substitute for professional medical advice, diagnosis, or treatment and should never be relied upon for specific medical advice. The views expressed in this article are the views of the expert and do not necessarily represent the views of Healf.