Large language models

From Clinfowiki

Definition:

Large language models (LLMs) are deep learning models trained on vast amounts of publicly available, unlabeled text using self-supervised or semi-supervised learning; they can predict and generate text [1,2]. These models are based on the transformer architecture described by Vaswani et al. in their seminal 2017 paper “Attention Is All You Need” [3].
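
The central operation of the transformer architecture in [3] is scaled dot-product attention: each query is compared with a set of keys, the resulting scores are turned into weights with a softmax, and the output is the weighted average of the corresponding values. The following is only a minimal sketch of that single operation in plain Python. The vectors are made-up numbers rather than values from a trained model, and the function names are chosen here for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d_k = len(query)
    # Dot product of the query with every key, scaled by sqrt(d_k).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy 2-dimensional vectors (illustrative numbers only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
```

In this toy case the first and third keys match the query equally well, so they receive equal weight, and the second key receives less. A real transformer applies this operation to many queries at once, with learned projections and several attention heads.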

LLMs vs. traditional natural language processing (NLP):

Traditional NLP approaches rely on hand-crafted rules and task-specific algorithms to understand language. LLMs, in contrast, learn from vast amounts of text data and can accurately model complex linguistic patterns without such rules [2]. This ability enables LLMs to perform exceptionally well on several NLP tasks, such as text summarization, text classification, sentiment analysis, and question answering [4]. These models can be fine-tuned further for specific tasks or domains. For example, BioGPT by Microsoft is trained on biomedical literature, and Med-PaLM by Google is adapted for medical question answering, both with healthcare applications in mind [5,6].


Types of LLMs:

There are three types of LLMs based on the transformer architecture used [7]:

1. Autoregressive language models: These models predict the next word in a sequence based on the previous words. OpenAI’s ChatGPT, built on the GPT (Generative Pre-trained Transformer) family of models, is the most well-known example of this kind of model [8].

2. Autoencoding language models: These models predict missing words in a text by learning “contextual relations between words (or sub-words) in a text” [9]. Google’s BERT (Bidirectional Encoder Representations from Transformers) is an example of this kind of LLM [10].

3. Encoder-decoder language models: These models combine the two approaches above. Google’s T5 (Text-to-Text Transfer Transformer) is an example [11].
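
The difference between the first two training objectives can be sketched with a toy example. The code below is not how an LLM is built: it replaces the neural network with simple word counts, and the three sentences are invented for illustration. It only shows that next-word prediction uses the context to the left, while missing-word prediction uses the context on both sides of the gap.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration.
corpus = [
    "the patient has a fever",
    "the patient has a cough",
    "the nurse has a question",
]

next_word = defaultdict(Counter)    # previous word -> counts of the following word
masked_word = defaultdict(Counter)  # (left, right) neighbours -> counts of the middle word

for sentence in corpus:
    tokens = sentence.split()
    for prev, cur in zip(tokens, tokens[1:]):
        next_word[prev][cur] += 1
    for left, mid, right in zip(tokens, tokens[1:], tokens[2:]):
        masked_word[(left, right)][mid] += 1

def predict_next(prev):
    """Autoregressive-style prediction: uses only the left context."""
    return next_word[prev].most_common(1)[0][0]

def fill_mask(left, right):
    """Autoencoding-style prediction: uses context on both sides of the gap."""
    return masked_word[(left, right)].most_common(1)[0][0]

print(predict_next("patient"))    # has
print(fill_mask("has", "fever"))  # a
```

Real models condition on far longer contexts and operate on sub-word tokens, but the two objectives differ in the same way as these two functions.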


Potential applications and pitfalls of LLMs in healthcare:

Given their strong performance on many NLP tasks, LLMs have several potential applications in healthcare settings.

1. Medical documentation: GPT-4 has been shown to act as a medical scribe, transforming a transcript of a patient-physician conversation into a medical note [12]. This could improve provider productivity and efficiency, decrease screen time, and increase the time providers spend with their patients. However, medical scribes have been shown to have only a modest effect on physician productivity, and no concrete evidence exists that using one improves patient outcomes [13]. Furthermore, LLM hallucinations, in which the model outputs nonsensical or nonfactual information in confident language, can harm patient care. For any of these models to succeed in healthcare, it must therefore be easy for the provider to verify the information in the output.

2. Knowledge acquisition: LLMs such as ChatGPT and Med-PaLM have shown a strong ability to acquire medical knowledge and apply it to United States Medical Licensing Examination (USMLE) questions, with reported accuracy approaching 85% [6,14]. This could improve the efficacy of clinical decision support (CDS) when these models are integrated into electronic health records (EHRs) [15]. In addition, they could serve as a valuable point-of-care resource for patient care-related questions [12].

3. Text summarization: One of the most time-consuming tasks a clinician performs in the hospital is combing through troves of past records to form a mental picture of the patient's medical history and plan appropriate medical management. LLMs are well suited to this task and could save hours that can instead be spent on quality patient care. However, one should be aware of LLM hallucinations and of the importance of verifying the output, as described above. Another potential application is improving patients' understanding of their own medical records. The 21st Century Cures Act established patients’ access to medical documentation, but the medical jargon may be very difficult for an average patient to comprehend [16]. LLMs can summarize medical records for the patient in plain language, improving patient comprehension and, in turn, shared decision-making.

4. Clinical research: Over 80% of the data in the EHR is unstructured and hence not readily machine-readable. LLMs have shown great promise in extracting data from clinical documentation into a structured form [17]. LLMs can also assist with identifying potential research subjects, designing clinical trials, and summarizing research.
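
As a rough illustration of the extraction task in item 4, the sketch below asks a model for a JSON object with fixed keys and then parses the reply into a structured record. It is a minimal sketch under stated assumptions, not the method used in [17]: `call_llm` is a hypothetical stand-in that returns a canned reply instead of calling a real model, and the field names and prompt wording are invented here.

```python
import json

def build_prompt(note, fields):
    """Assemble an extraction prompt that asks for JSON with fixed keys."""
    keys = ", ".join(fields)
    return (
        "Extract the following fields from the clinical note and answer "
        f"with a single JSON object using exactly these keys: {keys}. "
        "Use null when a field is not mentioned.\n\n"
        f"Note: {note}"
    )

def call_llm(prompt):
    """Hypothetical stand-in for a model call; returns a canned reply."""
    return '{"medication": "metformin", "dose": "500 mg", "frequency": null}'

def extract(note, fields):
    """Return a dict with one entry per requested field, or None on bad output."""
    reply = call_llm(build_prompt(note, fields))
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None  # unparseable output is rejected rather than guessed at
    if not isinstance(data, dict):
        return None
    # Keep only the requested keys; fields the model omitted become None.
    return {field: data.get(field) for field in fields}

FIELDS = ["medication", "dose", "frequency"]
record = extract("Started metformin 500 mg for type 2 diabetes.", FIELDS)
```

Restricting the result to a fixed set of keys and rejecting replies that do not parse keeps the structured data predictable, but it does not check that the extracted values are correct.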

Despite the promise of LLMs in healthcare, current applications should be considered proofs of concept. Like any drug or device, technologies such as LLMs should undergo rigorous evaluation in real-life scenarios before implementation, as detailed in the roadmap by Wiens et al. for implementing machine learning technologies in healthcare [18].
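
One small part of such an evaluation is making model output easier to verify, a concern raised above for both documentation and summarization. The sketch below shows one crude check, assuming the source transcript and the generated note are both available as text: it flags numbers that appear in the note but not in the transcript. The example strings are invented, and a check like this is only an aid to the reviewing provider, not a substitute for review.

```python
import re

# Matches integers and decimals such as "400" or "38.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def unsupported_numbers(transcript, note):
    """Return numbers that appear in the note but not in the transcript."""
    seen = set(NUMBER.findall(transcript))
    return [n for n in NUMBER.findall(note) if n not in seen]

# Invented example text.
transcript = "Patient reports a temperature of 38.5 and has taken ibuprofen 400 mg twice."
note = "Temp 38.5. Took ibuprofen 600 mg x2."
flags = unsupported_numbers(transcript, note)
print(flags)  # ['600', '2']
```

The check catches the changed dose (600 instead of 400), but it also flags "2" because the transcript says "twice". Simple string matching over-flags in this way and will miss errors that involve no numbers, so it is useful only for directing a reviewer's attention.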


References:

1. Yang J, Jin H, Tang R, Han X, Feng Q, Jiang H, et al. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond [Internet]. arXiv; 2023 [cited 2023 May 2]. Available from: http://arxiv.org/abs/2304.13712

2. Radford A, Narasimhan K, Salimans T, Sutskever I. Improving Language Understanding by Generative Pre-Training. OpenAI; 2018.

3. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention Is All You Need [Internet]. arXiv; 2017 [cited 2023 May 1]. Available from: http://arxiv.org/abs/1706.03762

4. Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language Models are Few-Shot Learners [Internet]. arXiv; 2020 [cited 2023 May 2]. Available from: http://arxiv.org/abs/2005.14165

5. Luo R, Sun L, Xia Y, Qin T, Zhang S, Poon H, et al. BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining. Briefings in Bioinformatics. 2022 Nov 19;23(6):bbac409.

6. Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, et al. Large Language Models Encode Clinical Knowledge [Internet]. arXiv; 2022 [cited 2023 May 1]. Available from: http://arxiv.org/abs/2212.13138

7. Kumar A. Large language models: Concepts & Examples [Internet]. Data Analytics. 2023 [cited 2023 May 2]. Available from: https://vitalflux.com/large-language-models-concepts-examples/

8. Introducing ChatGPT [Internet]. [cited 2023 May 2]. Available from: https://openai.com/blog/chatgpt

9. Horev R. BERT Explained: State of the art language model for NLP [Internet]. Medium. 2018 [cited 2023 May 2]. Available from: https://towardsdatascience.com/bert-explained-state-of-the-art-language-model-for-nlp-f8b21a9b6270

10. Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [Internet]. arXiv; 2019 [cited 2023 May 2]. Available from: http://arxiv.org/abs/1810.04805

11. Exploring Transfer Learning with T5: the Text-To-Text Transfer Transformer [Internet]. 2020 [cited 2023 May 2]. Available from: https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html

12. Lee P, Bubeck S, Petro J. Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine. Drazen JM, Kohane IS, Leong TY, editors. N Engl J Med. 2023 Mar 30;388(13):1233–9.

13. Gottlieb M, Palter J, Westrick J, Peksa GD. Effect of Medical Scribes on Throughput, Revenue, and Patient and Provider Satisfaction: A Systematic Review and Meta-analysis. Annals of Emergency Medicine. 2021 Feb 1;77(2):180–9.

14. Kung TH, Cheatham M, Medenilla A, Sillos C, Leon LD, Elepaño C, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digital Health. 2023 Feb 9;2(2):e0000198.

15. Liu S, Wright AP, Patterson BL, Wanderer JP, Turer RW, Nelson SD, et al. Using AI-generated suggestions from ChatGPT to optimize clinical decision support. Journal of the American Medical Informatics Association. 2023 Apr 22;ocad072.

16. Provider Obligations For Patient Portals Under The 21st Century Cures Act | Health Affairs [Internet]. [cited 2023 May 2]. Available from: https://www.healthaffairs.org/do/10.1377/forefront.20220513.923426/

17. Agrawal M, Hegselmann S, Lang H, Kim Y, Sontag D. Large Language Models are Few-Shot Clinical Information Extractors.

18. Wiens J, Saria S, Sendak M, Ghassemi M, Liu VX, Doshi-Velez F, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med. 2019 Sep;25(9):1337–40.


Submitted by Krishna Siruguppa, MD