Clinical Applications of Machine Learning for Diagnosis
Machine learning is a collection of computer algorithms and techniques that allow computers to learn from data (Kirk, 2017).
- Machine learning
- Use of a computerized algorithm to determine associations between factors and an outcome
- Artificial neural networks
- A computing approach inspired by biological neural networks. It is a network of connected units, similar to neurons and synapses (Mukherjee, 2017; wikipedia:Artificial neural network).
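The idea of connected units can be illustrated with a minimal sketch of a forward pass through a very small network. The layer sizes, weights, and function names below are hypothetical and chosen by hand for illustration; a real network would learn its weights from training data.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def unit_output(inputs, weights, bias):
    """One unit: weighted sum of its inputs (the 'synapses'), then an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def forward(inputs, hidden_layer, output_unit):
    """Pass the inputs through one hidden layer and a single output unit."""
    hidden = [unit_output(inputs, w, b) for w, b in hidden_layer]
    weights, bias = output_unit
    return unit_output(hidden, weights, bias)

# Hypothetical, hand-picked weights; a real network learns these from data.
HIDDEN_LAYER = [([0.5, -0.2], 0.1), ([-0.3, 0.8], 0.0)]
OUTPUT_UNIT = ([1.0, -1.0], 0.0)

score = forward([1.0, 2.0], HIDDEN_LAYER, OUTPUT_UNIT)
print(f"network output: {score:.3f}")
```

Each unit only combines the outputs of the units feeding into it, which is why the behaviour of the network as a whole is hard to read off from any single weight.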
Framingham Risk Score for coronary heart disease (Deo, 2015)
This is probably one of the oldest and best-known examples of machine learning in medicine. It used data from the original Framingham cohort and the Framingham Offspring Study and applied statistical analysis, including linear regression, logistic regression, and age adjustment, to arrive at a scoring mechanism for the prediction of coronary heart disease (Wilson et al., 1998).
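As a rough sketch of how a regression model can be turned into a risk score, the example below computes a predicted risk from a logistic model and rounds each factor's contribution to whole points. The intercept, coefficients, factor names, and point unit are all invented for illustration; they are not the published Framingham values, and the actual derivation in Wilson et al. (1998) may differ.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# published Framingham values (Wilson et al., 1998).
INTERCEPT = -7.0
COEFFICIENTS = {
    "age_years": 0.05,
    "systolic_bp_mmhg": 0.015,
    "smoker": 0.6,
    "diabetes": 0.5,
}

def predicted_risk(patient):
    """Logistic model: risk = 1 / (1 + exp(-(intercept + sum(coef * value))))."""
    linear = INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-linear))

def points(patient, unit=0.25):
    """Express each factor's contribution in whole points of size `unit`."""
    return {name: round(COEFFICIENTS[name] * patient[name] / unit) for name in COEFFICIENTS}

smoker = {"age_years": 60, "systolic_bp_mmhg": 140, "smoker": 1, "diabetes": 0}
non_smoker = dict(smoker, smoker=0)

print(f"risk (smoker):     {predicted_risk(smoker):.2f}")
print(f"risk (non-smoker): {predicted_risk(non_smoker):.2f}")
print("points:", points(smoker))
```

A point table of this kind trades some precision for a score that can be added up by hand at the bedside.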
Diagnosing pathological specimens is a skill that requires long training, and even then inter-observer agreement is at times low.
Google has tested its deep learning tool "Inception (aka GoogLeNet)" for the pathological detection of metastasized breast cancer. The trained model performed at least as well as trained pathologists in most regards. A trade-off between increasing sensitivity and allowing false-positive results was observed (Stumpe and Peng, 2017; Liu et al., 2017).
Esteva et al. report results of training a neural network on skin lesions with their disease labels. The neural network achieved a similar performance to a comparison group of 21 dermatologists for classification of keratinocyte carcinomas versus benign seborrheic keratoses and malignant melanomas versus benign nevi (Esteva et al., 2017).
Gulshan et al. (2016) studied the performance of a neural network for identification of diabetic retinopathy. The findings were compared against the assessment by board-certified ophthalmologists. The algorithm achieved a sensitivity of 87-97.5% and a specificity of 93.4-98.5%, depending on the chosen operating point (i.e., the trade-off between sensitivity and specificity).
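The effect of the operating point can be sketched with a few lines of code. The scores and labels below are made up for illustration and are unrelated to the data of Gulshan et al.; the function name is likewise hypothetical. Lowering the threshold catches more true cases (higher sensitivity) at the cost of more false positives (lower specificity).

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a case positive when its score is at or above the threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up model scores and reference labels (1 = disease present).
SCORES = [0.95, 0.80, 0.65, 0.40, 0.70, 0.35, 0.20, 0.10, 0.05, 0.30]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

for threshold in (0.3, 0.5, 0.75):
    sens, spec = sensitivity_specificity(SCORES, LABELS, threshold)
    print(f"threshold {threshold:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

On this toy data, the lowest threshold finds every diseased case but misclassifies half of the healthy ones, while the highest threshold produces no false positives but misses half of the diseased cases.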
- When machine learning is applied to known outcomes (supervised learning), there may be uncertainty and/or variation in the determination of those outcomes, even by specialists. One example is interobserver disagreement when classifying tumor cells (Svensson et al., 2015).
- Some machine learning algorithms (e.g. artificial neural networks) do not reveal why certain features influence the prediction. This is commonly compared to a black box because the inner workings offer no insight that would provide a rationale for the results (Mukherjee, 2017).
- Overfitting occurs when random noise (which is part of any training dataset) enters the prediction. It generally occurs when too many and/or irrelevant predictors are used in training, and it endangers the generalizability of machine learning results. Overfitting is typically identified when the algorithm is applied to a new dataset (Cook and Ranstam, 2016).
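The overfitting limitation above can be illustrated with a small, fully synthetic sketch. A model that memorizes the training data (here a 1-nearest-neighbour lookup, standing in for any overly flexible model) is compared with a simple one-threshold rule. The data, the flipped labels that play the role of noise, and all names are assumptions made for this example.

```python
def true_label(x):
    """The underlying relationship: positive when x is at least 0.5."""
    return 1 if x >= 0.5 else 0

def make_data(offset, flipped):
    """Twenty evenly spaced points; labels at the `flipped` indices act as noise."""
    xs = [i / 20 + offset for i in range(20)]
    ys = [true_label(x) for x in xs]
    for i in flipped:
        ys[i] = 1 - ys[i]
    return xs, ys

TRAIN_X, TRAIN_Y = make_data(0.0, flipped=[3, 6, 13, 16])
TEST_X, TEST_Y = make_data(0.02, flipped=[1, 12])  # new dataset, different noise

def accuracy(model, xs, ys):
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)

def memorizer(x):
    """1-nearest-neighbour: copy the label of the closest training point, noise included."""
    nearest = min(range(len(TRAIN_X)), key=lambda i: abs(TRAIN_X[i] - x))
    return TRAIN_Y[nearest]

def fit_threshold(xs, ys):
    """Pick the single cut-off with the best training accuracy."""
    return max(xs, key=lambda t: accuracy(lambda x: int(x >= t), xs, ys))

THRESHOLD = fit_threshold(TRAIN_X, TRAIN_Y)

def simple_rule(x):
    return int(x >= THRESHOLD)

for name, model in (("memorizer", memorizer), ("simple rule", simple_rule)):
    train_acc = accuracy(model, TRAIN_X, TRAIN_Y)
    test_acc = accuracy(model, TEST_X, TEST_Y)
    print(f"{name}: train accuracy {train_acc:.2f}, test accuracy {test_acc:.2f}")
```

The memorizer is perfect on the training data because it has also learned the noise, and it does worse than the simple rule on the new dataset. This gap between training and new-data performance is the typical sign of overfitting.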
Cook JA, Ranstam J (2016). Overfitting. British Journal of Surgery;103(13):1814. DOI: 10.1002/bjs.10244
Deo RC (2015). Machine Learning in Medicine. Circulation;132:1920-1930. DOI: 10.1161/CIRCULATIONAHA.115.001593. Retrieved from: http://circ.ahajournals.org/content/132/20/1920.long
Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature;542:115-118. DOI: 10.1038/nature21056. Retrieved from: https://www.nature.com/nature/journal/v542/n7639/full/nature21056.html
Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros J, Kim R, Raman R, Nelson PC, Mega JL, Webster DR (2016). Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA;316(22):2402-2410. DOI: 10.1001/jama.2016.17216. Retrieved from: https://jamanetwork.com/journals/jama/article-abstract/2588763
Kirk M (2017). Thoughtful Machine Learning with Python. Sebastopol, CA: O'Reilly Media, Inc.
Liu Y, Gadepalli K, Norouzi M, Dahl GE, Kohlberger T, Boyko A, Venugopalan S, Timofeev A, Nelson PQ, Corrado GS, Hipp JD, Peng L, Stumpe MC (2017). Detecting Cancer Metastases on Gigapixel Pathology Images. eprint arXiv:1703.02442. Retrieved from: https://arxiv.org/abs/1703.02442
Mukherjee S (2017). A.I. Versus M.D. What happens when diagnosis is automated? Retrieved from https://www.newyorker.com/magazine/2017/04/03/ai-versus-md
Stumpe M and Peng L (2017). Assisting Pathologists in Detecting Cancer with Deep Learning. Retrieved from https://research.googleblog.com/2017/03/assisting-pathologists-in-detecting.html
Svensson CM, Hubler R, Figge MT (2015). Automated Classification of Circulating Tumor Cells and the Impact of Interobserver Variability on Classifier Training and Performance. J Immunol Res 2015:573165.
Wilson PWF, D'Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB (1998). Prediction of Coronary Heart Disease Using Risk Factor Categories. Circulation;97:1837-1847. DOI: 10.1161/01.CIR.97.18.1837. Retrieved from: http://circ.ahajournals.org/content/97/18/1837.long
Submitted by Thomas Frohwein