Clinical Applications of Machine Learning for Diagnosis

From Clinfowiki


Machine learning is a collection of computer algorithms and techniques that allow computers to learn from data (Kirk, 2017).


Machine learning
Use of a computerized algorithm to determine associations between factors and an outcome
Artificial neural networks
A computing approach inspired by biological neural networks. It consists of a network of connected units that are analogous to neurons and synapses (Mukherjee, 2017; wikipedia:Artificial neural network).
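The idea of connected units can be illustrated with a minimal sketch of a forward pass through a tiny network with two inputs, two hidden units and one output unit. The weights, biases and the function name `forward` are hypothetical and hand-picked for illustration; a real network learns its weights from data.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Propagate inputs through one hidden layer to a single output unit."""
    # Each hidden unit sums its weighted inputs (the "synapses") and applies
    # a nonlinearity, loosely analogous to a neuron firing.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output unit combines the hidden activations in the same way.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical, hand-picked parameters (assumptions, not learned values).
HIDDEN_WEIGHTS = [[0.5, -0.4], [-0.3, 0.8]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [1.2, -0.7]
OUTPUT_BIAS = 0.05

score = forward([1.0, 2.0], HIDDEN_WEIGHTS, HIDDEN_BIASES, OUTPUT_WEIGHTS, OUTPUT_BIAS)
print(f"network output: {score:.3f}")
```

The output is a number between 0 and 1 that could be read as a class score. Training would consist of adjusting the weights so that this score matches known outcomes.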



Framingham Risk Score for coronary heart disease (Deo, 2015)

This is probably one of the oldest and best-known examples of machine learning in medicine. It used data from the original Framingham cohort and the Framingham Offspring Study and applied statistical analysis, including linear regression, logistic regression, and age adjustment, to arrive at a scoring mechanism for the prediction of coronary heart disease (Wilson et al., 1998).
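How a regression model turns risk factors into a risk estimate can be sketched as follows, assuming a plain logistic regression form. The coefficients, the intercept, the predictor names and the function `predicted_risk` are hypothetical and are not the published Framingham values.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the
# published Framingham values.
COEFFICIENTS = {"age_years": 0.05, "systolic_bp_mmhg": 0.02, "smoker": 0.6}
INTERCEPT = -7.0


def predicted_risk(patient):
    """Return a probability between 0 and 1 from a logistic model."""
    # Weighted sum of the risk factors plus the intercept (the linear predictor).
    linear = INTERCEPT + sum(COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS)
    # The logistic function maps the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-linear))


non_smoker = {"age_years": 55, "systolic_bp_mmhg": 140, "smoker": 0}
smoker = {"age_years": 55, "systolic_bp_mmhg": 140, "smoker": 1}

print(f"non-smoker: {predicted_risk(non_smoker):.3f}")
print(f"smoker:     {predicted_risk(smoker):.3f}")
```

In a fitted model the coefficients are estimated from cohort data. A point-based score can then be derived by rounding each factor's contribution to an integer.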


Diagnosis of pathological specimens is a skill that requires long training, and even then inter-observer agreement is at times low.
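Inter-observer agreement is often quantified with Cohen's kappa, which corrects the raw agreement rate for agreement expected by chance. The sketch below uses made-up labels from two hypothetical raters; the source does not state which agreement measure it has in mind, so kappa is an assumption here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same specimens."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of specimens")
    n = len(rater_a)
    # Fraction of specimens on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up example labels (assumptions, not study data).
pathologist_1 = ["benign", "malignant", "benign", "malignant"]
pathologist_2 = ["benign", "malignant", "malignant", "malignant"]

kappa = cohens_kappa(pathologist_1, pathologist_2)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and a kappa of 0 indicates agreement no better than chance.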

Google tested its deep learning tool Inception (also known as GoogLeNet) for the pathological detection of metastasized breast cancer. The trained model performed at least as well as trained pathologists in most regards. Higher sensitivity came at the cost of more false-positive results (Stumpe and Peng, 2017; Liu et al., 2017).


Esteva et al. (2017) trained a neural network on images of skin lesions together with their disease labels. The network performed similarly to a comparison group of 21 dermatologists in classifying keratinocyte carcinomas versus benign seborrheic keratoses and malignant melanomas versus benign nevi.


Gulshan et al. (2016) studied the performance of a neural network for the identification of diabetic retinopathy. The findings were compared against assessments by board-certified ophthalmologists. The algorithm achieved a sensitivity of 87-97.5% and a specificity of 93.4-98.5%, depending on the chosen operating point (i.e., the trade-off between sensitivity and specificity).
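The effect of the operating point can be sketched by applying two different thresholds to the same set of model scores. The scores, labels and the function name `sensitivity_specificity` are made up for illustration and are not data from the study.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Return (sensitivity, specificity) when scores >= threshold count as positive.

    Assumes the data contain at least one diseased and one healthy case.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up model scores and true labels (1 = disease present).
SCORES = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

low = sensitivity_specificity(SCORES, LABELS, threshold=0.25)
high = sensitivity_specificity(SCORES, LABELS, threshold=0.75)
print("threshold 0.25: sensitivity %.2f, specificity %.2f" % low)
print("threshold 0.75: sensitivity %.2f, specificity %.2f" % high)
```

With these made-up values, the low threshold catches every diseased case but flags half of the healthy ones, while the high threshold flags no healthy cases but misses half of the diseased ones.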


  • When machine learning is applied to known outcomes (supervised learning), there may be uncertainty and/or variation in the determination of those outcomes, even by specialists. One example is interobserver disagreement when classifying tumor cells (Svensson et al., 2015).
  • Some machine learning algorithms (e.g., artificial neural networks) do not reveal why certain features influence the prediction. They are commonly compared to a black box because their inner workings offer no insight into the rationale for a result (Mukherjee, 2017).
  • Overfitting occurs when a model fits random noise, which is part of any training dataset, rather than the underlying relationship. It generally arises when too many or irrelevant predictors are used in training, and it threatens the generalizability of the results. Overfitting typically becomes apparent when the algorithm is applied to a new dataset (Cook and Ranstam, 2016).
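The overfitting problem described in the last point can be sketched with a hand-made dataset. A model that memorizes the training data, including its noise, scores perfectly on that data but worse on new data than a simpler rule. The data, the assumed true rule and both models are hypothetical; the threshold of the simple rule is fixed by hand rather than learned.

```python
# Hand-made (x, label) pairs; the assumed true rule is label = 1 when x >= 5.
# Two training labels, at x = 3 and x = 7, are deliberately flipped as noise.
TRAIN = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
# New data that follows the assumed true rule without noise.
NEW = [(1.4, 0), (2.4, 0), (3.4, 0), (4.5, 0), (5.5, 1), (6.6, 1), (7.4, 1), (8.6, 1)]


def memorizer(x):
    """1-nearest-neighbour: copy the label of the closest training point."""
    _, label = min(TRAIN, key=lambda pair: abs(pair[0] - x))
    return label


def simple_rule(x):
    """A single-threshold rule that ignores the noisy training labels."""
    return 1 if x >= 5 else 0


def accuracy(model, data):
    """Fraction of examples for which the model predicts the given label."""
    return sum(model(x) == label for x, label in data) / len(data)


for name, model in [("memorizer", memorizer), ("simple rule", simple_rule)]:
    print(f"{name}: training accuracy {accuracy(model, TRAIN):.3f}, "
          f"new-data accuracy {accuracy(model, NEW):.3f}")
```

The memorizing model reproduces the flipped labels near x = 3 and x = 7, which is why its accuracy drops once it is applied to data it has not seen.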

See also


Cook JA, Ranstam J (2016). Overfitting. British Journal of Surgery;103(13):1814. DOI: 10.1002/bjs.10244

Deo RC (2015). Machine Learning in Medicine. Circulation;132:1920-1930.

Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017). Nature;542:115-118. DOI: 10.1038/nature21056.

Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanaswamy A, Venugopalan S, Widner K, Madams T, Cuadros J, Kim R, Raman R, Nelson PC, Mega JL, Webster DR (2016). Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs. JAMA;316(22):2402-2410. DOI: 10.1001/jama.2016.17216.

Kirk M (2017). Thoughtful Machine Learning with Python. Sebastopol, CA: O'Reilly Media, Inc.

Liu Y, Gadepalli K, Norouzi M, Dahl GE, Kohlberger T, Boyko A, Venugopalan S, Timofeev A, Nelson PQ, Corrado GS, Hipp JD, Peng L, Stumpe MC (2017). Detecting Cancer Metastases on Gigapixel Pathology Images. eprint arXiv:1703.02442.

Mukherjee S (2017). A.I. Versus M.D. What happens when diagnosis is automated?

Stumpe M, Peng L (2017). Assisting Pathologists in Detecting Cancer with Deep Learning.

Svensson CM, Hubler R, Figge MT (2015). Automated Classification of Circulating Tumor Cells and the Impact of Interobserver Variability on Classifier Training and Performance. J Immunol Res 2015:573165.

Wilson PWF, D'Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB (1998). Prediction of Coronary Heart Disease Using Risk Factor Categories. Circulation;97:1837-1847.

Submitted by Thomas Frohwein