Machine Learning for Medical Imaging - PubMed

Review

Radiographics. 2017 Mar-Apr;37(2):505-515.

doi: 10.1148/rg.2017160130. Epub 2017 Feb 17.

Bradley J Erickson et al. Radiographics. 2017 Mar-Apr.

Abstract

Machine learning is a technique for recognizing patterns that can be applied to medical images. Although it is a powerful tool that can help in rendering medical diagnoses, it can be misapplied. Machine learning typically begins with the machine learning algorithm system computing the image features that are believed to be of importance in making the prediction or diagnosis of interest. The machine learning algorithm system then identifies the best combination of these image features for classifying the image or computing some metric for the given image region. There are several methods that can be used, each with different strengths and weaknesses. There are open-source versions of most of these machine learning methods that make them easy to try and apply to images. Several metrics for measuring the performance of an algorithm exist; however, one must be aware of the possible associated pitfalls that can result in misleading metrics. More recently, deep learning has started to be used; this method has the benefit that it does not require image feature identification and calculation as a first step; rather, features are identified as part of the learning process. Machine learning has been used in medical imaging and will have a greater influence in the future. Those working in medical imaging must be aware of how machine learning works. ©RSNA, 2017.
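As a minimal sketch of one way a performance metric can mislead, consider accuracy on an imbalanced data set. The abstract does not name specific pitfalls, so class imbalance is an assumed example here. The data set (5 malignant cases among 100) and the "classifier" that labels every case benign are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarize(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity from the counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical data set (an assumption for illustration): 5 malignant, 95 benign.
y_true = [1] * 5 + [0] * 95
# Degenerate "classifier" that calls every case benign.
y_pred = [0] * 100

metrics = summarize(y_true, y_pred)
print(metrics)
```

Under these assumptions the accuracy is high even though no malignant case is detected, which is why sensitivity and specificity are reported alongside it.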

Figures

Figure 1.

Machine learning model development and application model for medical image classification tasks. (a) For training, the machine learning algorithm system uses a set of input images to identify the image properties that, when used, will result in the correct classification of the image (that is, as depicting a benign or malignant tumor), as compared with the supplied labels for these input images. (b) For predicting, once the system has learned how to classify images, the learned model is applied to new images to assist radiologists in identifying the tumor type.
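The two phases in this caption can be sketched with a deliberately simple stand-in model. The nearest-centroid rule, the two-element feature vectors, and the function names `train` and `predict` are all assumptions made for illustration. The article does not prescribe this classifier.

```python
import math


def train(samples):
    """Training phase: learn one centroid per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Prediction phase: assign the label of the closest learned centroid."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical labeled feature vectors (not taken from the article).
training_set = [
    ((0.20, 0.10), "benign"),
    ((0.30, 0.20), "benign"),
    ((0.25, 0.15), "benign"),
    ((0.80, 0.70), "malignant"),
    ((0.70, 0.90), "malignant"),
    ((0.90, 0.80), "malignant"),
]

model = train(training_set)            # (a) training on labeled examples
new_case_1 = predict(model, (0.30, 0.10))   # (b) applying the model to new data
new_case_2 = predict(model, (0.75, 0.85))
print(new_case_1, new_case_2)
```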

Figure 2.

Diagrams illustrate under- and overfitting. Underfitting occurs when the fit is too simple to explain the variance in the data and does not capture the pattern. An appropriate fit captures the pattern but is not too inflexible or flexible to fit data. Overfitting occurs when the fit is too good to be true and there is possibly fitting to the noise in the data. The axes are generically labeled feature 1 and feature 2 to reflect the first two elements of the feature vector.
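A small numeric sketch can make the three cases concrete. It swaps the figure's two-feature classification setting for a one-dimensional regression, which is easier to check by hand. The training data (the line y = 2x with alternating +1/-1 noise) and the noise-free held-out points are assumptions chosen for illustration. The three models are a constant (too simple), a least-squares line (appropriate), and a memorizer that returns the value of the nearest training point (too flexible).

```python
def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_constant(xs, ys):
    """Underfit: ignore x and always predict the mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Appropriate fit: ordinary least-squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Overfit: reproduce the (noisy) value of the nearest training point."""
    def model(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return model


# Assumed training data: y = 2x with alternating +1 / -1 noise.
x_train = [0, 1, 2, 3, 4, 5]
y_train = [1, 1, 5, 5, 9, 9]
# Assumed held-out data: points on the noise-free line y = 2x.
x_test = [0.4, 1.4, 2.4, 3.4, 4.4]
y_test = [2 * x for x in x_test]

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(name, results[name])
```

The memorizer has zero error on the training points because it has fitted the noise, yet the line has the lowest error on the held-out points.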

Figure 3.

Example of a neural network. In this case, the input values (x1, x2, x3) are multiplied by a weight (w) and passed to the next layer of nodes. Although we show just a single weight, each such connection weight has a different numeric value, and it is these values that are updated as part of the learning process. Each node has an activation function (f) that computes its output (y) by using x and w as inputs. The last layer is the output layer. Those outputs are compared with the expected values (the training sample labels), and an error is calculated. The weight optimizer determines how to adjust the various weights in the network in order to achieve a lower error in the next iteration. Stochastic gradient descent (SGD) is one common way of updating the weights of the network. The network is considered to have completed learning when there is no substantial improvement in the error over prior iterations.
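The loop described in this caption can be sketched for the smallest possible case: a single node with three inputs. Everything specific here is an assumption for illustration, including the sigmoid activation, the cross-entropy error, the toy labels (the label equals the first input), the learning rate, and the stopping tolerance. A real network has many nodes and layers.

```python
import math
from itertools import product


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, x):
    """Node output y = f(sum_i w_i * x_i + b), with f the sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def mean_error(weights, bias, samples):
    """Mean cross-entropy between node outputs and the training labels."""
    total = 0.0
    for x, target in samples:
        y = forward(weights, bias, x)
        total -= target * math.log(y) + (1 - target) * math.log(1 - y)
    return total / len(samples)


def train_sgd(samples, learning_rate=0.5, tol=1e-5, max_epochs=5000):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    errors = [mean_error(weights, bias, samples)]
    for _ in range(max_epochs):
        for x, target in samples:
            # SGD: update the weights after every single training sample.
            delta = forward(weights, bias, x) - target  # cross-entropy gradient
            weights = [w - learning_rate * delta * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * delta
        errors.append(mean_error(weights, bias, samples))
        # Stop once the error no longer changes substantially between passes.
        if abs(errors[-2] - errors[-1]) < tol:
            break
    return weights, bias, errors


# Toy training set (an assumption): all binary inputs (x1, x2, x3), label = x1.
samples = [(x, x[0]) for x in product((0, 1), repeat=3)]

weights, bias, errors = train_sgd(samples)
predictions = [round(forward(weights, bias, x)) for x, _ in samples]
print(len(errors) - 1, "passes; final error", errors[-1])
```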

Figure 4.

Example of the _k_-nearest neighbors algorithm. The unknown object (?) would be assigned to the ◆ class on the basis of the nearest neighbor (k = 1), but it would be assigned to the × class if k were equal to 3, because two of the three closest neighbors are × class objects. Values plotted on the x and y axes are those for the two-element feature vector describing the example objects.
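The situation in this caption can be reproduced with a few lines of code. The coordinates below are hypothetical and were chosen only so that the single closest neighbor is a ◆ object while two of the three closest are × objects. The labels "diamond" and "cross" stand in for the two symbols.

```python
import math
from collections import Counter

# Hypothetical two-element feature vectors with their class labels.
training_points = [
    ((1.0, 0.0), "diamond"),
    ((0.0, 1.5), "cross"),
    ((-1.5, 0.5), "cross"),
    ((4.0, 4.0), "diamond"),
    ((5.0, 3.0), "diamond"),
    ((-4.0, -3.0), "cross"),
]


def knn_predict(points, query, k):
    """Assign the majority label among the k training points closest to query."""
    nearest = sorted(points, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


unknown = (0.0, 0.0)
label_k1 = knn_predict(training_points, unknown, k=1)
label_k3 = knn_predict(training_points, unknown, k=3)
print(label_k1, label_k3)
```

The same unknown object receives a different label depending on k, so k is a parameter that has to be chosen with care.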

Figure 5.

Example shows two classes (●, ○) that cannot be separated by using a linear function (left diagram). However, by applying a nonlinear function f(x), one can map the classes to a space where a plane can separate them (right diagram). This example is two dimensional, but support vector machines can have any dimensionality required. These machines generally are “well behaved,” meaning that for new examples that are similar, the classifier usually yields reasonable results. When the machine learning algorithm is successful, the two classes will be perfectly separated by the plane. In the real world, perfect separation is not possible, but the optimal plane that minimizes misclassifications can be found.
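The mapping idea can be illustrated without training a support vector machine. In the sketch below, the two classes lie on concentric rings, so no straight line in the original plane separates them. The map f(x) = (x1, x2, x1² + x2²) and the separating plane at height 5 are assumptions picked by hand. A support vector machine would instead learn the separating plane from the data.

```python
import math


def ring(radius, n=8):
    """Return n points evenly spaced on a circle of the given radius."""
    return [
        (radius * math.cos(2 * math.pi * i / n), radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def feature_map(point):
    """Nonlinear map f(x): add the squared distance from the origin as a third axis."""
    x1, x2 = point
    return (x1, x2, x1 * x1 + x2 * x2)


inner = ring(1.0)  # one class, surrounded by the other
outer = ring(3.0)  # the other class

# Hand-picked plane z = 5 in the mapped space (an assumption, not a learned value).
PLANE_HEIGHT = 5.0


def classify(point):
    """Classify by which side of the plane the mapped point falls on."""
    return "inner" if feature_map(point)[2] < PLANE_HEIGHT else "outer"


inner_ok = all(classify(p) == "inner" for p in inner)
outer_ok = all(classify(p) == "outer" for p in outer)
print(inner_ok, outer_ok)
```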
