Deep Learning in Medical Image Analysis

Annual Review of Biomedical Engineering

Vol. 19:221-248 (Volume publication date June 2017)
First published as a Review in Advance on March 9, 2017
https://doi.org/10.1146/annurev-bioeng-071516-044442

Dinggang Shen,1,2 Guorong Wu,1 and Heung-Il Suk2

1Department of Radiology, University of North Carolina, Chapel Hill, North Carolina 27599; email: [email protected]

2Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea; email: [email protected]


Abstract

This review covers computer-assisted analysis of images in the field of medical imaging. Recent advances in machine learning, especially with regard to deep learning, are helping to identify, classify, and quantify patterns in medical images. At the core of these advances is the ability to exploit hierarchical feature representations learned solely from data, instead of features designed by hand according to domain-specific knowledge. Deep learning is rapidly becoming the state of the art, leading to enhanced performance in various medical applications. We introduce the fundamentals of deep learning methods and review their successes in image registration, detection of anatomical and cellular structures, tissue segmentation, computer-aided disease diagnosis and prognosis, and so on. We conclude by discussing research issues and suggesting future directions for further improvement.

Keywords

medical image analysis, deep learning, unsupervised feature learning

1. INTRODUCTION

Over the past few decades, medical imaging techniques, such as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), mammography, ultrasound, and X-ray, have been used for the early detection, diagnosis, and treatment of diseases (1). In the clinic, medical image interpretation has been performed mostly by human experts such as radiologists and physicians. However, given wide variations in pathology and the potential fatigue of human experts, researchers and doctors have begun to benefit from computer-assisted interventions. Although the rate of progress in computational medical image analysis has not been as rapid as that in medical imaging technologies, the situation is improving with the introduction of machine learning techniques.

In applying machine learning, finding or learning informative features that well describe the regularities or patterns inherent in data plays a pivotal role in various tasks in medical image analysis. Conventionally, meaningful or task-related features were designed mostly by human experts on the basis of their knowledge about the target domains, making it challenging for nonexperts to exploit machine learning techniques for their own studies. In the meantime, there have been efforts to learn sparse representations based on predefined dictionaries, possibly learned from training samples. Sparse representation is motivated by the principle of parsimony in many areas of science; that is, the simplest explanation of a given observation should be preferred over more complicated ones. Sparsity-inducing penalization and dictionary learning have demonstrated the validity of this approach for feature representation and feature selection in medical image analysis (2–6). It should be noted that sparse representation or dictionary learning methods described in the literature still find informative patterns or regularities inherent in data with a shallow architecture, thus limiting their representational power. However, deep learning (7) has overcome this obstacle by incorporating the feature engineering step into a learning step. That is, instead of extracting features manually, deep learning requires only a set of data with minor preprocessing, if necessary, and then discovers the informative representations in a self-taught manner (8, 9). Therefore, the burden of feature engineering has shifted from humans to computers, allowing nonexperts in machine learning to effectively use deep learning for their own research and/or applications, especially in medical image analysis.

The unprecedented success of deep learning is due mostly to the following factors: (a) advances in high-tech central processing units (CPUs) and graphics processing units (GPUs), (b) the availability of a huge amount of data (i.e., big data), and (c) developments in learning algorithms (10–14). Technically, deep learning can be regarded as an improvement over conventional artificial neural networks (15) in that it enables the construction of networks with multiple (more than two) layers. Deep neural networks can discover hierarchical feature representations such that higher-level features can be derived from lower-level features (9). Because these techniques enable hierarchical feature representations to be learned solely from data, deep learning has achieved record-breaking performance in a variety of artificial intelligence applications (16–23) and grand challenges (24, 25; see https://grand-challenge.org). In particular, improvements in computer vision prompted the use of deep learning in medical image analysis, such as image segmentation (26, 27), image registration (28), image fusion (29), image annotation (30), computer-aided diagnosis (CADx) and prognosis (31–33), lesion/landmark detection (34–36), and microscopic image analysis (37, 38).

Deep learning methods are highly effective when the number of available samples during the training stage is large. For example, in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), more than one million annotated images were available (24). However, in most medical applications there are far fewer images (i.e., <1,000). Therefore, a primary challenge in applying deep learning to medical images is the limited number of training samples available to build deep models without suffering from overfitting. To overcome this challenge, research groups have devised various strategies, such as (a) taking either two-dimensional (2D) or three-dimensional (3D) image patches, rather than the full-sized images, as input (29, 39–45) in order to reduce input dimensionality and thus the number of model parameters; (b) expanding the data set by artificially generating samples via affine transformation (i.e., data augmentation), and then training their network from scratch with the augmented data set (39–42); (c) using deep models trained on a huge number of natural images in computer vision as “off-the-shelf” feature extractors, and then training the final classifier or output layer with the target-task samples (43, 45); (d) initializing model parameters with those of pretrained models from nonmedical or natural images, then fine-tuning the network parameters with the task-related samples (46, 47); and (e) using models trained with small-sized inputs for arbitrarily sized inputs by transforming weights in the fully connected layers into convolutional kernels (36, 48).
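
To make strategy (d) concrete, the sketch below fine-tunes a network pretrained on natural images for a hypothetical two-class medical task. It is only an illustration: the framework (PyTorch), the architecture (ResNet-18), and the choice of which layers to freeze are our assumptions, not details taken from the cited studies.

```python
# Hedged sketch of strategy (d): initialize from a model pretrained on natural
# images and fine-tune for a target medical task. The framework (PyTorch), the
# architecture (ResNet-18), and the two-class setup are illustrative assumptions,
# not the exact models used in the cited studies.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)        # weights learned on natural images

# Freeze the early, generic feature-extraction layers.
for param in model.parameters():
    param.requires_grad = False

# Replace the final classification layer for the target task (here, 2 classes).
model.fc = nn.Linear(model.fc.in_features, 2)   # this new layer remains trainable

optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def fine_tune_step(images, labels):
    """One supervised fine-tuning step on a minibatch of task-related samples."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```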

In terms of input types, we can categorize deep models as typical multilayer neural networks that take vector-format (i.e., nonstructured) values as input and convolutional networks that take 2D or 3D (i.e., structured) values as input. Because of the structural characteristics of images (the structural or configural information contained in neighboring pixels or voxels is another important source of information), convolutional neural networks (CNNs) have attracted great interest in the field of medical image analysis (26, 35–37, 48–50). However, networks with vectorized inputs have also been successfully used in different medical applications (28, 29, 31, 33, 51–54). Along with deep neural networks, deep generative models (55)—such as deep belief networks (DBNs) and deep Boltzmann machines (DBMs), which are probabilistic graphical models with multiple layers of hidden variables—have been successfully applied to brain disease diagnosis (29, 33, 47, 56), lesion segmentation (36, 49, 57, 58), cell segmentation (37, 38, 59, 60), image parsing (61–63), and tissue classification (26, 35, 48, 50).

This review is organized as follows. In Section 2, we explain the computational theories of neural networks and deep models [e.g., stacked auto-encoders (SAEs), DBNs, DBMs, CNNs] and discuss how they extract high-level representations from data. In Section 3, we introduce recent studies using deep models for different applications in medical imaging, including image registration, anatomy localization, lesion segmentation, detection of objects and cells, tissue segmentation, and computer-aided detection (CADe) and CADx. Finally, in Section 4 we conclude by summarizing research trends and suggesting directions for further improvements.

2. DEEP LEARNING

In this section, we explain the fundamental concepts of feed-forward neural networks and basic deep models in the literature. We focus on learning hierarchical feature representations from data. We also discuss how to efficiently learn parameters of deep architecture by reducing overfitting.

2.1. Feed-Forward Neural Networks

In machine learning, artificial neural networks are a family of models that mimic the structural elegance of the neural system and learn patterns inherent in observations. The perceptron (64) is the earliest trainable neural network; it has a single-layer architecture composed of an input layer and an output layer. A perceptron, or a modified perceptron with multiple output units (Figure 1a), is regarded as a linear model, which prohibits its application to tasks involving complicated data patterns, despite the use of nonlinear activation functions in the output layer.

Figure 1

This limitation can be overcome by introducing a so-called hidden layer between the input layer and the output layer. Note that in neural networks the units of the neighboring layers are fully connected to one another, but there are no connections among units in the same layer. For a two-layer neural network (Figure 1b), also known as a multilayer perceptron, given an input vector $\mathbf{v} = (v_1, \ldots, v_D)$, we can write the estimation function of an output unit $y_k$ as a composition function as follows:

$$ y_k = f^{(2)}\!\left( \sum_{j=1}^{M} W^{(2)}_{kj}\, f^{(1)}\!\left( \sum_{i=1}^{D} W^{(1)}_{ji}\, v_i + b^{(1)}_j \right) + b^{(2)}_k \right), \tag{1} $$

where the superscript denotes a layer index, $f^{(1)}(\cdot)$ and $f^{(2)}(\cdot)$ denote nonlinear activation functions of units at the specified layers, $M$ is the number of hidden units, and $\Theta = \{W^{(1)}, W^{(2)}, b^{(1)}, b^{(2)}\}$ is a parameter set. Conventionally, the hidden units' activation function, $f^{(1)}(\cdot)$, is defined with a sigmoidal function such as a logistic sigmoid function or a hyperbolic tangent function, whereas the output units' activation function, $f^{(2)}(\cdot)$, depends on the target task. Because the estimation proceeds in a forward direction, this type of network is also referred to as a feed-forward neural network.

When the hidden layer in Equation 1 is regarded as a feature extractor that computes features $\phi_j(\mathbf{v})$ from an input $\mathbf{v}$, the output layer is simply a linear model,

$$ y_k = f^{(2)}\!\left( \sum_{j=1}^{M} W^{(2)}_{kj}\, \phi_j(\mathbf{v}) + b^{(2)}_k \right), \tag{2} $$

where $\phi_j(\mathbf{v}) \equiv f^{(1)}\!\left(\sum_{i=1}^{D} W^{(1)}_{ji}\, v_i + b^{(1)}_j\right)$. The same interpretation holds when there are more hidden layers. Thus, it is intuitive that the role of hidden layers is to find features that are informative for the target task.
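
As a minimal illustration, the following NumPy sketch computes the forward pass of Equations 1 and 2; the logistic sigmoid hidden activation and the identity output activation are assumptions chosen only for concreteness.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(v, W1, b1, W2, b2):
    """Forward pass of a two-layer feed-forward network (Equations 1 and 2).

    v  : input vector of size D
    W1 : M x D hidden-layer weights,  b1 : M hidden biases
    W2 : K x M output-layer weights,  b2 : K output biases
    """
    phi = sigmoid(W1 @ v + b1)   # hidden features phi_j(v) of Equation 2
    y = W2 @ phi + b2            # output pre-activations; f^(2) is task dependent
    return y, phi

# Toy usage with random parameters: D = 4 inputs, M = 3 hidden units, K = 2 outputs.
rng = np.random.default_rng(0)
y, phi = forward(rng.normal(size=4),
                 rng.normal(size=(3, 4)), np.zeros(3),
                 rng.normal(size=(2, 3)), np.zeros(2))
```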

The practical use of neural networks requires that the model parameters Θ be learned from data. The problem of parameter learning can be formulated as the minimization of the error function. From an optimization perspective, the error function E for neural networks is highly nonlinear and nonconvex. Thus, there is no analytic solution of the parameter set Θ. Instead, one can use a gradient descent algorithm by updating the parameters iteratively. In order to utilize a gradient descent algorithm, there must be a way to compute a gradient ∇E(Θ) evaluated at the parameter set Θ.

For a feed-forward neural network, the gradient can be efficiently evaluated by means of error back-propagation (65). Once the gradient vector of all the layers is known, the parameters $\Theta = \{W^{(1)}, W^{(2)}, b^{(1)}, b^{(2)}\}$ can be updated as follows:

$$ \Theta^{(\tau+1)} = \Theta^{(\tau)} - \eta\, \nabla E\!\left(\Theta^{(\tau)}\right), \tag{3} $$

where η is a learning rate and τ denotes an iteration index. The update process is repeated until convergence or until the predefined number of iterations is reached. As for the parameter update in Equation 3, the stochastic gradient descent with a small subset of training samples, termed a minibatch, is commonly used in the literature (66).
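
The update of Equation 3 with minibatches can be sketched as follows; the gradient function, assumed to be supplied by the caller, stands in for error back-propagation.

```python
import numpy as np

def sgd_train(params, grad_fn, data, labels, lr=0.01, batch_size=32, epochs=10):
    """Minibatch stochastic gradient descent, i.e., the update of Equation 3.

    params  : dict of parameter arrays, e.g., {"W1": ..., "b1": ..., ...}
    grad_fn : grad_fn(params, x_batch, y_batch) -> dict of gradients of E,
              assumed to be computed with error back-propagation
    """
    n = data.shape[0]
    for _ in range(epochs):
        order = np.random.permutation(n)              # shuffle samples each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]     # one minibatch
            grads = grad_fn(params, data[idx], labels[idx])
            for name in params:                       # Theta <- Theta - eta * grad E
                params[name] -= lr * grads[name]
    return params
```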

2.2. Deep Models

Under a mild assumption on the activation function, a two-layer neural network with a finite number of hidden units can approximate any continuous function (67); therefore, it is regarded as a universal approximator. However, it is also possible to approximate complex functions to the same accuracy by using a deep architecture (i.e., one with more than two layers) with far fewer units (8). Thus, a deep architecture can reduce the number of trainable parameters, enabling training with a relatively small data set (68).

2.3. Unsupervised Feature Representation Learning

Compared with shallow architectures that require a good feature extractor designed mostly by hand on the basis of expert knowledge, deep models are useful for discovering informative features from data in a hierarchical manner (i.e., from fine to abstract). Here, we introduce three deep models that are widely used in different applications for unsupervised feature representation learning.

2.3.1. Stacked auto-encoder.
An auto-encoder or auto-associator (69) is a special type of two-layer neural network that learns a latent or compressed representation of the input by minimizing the reconstruction error between the input and output values of the network, namely the reconstruction of the input from the learned representations. Because of its simple, shallow structure, a single-layer auto-encoder's representational power is very limited. But when multiple auto-encoders are stacked (Figure 2a) in a configuration called an SAE, one can significantly improve the representational power by using the activation values of the hidden units of one auto-encoder as the input to the next higher auto-encoder (70). One of the most important characteristics of SAEs is their ability to learn or discover highly nonlinear and complicated patterns, such as the relations among input values. When an input vector is presented to an SAE, the different layers of the network represent different levels of information. That is, the lower the layer in the network is, the simpler the patterns are, and the higher the layer is, the more complicated or abstract the patterns inherent in the input vector are.
Figure 2

With regard to training the parameters (the weight matrices and the biases) of an SAE, a straightforward approach is to apply a gradient-based optimization technique with back-propagation, starting from random initialization and treating the SAE as a conventional feed-forward neural network. Unfortunately, deep networks trained in this manner perform worse than networks with a shallow architecture, as they fall into a poor local optimum (71). To circumvent this problem, one should consider greedy layer-wise learning (10, 72). The key idea of greedy layer-wise learning is to pretrain one layer at a time: The user trains the parameters of the first hidden layer with the training data as input, then trains the parameters of the second hidden layer with the output from the first hidden layer as input, and so on. In other words, the representation of the lth hidden layer is used as input for the (l+1)-th hidden layer. An important advantage of such a pretraining technique is that it is conducted in an unsupervised manner with a standard back-propagation algorithm, enabling the user to increase the size of the data set by exploiting unlabeled samples for training.
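
The following NumPy sketch illustrates greedy layer-wise pretraining under simplifying assumptions: each auto-encoder uses tied weights, a sigmoid activation, inputs scaled to [0, 1], and a few plain gradient steps on the squared reconstruction error.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=50, seed=0):
    """Train one auto-encoder (tied weights) on X (n_samples x n_visible),
    minimizing the squared reconstruction error; X is assumed to lie in [0, 1]."""
    rng = np.random.default_rng(seed)
    n_visible = X.shape[1]
    W = rng.normal(scale=0.01, size=(n_hidden, n_visible))
    b = np.zeros(n_hidden)                  # encoder bias
    c = np.zeros(n_visible)                 # decoder bias
    for _ in range(epochs):
        H = sigmoid(X @ W.T + b)            # encode
        Z = sigmoid(H @ W + c)              # decode (reconstruction)
        dZ = (Z - X) * Z * (1 - Z)          # error signal at the decoder output
        dH = (dZ @ W.T) * H * (1 - H)       # error signal at the hidden layer
        W -= lr * (dH.T @ X + H.T @ dZ) / X.shape[0]   # tied-weight gradient
        b -= lr * dH.mean(axis=0)
        c -= lr * dZ.mean(axis=0)
    return W, b

def pretrain_sae(X, layer_sizes):
    """Greedy layer-wise pretraining: the hidden activations of layer l become
    the training data for layer l + 1."""
    weights, inputs = [], X
    for n_hidden in layer_sizes:
        W, b = train_autoencoder(inputs, n_hidden)
        weights.append((W, b))
        inputs = sigmoid(inputs @ W.T + b)  # representation fed to the next layer
    return weights
```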

2.3.2. Deep belief network.
A restricted Boltzmann machine (RBM) (73) is a single-layer undirected graphical model with a visible layer and a hidden layer. It assumes symmetric connectivities between the visible and hidden layers, but no connections among units within the same layer. Because of the symmetry of the connectivities, it can generate input observations from hidden representations. Therefore, an RBM naturally becomes an auto-encoder (10, 73), and its parameters are usually trained by use of a contrastive divergence algorithm (74) so as to maximize the log likelihood of observations. Like SAEs, RBMs can be stacked in order to construct a deep architecture, resulting in a single probabilistic model called a DBN. A DBN has one visible layer $\mathbf{v}$ and a series of hidden layers $\mathbf{h}^{(1)}, \ldots, \mathbf{h}^{(L)}$ (Figure 2b). Note that when multiple RBMs are stacked hierarchically, although the top two layers still form an undirected generative model (i.e., an RBM), the lower layers form directed generative models. Thus, the joint distribution of the observed units $\mathbf{v}$ and the $L$ hidden layers $\mathbf{h}^{(l)}$ ($l = 1, \ldots, L$) in a DBN is

$$ P\!\left(\mathbf{v}, \mathbf{h}^{(1)}, \ldots, \mathbf{h}^{(L)}\right) = \left( \prod_{l=0}^{L-2} P\!\left(\mathbf{h}^{(l)} \mid \mathbf{h}^{(l+1)}\right) \right) P\!\left(\mathbf{h}^{(L-1)}, \mathbf{h}^{(L)}\right), \quad \mathbf{v} = \mathbf{h}^{(0)}, \tag{4} $$

where $P(\mathbf{h}^{(l)} \mid \mathbf{h}^{(l+1)})$ corresponds to a conditional distribution for the units of layer $l$ given the units of layer $l+1$, and $P(\mathbf{h}^{(L-1)}, \mathbf{h}^{(L)})$ denotes the joint distribution of the units in layers $L-1$ and $L$.

Regarding the learning of parameters, the greedy layer-wise pretraining scheme (10) can be applied in the following steps.

1. Train the first layer as an RBM with $\mathbf{v} = \mathbf{h}^{(0)}$.

2. Use the first hidden layer to obtain a representation of the inputs, either the mean activations of $P(\mathbf{h}^{(1)} = 1 \mid \mathbf{h}^{(0)})$ or samples drawn according to $P(\mathbf{h}^{(1)} \mid \mathbf{h}^{(0)})$, which will be used as observations for the second hidden layer.

3. Train the second hidden layer as an RBM, taking the transformed data (mean activations or samples) as training examples (for the visible layer of the RBM).

4. Iterate steps 2 and 3 for the desired number of layers, each time propagating upward either the mean activations $P(\mathbf{h}^{(l+1)} = 1 \mid \mathbf{h}^{(l)})$ or samples drawn according to the conditional probability $P(\mathbf{h}^{(l+1)} \mid \mathbf{h}^{(l)})$.

After the greedy layer-wise training procedure is complete, one can apply the wake–sleep algorithm (75) to further increase the log likelihood of the observations. Usually, however, no further procedure is conducted to train the whole DBN jointly in practice.
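
A compact sketch of this procedure is given below; each RBM is trained with one-step contrastive divergence (CD-1), a simplified version of the algorithm in Reference 74, and mean activations are propagated upward as the data for the next layer.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_rbm_cd1(V, n_hidden, lr=0.05, epochs=30, seed=0):
    """Train a binary RBM on V (n_samples x n_visible) with one-step contrastive divergence."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(V.shape[1], n_hidden))
    b_v, b_h = np.zeros(V.shape[1]), np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: data-dependent statistics.
        p_h = sigmoid(V @ W + b_h)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Negative phase: one step of Gibbs sampling (reconstruction).
        p_v = sigmoid(h @ W.T + b_v)
        p_h_recon = sigmoid(p_v @ W + b_h)
        # Approximate log-likelihood gradient (contrastive divergence update).
        W += lr * (V.T @ p_h - p_v.T @ p_h_recon) / V.shape[0]
        b_v += lr * (V - p_v).mean(axis=0)
        b_h += lr * (p_h - p_h_recon).mean(axis=0)
    return W, b_h

def pretrain_dbn(V, layer_sizes):
    """Greedy layer-wise DBN pretraining (steps 1-4 above), propagating mean
    activations upward as the observations for the next RBM."""
    layers, data = [], V
    for n_hidden in layer_sizes:
        W, b_h = train_rbm_cd1(data, n_hidden)
        layers.append((W, b_h))
        data = sigmoid(data @ W + b_h)      # P(h^(l+1) = 1 | h^(l)) as next-level data
    return layers
```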

2.3.3. Deep Boltzmann machine.
A DBM (55) is also constructed by stacking multiple RBMs in a hierarchical manner. However, in contrast to DBNs, all the layers in DBMs form an undirected generative model following the stacking of RBMs (Figure 2c). Thus, for hidden layer $l$, except in the cases of $l=1$ and $l=L$, the layer's probability distribution is conditioned on its two neighboring layers, $l+1$ and $l-1$ [i.e., $P(\mathbf{h}^{(l)} \mid \mathbf{h}^{(l+1)}, \mathbf{h}^{(l-1)})$]. The incorporation of information from both the upper and lower layers improves a DBM's representational power, making it more robust to noisy observations.

Let us consider a three-layer DBM, namely the L=2 DBM shown in Figure 2c. Given the values of the units in the neighboring layer(s), the probability of either the binary visible or binary hidden units being set to one is computed as follows:

$$ P\!\left(h^{(1)}_j = 1 \mid \mathbf{v}, \mathbf{h}^{(2)}\right) = \sigma\!\left( \sum_i W^{(1)}_{ij} v_i + \sum_k W^{(2)}_{jk} h^{(2)}_k \right), \tag{5} $$

$$ P\!\left(h^{(2)}_k = 1 \mid \mathbf{h}^{(1)}\right) = \sigma\!\left( \sum_j W^{(2)}_{jk} h^{(1)}_j \right), \tag{6} $$

$$ P\!\left(v_i = 1 \mid \mathbf{h}^{(1)}\right) = \sigma\!\left( \sum_j W^{(1)}_{ij} h^{(1)}_j \right), \tag{7} $$

where $\sigma(\cdot)$ denotes a logistic sigmoid function. In order to learn the parameters $\Theta = \{W^{(1)}, W^{(2)}\}$, we maximize the log likelihood of the observations. The derivative of the log likelihood of the observations with respect to the model parameters takes the following simple form:

$$ \frac{\partial \log P(\mathbf{v}; \Theta)}{\partial W^{(l)}} = \mathbb{E}_{\text{data}}\!\left[ \mathbf{h}^{(l-1)} \mathbf{h}^{(l)\top} \right] - \mathbb{E}_{\text{model}}\!\left[ \mathbf{h}^{(l-1)} \mathbf{h}^{(l)\top} \right], \tag{8} $$

where $\mathbb{E}_{\text{data}}[\cdot]$ denotes the data-dependent statistics obtained by sampling conditioned on the visible units $\mathbf{v}\ (= \mathbf{h}^{(0)})$ and $\mathbb{E}_{\text{model}}[\cdot]$ denotes the data-independent statistics obtained by sampling from the model. When the model approximates the data distribution well, the data-dependent and data-independent statistics reach equilibrium.
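
For the two-hidden-layer DBM above, the conditionals of Equations 5 through 7 translate directly into code (binary units and no bias terms, matching the parameterization used here):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# W1: (n_visible x n_h1) weights, W2: (n_h1 x n_h2) weights; all units are binary.
def p_h1_given_v_h2(v, h2, W1, W2):
    """Equation 5: the middle layer is conditioned on both neighboring layers."""
    return sigmoid(v @ W1 + h2 @ W2.T)

def p_h2_given_h1(h1, W2):
    """Equation 6: the top layer depends only on the layer below."""
    return sigmoid(h1 @ W2)

def p_v_given_h1(h1, W1):
    """Equation 7: the visible layer depends only on the first hidden layer."""
    return sigmoid(h1 @ W1.T)
```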

2.4. Fine-Tuning Deep Models for Target Tasks

Note that, during feature representation learning for the three deep models described above, the target values (either discrete labels or continuous real values of observations) are never involved. Therefore, there is no guarantee that the representations learned by SAEs, DBNs, or DBMs are discriminative for a classification task, for example. To address this problem, a so-called fine-tuning step generally follows unsupervised feature representation learning.

For a certain task involving either classification or regression, it is straightforward to convert feature representation learning models into a deep neural network by stacking another output layer on top of the highest hidden layer in an SAE, DBN, or DBM with an appropriate output function. In the case of a DBM, the original input vector should first be augmented with the marginals of the approximate posterior of the second hidden layer as a by-product when converting a DBM into a deep neural network (55). The top output layer is then used to predict the target value(s) of an input. To fine-tune the parameters in a deep neural network, we first take the pretrained connection weights of the hidden layers as the initial values, randomly initialize the connection weights between the top hidden layer and the output layer, and then train the parameters jointly in a supervised (i.e., end-to-end) manner by gradient descent with a back-propagation algorithm. Initialization of the parameters via pretraining helps the supervised optimization reduce the risk of falling into poor local optima (10, 71).
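
A minimal sketch of this conversion follows, assuming each pretrained hidden layer is supplied as a pair (W, b) with W of shape (n_in, n_out) and that a softmax output layer is the target; only the structure of the procedure is taken from the description above.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def build_classifier(pretrained_layers, n_hidden_top, n_classes, seed=0):
    """Stack a randomly initialized output layer on top of pretrained hidden layers.
    Each pretrained layer is assumed to be a pair (W, b) with W of shape (n_in, n_out)."""
    rng = np.random.default_rng(seed)
    W_out = rng.normal(scale=0.01, size=(n_hidden_top, n_classes))
    return {"hidden": list(pretrained_layers), "out": (W_out, np.zeros(n_classes))}

def predict(model, X):
    """Forward pass: pretrained hidden layers followed by the softmax output layer."""
    H = X
    for W, b in model["hidden"]:            # weights initialized by pretraining
        H = sigmoid(H @ W + b)
    W_out, b_out = model["out"]
    return softmax(H @ W_out + b_out)

# Fine-tuning then trains all parameters of `model` jointly with back-propagation
# on a supervised loss (e.g., cross-entropy), starting from this initialization.
```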

2.5. Convolutional Neural Networks

In the deep models of SAEs, DBNs, and DBMs, described above, the inputs are always in vector form. However, for (medical) images, the structural information among neighboring pixels or voxels is also important, but vectorization inevitably destroys such structural and configural information in images. CNNs (76) are designed to better utilize spatial and configural information by taking 2D or 3D images as input. Structurally, CNNs have convolutional layers interspersed with pooling layers, followed by fully connected layers as in a standard multilayer neural network. Unlike a deep neural network, a CNN exploits three mechanisms—a local receptive field, weight sharing, and subsampling (Figure 3)—that greatly reduce the degrees of freedom in a model.

Figure 3

The role of a convolutional layer is to detect local features at different positions in the input feature maps with learnable kernels $k^{(l)}_{ij}$, namely connection weights between feature map $i$ at layer $l-1$ and feature map $j$ at layer $l$. Specifically, the units of the convolutional layer $l$ compute their activation $A^{(l)}_j$ on the basis of only a spatially contiguous subset of units in the feature maps $A^{(l-1)}_i$ of the preceding layer $l-1$ by convolving the kernels $k^{(l)}_{ij}$ as follows:

$$ A^{(l)}_j = f\!\left( \sum_{i=1}^{M^{(l-1)}} A^{(l-1)}_i * k^{(l)}_{ij} + b^{(l)}_j \right), \tag{9} $$

where $M^{(l-1)}$ denotes the number of feature maps in layer $l-1$, the asterisk denotes a convolution operator, $b^{(l)}_j$ is a bias parameter, and $f(\cdot)$ is a nonlinear activation function. Owing to the mechanisms of weight sharing and the local receptive field, when the input feature map is slightly shifted, the activations of the units in the feature maps are shifted by the same amount.

A pooling layer follows a convolutional layer to downsample the feature maps of the preceding convolutional layer. Specifically, each feature map in a pooling layer is linked to a feature map in the convolutional layer; each unit in a feature map of the pooling layer is computed on the basis of a subset of units within a local receptive field from the corresponding convolutional feature map. As in the convolutional layer, the units operate over local receptive fields, but each pooling unit simply takes a representative value (e.g., the maximum or average) of the units in its field. Usually, the shift (stride) of the receptive field during subsampling is set equal to the size of the receptive field, so that the pooling regions do not overlap, helping the CNN to be translation invariant.
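
The computation of Equation 9 followed by non-overlapping max pooling can be sketched as follows; the "valid" convolution (implemented as cross-correlation, as in most CNN libraries), the tanh activation, and the 2×2 pooling size are illustrative assumptions.

```python
import numpy as np

def conv2d_valid(A, k):
    """'Valid' 2D convolution of one feature map A with one kernel k
    (implemented as cross-correlation, as in most CNN libraries)."""
    H, W = A.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(A[r:r + kh, c:c + kw] * k)
    return out

def conv_layer(A_prev, kernels, biases, f=np.tanh):
    """Equation 9: feature map j sums the convolutions of all preceding maps i.

    A_prev  : list of feature maps A_i^(l-1)
    kernels : kernels[i][j] is the kernel k_ij^(l); biases[j] is b_j^(l)
    """
    maps = []
    for j in range(len(biases)):
        s = sum(conv2d_valid(A_prev[i], kernels[i][j]) for i in range(len(A_prev)))
        maps.append(f(s + biases[j]))
    return maps

def max_pool(A, size=2):
    """Non-overlapping max pooling: the stride equals the pooling size."""
    H, W = A.shape
    A = A[:H - H % size, :W - W % size]     # trim so the map tiles exactly
    return A.reshape(H // size, size, W // size, size).max(axis=(1, 3))
```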

Theoretically, gradient descent combined with a back-propagation algorithm can also be applied to learning the parameters of a CNN. However, because of the special mechanisms of weight sharing, the local receptive field, and pooling, slight changes need to be made: one must sum the gradients of each shared kernel weight over all the connections in which it is used, keep track of which patch in a layer's feature map corresponds to each unit in the next layer's feature map, and upsample the feature maps of the pooling layer to recover the reduced size of the maps during the backward pass.

2.6. Reducing Overfitting

A critical challenge in training deep models arises from the limited number of training samples compared with the number of learnable parameters; reducing overfitting has therefore long been a central concern. Recent studies have devised algorithmic techniques to better train deep models. Some of these techniques are as follows.

1. Initialization/momentum (77, 78) involves the use of well-designed random initialization and a particular schedule for slowly increasing the momentum parameter as the iterations proceed.

2. The rectified linear unit (ReLU) (12, 79, 80) is used as the nonlinear activation function.

3. Denoising (11) involves stacking layers of denoising auto-encoders, which are trained locally to reconstruct the original "clean" inputs from corrupted versions of them.

4. Dropout (13) and DropConnect (81) randomly deactivate a fraction (e.g., 50%) of the units or connections in a network on each training iteration.

5. Batch normalization (14) performs normalization for each minibatch and back-propagates the gradients through the normalization parameters.

See the references cited for further details.
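
Two of these techniques, dropout and batch normalization, reduce to a few lines in the forward pass. The NumPy sketch below shows only the training-time computations (the backward pass and the running statistics used at test time are omitted), and the specific choices are ours rather than those of the cited papers.

```python
import numpy as np

def dropout(H, rate=0.5, rng=None):
    """Dropout (training pass): randomly deactivate a fraction `rate` of the units.
    Inverted-dropout scaling is used so that no rescaling is needed at test time."""
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(H.shape) >= rate).astype(H.dtype)
    return H * mask / (1.0 - rate)

def batch_norm(H, gamma, beta, eps=1e-5):
    """Batch normalization (training pass): normalize each unit over the minibatch,
    then rescale and shift with the learnable parameters gamma and beta."""
    mu = H.mean(axis=0)
    var = H.var(axis=0)
    return gamma * (H - mu) / np.sqrt(var + eps) + beta
```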

3. APPLICATIONS IN MEDICAL IMAGING

Compared with other machine learning techniques in the literature, deep learning has witnessed significant advances. These successes have prompted researchers in the field of computational medical imaging to investigate the potential of deep learning in medical images acquired with, for example, CT, MRI, PET, and X-ray. In this section, we discuss the practical applications of deep learning in image registration and localization, detection of anatomical and cellular structures, tissue segmentation, and computer-aided disease prognosis and diagnosis.

3.1. Deep Feature Representation Learning in Medical Images

Many existing medical image processing methods rely on morphological feature representations to identify local anatomical characteristics. However, such feature representations were designed mostly by human experts, and the image features are often problem specific and not guaranteed to work for other image types. For instance, image segmentation and registration methods designed for 1.5-T T1-weighted brain MR images are not applicable to 7.0-T T1-weighted MR images (28, 52), not to mention other modalities or different organs. Furthermore, 7.0-T MR images can reveal the brain's anatomy with a resolution equivalent to that obtained from thin slices in vitro (82). Thus, researchers can clearly observe fine brain structures at the micrometer scale, which previously was possible only with in vitro imaging. However, the lack of efficient computational tools substantially hinders the translation of new imaging techniques into the medical imaging arena.

Although state-of-the-art methods use supervised learning to find the most relevant and essential features for target tasks, they require a significant number of manually labeled training data, and the learned features may be superficial and may misrepresent the complexity of the anatomical structures. More critically, the learning procedure is often confined to a particular template domain, with a certain number of predesigned features. Therefore, once the template or image features change, the entire training process has to start over again. To address these limitations, Wu et al. (28, 52) developed a general feature representation framework that can (a) capture the intrinsic characteristics of anatomical structures necessary for accurate brain region segmentation and correspondence detection and (b) be flexibly applied to different kinds of medical images. Specifically, these authors used an SAE with a sparsity constraint, which they therefore termed a sparse auto-encoder, to hierarchically learn feature representations in a layer-by-layer manner. Their SAE model consisted of encoding and decoding modules hierarchically (Figure 4). In the encoding module, given an input image patch x, the model mapped the input to an activation vector y(1) through nonlinear deterministic mapping. The authors then repeated this procedure by using y(1) as the input to train the second layer, and so forth, until they obtained high-level feature representations (Figure 4). The decoding module was used to validate the expressive power of the learned feature representations by minimizing the reconstruction errors between the input image patch x and the reconstructed patch z after decoding.
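
The sparsity constraint is commonly imposed as a Kullback-Leibler penalty that pushes the average activation of each hidden unit toward a small target value; because the exact penalty used by Wu et al. is not given here, the objective below is only an assumed, typical formulation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sparse_ae_objective(X, W, b, c, rho=0.05, beta=3.0):
    """Reconstruction error plus a KL sparsity penalty on the mean hidden activations.

    X    : image patches, one per row, scaled to [0, 1]
    W, b : encoder weights (n_hidden x n_visible) and biases; c : decoder biases
    rho  : target (low) average activation of each hidden unit
    beta : weight of the sparsity penalty
    This is an assumed, standard formulation, not the exact objective of the cited work.
    """
    H = sigmoid(X @ W.T + b)                            # hidden activations y(1)
    Z = sigmoid(H @ W + c)                              # reconstruction z of the patch x
    recon = 0.5 * np.mean(np.sum((Z - X) ** 2, axis=1))
    rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)   # average activation per unit
    kl = np.sum(rho * np.log(rho / rho_hat) +
                (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return recon + beta * kl
```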

Figure 4

Figure 5 demonstrates the power of feature representations learned by deep learning methods. Figure 5a–c shows a typical image registration result for brain images of an elderly patient, and Figure 5d–f compares different feature representations for finding a correspondence of a template point. Clearly, the deformed subject image in Figure 5c is far from being well registered with the template image in Figure 5a, especially for ventricles. It is very difficult to learn meaningful features given such inaccurate correspondences derived from imperfect image registration, a problem that many supervised learning methods suffer from (83–85). Moreover, the features [e.g., local patches and scale-invariant feature transform (SIFT) (86)] either detect too many noncorresponding points when using the entire intensity patch as the feature vector (Figure 5d) or have too-low responses and thus miss the correspondence when using SIFT (Figure 5e). Meanwhile, SAE-learned feature representations present the least confusing correspondence information for the subject point under consideration, making it easy to locate the correspondence of the template point in the subject image domain.

Figure 5

In order to qualitatively evaluate the registration accuracy, Wu et al. obtained deformable image registration results over various public data sets (Figure 6). Compared with the state-of-the-art registration methods of intensity-based diffeomorphic Demons (87) and feature-based HAMMER (88) for 1.5- and 3.0-T MR images, the SAE-learned feature representation depicted in Figure 6e performs better.

Figure 6

Another successful medical application involves localizing a prostate from MR images (89, 90). Accurate prostate localization in MR images is difficult for two reasons: (a) The appearance patterns around the prostate boundary vary significantly between patients, and (b) the intensity distributions vary between patients and often do not follow a Gaussian distribution. To address these challenges, Guo et al. (90) used an SAE to learn hierarchical feature representations from MR prostate images. The resulting learned features were integrated into a sparse patch-matching framework to find the corresponding patches in the atlas images for label propagation (91). Finally, a deformable model was employed to segment the prostate by combining the shape prior with the prostate likelihood map derived from sparse patch matching. Figure 7 shows typical prostate segmentation results from different patients, produced by three different feature representations.

Figure 7

The applications described above demonstrate that (a) the latent feature representations inferred by deep learning can successfully describe local image characteristics; (b) researchers can rapidly develop image analysis methods for new medical imaging modalities by using a deep learning framework to learn the intrinsic feature representations; and (c) the entire learning-based framework can be adapted to learn imaging feature representations and extended to various medical imaging applications, such as hippocampus segmentation (92) and prostate localization in MR images (89, 90).

3.2. Deep Learning for Detection of Anatomical Structures

Localization and interpretation of anatomical structures in medical images are key steps in the radiological workflow. Radiologists usually accomplish these tasks by identifying certain anatomical signatures, namely image features that can distinguish one anatomical structure from others. Is it possible for a computer to automatically learn such anatomical signatures? The success of such methods essentially depends on how many anatomical signatures can be extracted by computational operations. Whereas early studies often created specific image filters to extract anatomical signatures, deep learning–based approaches have recently become prevalent for two reasons: (a) Deep learning technologies are now mature enough to solve real-world problems, and (b) more and more medical image data sets have become available to facilitate the exploration of big medical image data.

3.2.1. Detection of organs and body parts.
Shin et al. (51) used SAEs to separately learn both visual and temporal features in order to detect multiple organs in a time series of 3D dynamic contrast–enhanced MRI scans over data sets from two studies of liver metastases and one study of kidney metastases. Unlike conventional SAEs, the SAE in this study involved the application of a pooling operation after each layer so that features of progressively larger input regions were essentially compressed. Because different organ classes have different properties, the authors trained multiple models to separate each organ from all of the other organs in a supervised manner.

Roth et al. (93) presented a method for organ- or body part–specific anatomical classification of medical images using deep convolutional networks. Specifically, they trained their deep network by using 4,298 axial 2D CT images to learn five parts of the body: neck, lungs, liver, pelvis, and legs. Their experiments achieved an anatomy-specific classification error of 5.9% and an average AUC (area under the receiver-operating characteristic curve) value of 0.998. However, real-world applications may require more finely grained differentiation than that used for only five body parts (e.g., they may need to distinguish the aortic arch from cardiac sections). To address this limitation, Yan et al. (94, 95) designed a multistage deep learning framework with a CNN to identify the body part of a transversal slice. Because each slice may contain multiple organs (enclosed in bounding boxes), the CNN was trained in a multi-instance fashion (96) in which the objective function was adapted such that, as long as one organ was correctly labeled, the corresponding slice was considered correct. Therefore, the pretrained CNN was sensitive to the discriminative bounding boxes. On the basis of the pretrained CNN's responses, discriminative and noninformative bounding boxes were selected to further boost the representation power of the pretrained CNN. At run time, a sliding-window approach was employed to apply the boosted CNN to the subject image. Because the CNN had peaky responses only on discriminative bounding boxes, it essentially identified body parts by focusing on the most distinctive local information. Compared with global image context-based approaches, this local approach was more accurate and robust. These authors' body part recognition method was tested on 12 body parts on 7,489 CT slices, collected from scans of 675 patients varying in age from 1 to 90 years. The entire data set was divided into three groups: 2,413 (225 patients) for training, 656 (56 patients) for validation, and 4,043 (394 patients) for testing.

3.2.2. Cell detection.
Digitized tissue histopathology has recently been employed for microscopic examination and automatic disease grading. A primary challenge in microscopic image analysis involves the need to analyze all individual cells for accurate diagnosis, given that the differentiation of most disease grades depends strongly on cell-level information. To address this challenge, researchers have employed deep CNNs to robustly and accurately detect and segment cells from histopathological images (37, 38, 53, 54, 97–99), which can significantly benefit cell-level analysis for cancer diagnosis.

In a pioneering study, Cireşan et al. (37) used a deep CNN to detect mitosis in breast cancer histology images. Their networks were trained to classify each pixel in the images by using a patch centered on that pixel. Their method won the 2012 International Conference on Pattern Recognition (ICPR) Mitosis Detection Contest, outperforming other contestants by a significant margin.

Since then, different groups have used different deep learning methods for detection in histology images. For example, Xu et al. (54) used an SAE to detect cells on breast cancer histological images. To train their deep model, they utilized a denoising auto-encoder to improve robustness to outliers and noise. Su et al. (53) also used an SAE as well as sparse representation to detect and segment cells from microscopic images. Sirinukunwattana et al. (100) proposed a spatially constrained CNN (SC-CNN) to detect and classify nuclei in histopathology images. Specifically, they used an SC-CNN to estimate the likelihood of a pixel being the center of a nucleus, where pixels with high probability values were spatially constrained to lie in the vicinity of the centers of nuclei. They also developed a neighboring ensemble predictor coupled with a CNN to more accurately predict the class label of the detected cell nuclei. Chen et al. (38) designed a deep cascaded CNN by exploiting the fully convolutional network technique, which replaces the fully connected layers with convolutional kernels (101). They first trained a coarse retrieval model to identify and locate mitosis candidates while maintaining high sensitivity. On the basis of the retrieved candidates, they then created a fine discrimination model by transferring deep and rich feature hierarchies learned on a large natural image data set to distinguish mitoses from hard mimics. Their cascaded CNN achieved the best detection accuracy in the 2014 ICPR MITOS-ATYPIA challenge.

3.3. Deep Learning for Segmentation

Automatic segmentation of brain images is a prerequisite for quantitative assessment of the brain in patients of all ages. An important step in brain image preprocessing involves removing nonbrain regions such as the skull. Although current methods demonstrate good results on nonenhanced T1-weighted images, they still struggle when applied to other modalities and pathologically altered tissues. To circumvent such limitations, Kleesiek et al. (27) used a 3D convolutional deep learning architecture for skull stripping that was not limited to nonenhanced T1-weighted MR images. While training their 3D CNN, they constructed minibatches of multiple cubes that were larger than the actual input to their 3D CNN for computational efficiency. Specifically, their deep model could take an arbitrarily sized 3D patch as input by building a fully convolutional network (101); thus, the output could be a block of predictions per input, rather than a single prediction as in a conventional CNN. Over four different data sets, their method achieved the highest average specificity measures in comparison with six commonly used tools (i.e., BET, BEaST, BSE, ROBEX, HWA, and 3dSkullStrip), whereas its sensitivity was about average.

Moeskops et al. (102) devised a multiscale CNN to enhance robustness and spatial consistency in neonatal image segmentation. Their network used multiple patch sizes and multiple convolution kernel sizes to acquire multiscale information about each voxel. Using this method, the authors obtained promising segmentation results for eight tissue types, with Dice ratios averaging 0.82 to 0.91 over five different data sets.

The first year of life is the most dynamic phase of postnatal human brain development, characterized by rapid tissue growth and development of a wide range of cognitive and motor functions. Accurate tissue segmentation of infant brain MR images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) in this phase is crucial in studies of normal and abnormal early brain development. Segmentation of infants’ brain MR images is considerably more difficult to perform than for adults because of the reduced tissue contrast (103), increased noise, severe partial volume effect (104), and ongoing WM myelination (103, 105). Specifically, the WM and GM exhibit almost the same intensity level (especially in the cortical regions), resulting in low image contrast. Although many methods have been proposed for infant brain image segmentation, most focus on segmentation of images of either neonates (∼3 months) or infants (>12 months) using a single T1-weighted or T2-weighted image (106–110). Few studies have addressed the challenges posed by segmentation of isointense-phase images (around 6 months old).

To overcome these difficulties, Zhang et al. (26) designed four CNN architectures to segment infant brain tissues on the basis of multimodal MR images. Specifically, each CNN contained three input feature maps corresponding to T1-weighted, T2-weighted, and fractional anisotropy (FA) image patches measuring 13×13 voxels. Each CNN comprised three convolutional layers and one fully connected layer, followed by an output layer with a softmax function for tissue classification. On a set of manually segmented isointense-phase brain images, these CNNs significantly outperformed competing methods.
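
Read literally, this description corresponds to a small patch classifier such as the one sketched below, which takes a three-channel (T1, T2, FA) 13×13 patch and outputs class probabilities for the tissue types; the kernel sizes and channel counts are our assumptions, as they are not stated in the text.

```python
import torch
import torch.nn as nn

class InfantTissueCNN(nn.Module):
    """Patch classifier in the spirit of the description above: three input
    channels (T1, T2, FA) over a 13 x 13 patch, three convolutional layers, one
    fully connected layer, and a softmax output over the tissue classes.
    Kernel sizes and channel counts are illustrative assumptions."""

    def __init__(self, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5), nn.ReLU(),   # 13x13 -> 9x9
            nn.Conv2d(32, 64, kernel_size=3), nn.ReLU(),  # 9x9  -> 7x7
            nn.Conv2d(64, 64, kernel_size=3), nn.ReLU(),  # 7x7  -> 5x5
        )
        self.fc = nn.Linear(64 * 5 * 5, 128)
        self.out = nn.Linear(128, n_classes)

    def forward(self, x):                    # x: (batch, 3, 13, 13)
        h = self.features(x).flatten(1)
        h = torch.relu(self.fc(h))
        return torch.log_softmax(self.out(h), dim=1)   # class log-probabilities

# Usage: log-probabilities for a batch of patches centered on the voxels of interest.
# log_p = InfantTissueCNN()(torch.randn(8, 3, 13, 13))
```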

More recently, Nie et al. (48) proposed the use of multiple fully convolutional networks (mFCNs) (Figure 8) to segment isointense-phase brain images with T1-weighted, T2-weighted, and FA modality information. Instead of simply combining three-modality data from the original (low-level) feature maps, they employed a deep architecture to effectively fuse high-level information from all three modalities. They assumed that high-level representations from different modalities were complementary to one another. First, the authors trained one network for each modality in order to effectively employ information from multiple modalities; second, they fused multiple-modality features from the high layer of each network (Figure 8). In these experiments, the mFCNs achieved average Dice ratios of 0.852 for CSF, 0.873 for GM, and 0.887 for WM from eight subjects, outperforming fully convolutional networks and other competing methods.

Figure 8

3.4. Deep Learning for Computer-Aided Detection

The goal of CADe is to find or localize abnormal or suspicious regions in structural images, and thus to alert clinicians. CADe aims to increase the detection rate of diseased regions while reducing the false-negative rate, which may be due to error or fatigue on the part of the observers. Although CADe is well established in medical imaging, deep learning methods have improved its performance in different clinical applications.

Typically, CADe occurs as follows: (a) The candidate regions are detected by means of image processing techniques; (b) the candidate regions are represented by a set of features, such as morphological or statistical information; and (c) the features are fed into a classifier, such as a support vector machine (SVM), to output a probability or make a decision as to whether disease is present. As explained in Section 1, the human-designed feature representations in this pipeline can be replaced by representations learned with deep models. Many groups have successfully used their own deep models in applications such as detection of pulmonary nodules, detection of lymph nodes, classification of interstitial lung disease in CT images, detection of cerebral microbleeds, and detection of multiple sclerosis lesions in MR images. Notably, most of the methods described in the literature exploited deep convolutional models to maximally utilize structural information in two, two-and-a-half, or three dimensions.

Ciompi et al. (43) used a pretrained OverFeat (111) out of the box as a feature extractor and empirically showed that a CNN learned from a completely different domain of natural images can provide useful feature descriptions for classification of pulmonary perifissural nodules. Roth et al. (40) focused on training deep models from scratch. To confront the problem of data insufficiency in training deep CNNs, they expanded their data set by applying random scaling, translation, and rotation to the training samples. They augmented the test samples in a similar way; obtained CNN outputs for every augmented test sample; and took the average of the outputs of the randomly transformed, scaled, and rotated patches for detection of lymph nodes and colonic polyps. To better utilize volumetric information in images, both Ciompi et al. (43) and Roth et al. (40) considered two-and-a-half-dimensional (2.5D) information with 2D patches of three orthogonal views (axial, sagittal, and coronal). Setio et al. (42) considered three sets of orthogonal views, for a total of nine views from a 3D patch, and used ensemble methods to fuse information from different views for detection of pulmonary nodules.
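
A simple way to form the 2.5D representation used in these studies is to sample the three orthogonal planes through a candidate location and stack them as channels, as in the sketch below (the patch size and stacking order are arbitrary illustrative choices).

```python
import numpy as np

def extract_25d_patch(volume, center, half=16):
    """Return the axial, coronal, and sagittal patches through `center`,
    stacked as a three-channel 2D image (a common 2.5D representation).

    volume : 3D array indexed as (z, y, x)
    center : (z, y, x) voxel coordinates of the candidate, assumed to lie at
             least `half` voxels away from every volume border
    half   : half patch size, giving (2*half) x (2*half) patches
    """
    z, y, x = center
    axial    = volume[z, y - half:y + half, x - half:x + half]
    coronal  = volume[z - half:z + half, y, x - half:x + half]
    sagittal = volume[z - half:z + half, y - half:y + half, x]
    return np.stack([axial, coronal, sagittal], axis=0)   # shape (3, 2*half, 2*half)
```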

Gao et al. (112) focused on holistic classification of CT patterns for interstitial lung disease by using a deep CNN. They borrowed the network architecture from Reference 113, with six units at the output layer, to classify patches into normal, emphysema, ground glass, fibrosis, micronodules, and consolidation. To overcome the overfitting problem, they used a data augmentation strategy, generating 10 randomly jittered and cropped subimages per original CT slice. At the testing stage, they generated 10 jittered images and fed them into the trained CNN. Finally, they predicted the label of the input slice by aggregating the outputs, similar to the approach of Roth et al. (40).

Shin et al. (45) conducted experiments on data sets of thoraco-abdominal lymph node detection and interstitial lung disease classification to explore how the performance of a CNN changes according to architecture, data set characteristics, and transfer learning. They considered five deep CNNs, namely CifarNet (114), AlexNet (113), OverFeat (111), VGG-16 (115), and GoogLeNet (116), which achieved state-of-the-art performance in various computer vision applications. From their extensive experiments, these authors drew some interesting conclusions: (a) It was consistently beneficial for CADe problems to transfer features learned from the large-scale annotated natural image data sets (ImageNet), and (b) applications of off-the-shelf deep CNN features to CADe problems could be improved by exploring the performance-complementary properties of human-designed features.

Unlike the studies above, which used deterministic deep architectures, van Tulder & de Bruijne (35) exploited a deep generative model with a convolutional RBM as the basic building block for classification of interstitial lung disease. Specifically, they used a discriminative RBM with an additional label layer along with input and hidden layers to improve the discriminative power of learned feature representations. These experiments demonstrated the advantages of combining generative and discriminative learning objectives by achieving higher performance than that of purely generative or discriminative learning methods.

Pereira et al. (34) investigated brain tumor segmentation by using CNNs in MR images. They explored small kernels in order to obtain deeper architectures with fewer parameters. They trained different CNN architectures for low- and high-grade tumors and validated their method in the 2013 Brain Tumor Segmentation (BRATS) Challenge, where their technique ranked at the top for the complete, core, and enhancing regions for the challenge data set. Brosch et al. (49) applied deep learning to multiple sclerosis lesion segmentation on MR images. Their model was a 3D CNN composed of two interconnected pathways, namely a convolutional pathway that learned hierarchical feature representations similar to those of other CNNs and a deconvolutional pathway consisting of deconvolutional and unpooling layers with shortcut connections to the corresponding convolutional layers. The deconvolutional layers were designed to calculate abstract segmentation features from the features represented by each convolutional layer and the activation of the previous deconvolutional layer, if applicable. In comparison with five publicly available methods for multiple sclerosis lesion segmentation, this method achieved the best performance in terms of Dice similarity coefficient, absolute volume difference, and lesion false-positive rate.

An important limitation of typical deep CNNs arises from the fixed architecture of the models themselves. When an input observation is larger than the unit in the input layer, the straightforward solution is to apply a sliding-window strategy. However, doing so is computationally very expensive and time and memory consuming. Because of this scalability issue in CNNs, Dou et al. (36) devised a 3D fully convolutional network by transforming the units in the fully connected layers into 1×1×1 3D convolutional kernels, enabling arbitrarily sized inputs to be processed efficiently (101). The output of this 3D fully convolutional network could be remapped back onto the original input, making it possible to interpret the network output more intuitively. For detection of cerebral microbleeds in MR images, these authors designed a cascade framework. They first screened the input with the proposed 3D fully convolutional network to retrieve candidates with high probabilities of being cerebral microbleeds, and then applied a 3D CNN discrimination model for final detection. Their experiments validated the effectiveness of the method, which removes massive redundant computations and dramatically speeds up the detection process.
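
The weight transformation behind this idea can be illustrated as follows: a fully connected layer acting on C-dimensional feature vectors is equivalent to a 3D convolution with a 1×1×1 kernel, so its weights can be reused on inputs of arbitrary size. The PyTorch sketch below shows the conversion in isolation; the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

def fc_to_1x1x1_conv(fc: nn.Linear) -> nn.Conv3d:
    """Reuse the weights of a fully connected layer as a 1 x 1 x 1 3D convolution.

    A linear layer mapping C_in -> C_out, applied at every spatial location, is
    identical to a Conv3d with kernel size 1, so a network converted this way
    accepts arbitrarily sized inputs and produces a dense block of predictions."""
    conv = nn.Conv3d(fc.in_features, fc.out_features, kernel_size=1)
    with torch.no_grad():
        conv.weight.copy_(fc.weight.view(fc.out_features, fc.in_features, 1, 1, 1))
        conv.bias.copy_(fc.bias)
    return conv

# Usage: a (1, 64, 24, 24, 24) feature volume yields a (1, 2, 24, 24, 24) score map
# instead of a single prediction.
# scores = fc_to_1x1x1_conv(nn.Linear(64, 2))(torch.randn(1, 64, 24, 24, 24))
```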

3.5. Deep Learning for Computer-Aided Diagnosis

CADx provides a second objective opinion regarding assessment of a disease from image-based information. The major applications of CADx involve the discrimination of malignant from benign lesions and the identification of certain diseases from one or more images. Conventionally, most CADx systems were developed to use human-designed features engineered by domain experts. Recently, deep learning methods have been successfully applied to CADx systems.

Cheng et al. (39) used a stacked denoising auto-encoder (SDAE) to differentiate breast ultrasound lesions and lung CT nodules. Specifically, the image regions of interest (ROIs) were first resized to 28×28 pixels, and all of the pixels in each patch were treated as the input to the SDAE. During the pretraining step, the authors corrupted the input patches with random noise to enhance the noise tolerance of their model. Later, during the fine-tuning step, they included the resized scale factors of the two ROI dimensions and the aspect ratios of the original ROIs to preserve the original information. Shen et al. (41) created a hierarchical learning framework with a multiscale CNN to capture various sizes of lung nodules. In this architecture, three CNNs that took nodule patches from different scales as input were assembled in parallel. To reduce overfitting, the authors shared the parameters of the three CNNs during training. The activations of the top hidden layer in the three CNNs, one for each scale, were concatenated to form a feature vector. For classification, the authors used an SVM with a radial basis function kernel and a random forest classifier, trained to minimize so-called companion objectives, defined as the combination of the overall hinge loss function and the sum of the companion hinge loss functions (117).

Suk et al. (31) used an SAE to identify Alzheimer disease or mild cognitive impairment by fusing neuroimaging and biological features. They extracted GM volume features from MR images, regional mean intensity values from PET images, and three biological features (Aβ42, p-tau, and t-tau) from CSF. After training modality-specific SAEs, for each modality they constructed an augmented feature vector by concatenating the original features with the outputs of the top hidden layer of the respective SAEs. A multikernel SVM (118) was trained for clinical decision making. The same authors extended their research to find hierarchical feature representations by combining heterogeneous modalities during feature representation learning, rather than in the classifier learning step (29). Specifically, they used a DBM to find a latent hierarchical feature representation from a 3D patch, and then devised a systematic method for a joint feature representation (Figure 9a) from the paired patches of MRI and PET with a multimodal DBM. To enhance diagnostic performance, they also used a discriminative DBM by adding a discriminative RBM (119) on top of the highest hidden layer. That is, the top hidden layer was connected to both the lower hidden layer and the additional label layer that indicated the label of the input patches (Figure 9a). Using this method, the authors trained a multimodal DBM to discover hierarchical and discriminative feature representations by integrating the process of discovering features of inputs with their use in classification. Figure 9b,c shows the learned connection weights from the MRI pathway and the PET pathway.

Figure 9

Plis et al. (120) applied a DBN to MR images and validated the feasibility of the application by investigating whether a building block of deep generative models was competitive with independent component analysis, the most widely used method for functional MRI (fMRI) analysis. They also examined the effect of the depth of deep models in the analysis of structural MR images of a schizophrenia data set and a Huntington disease data set. Inspired by the work of Plis et al., Kim et al. (121) and Suk et al. (33) independently studied applications of deep learning for fMRI-based brain disease diagnosis. Kim et al. used an SAE for whole-brain resting-state functional connectivity pattern representation for the diagnosis of schizophrenia and the identification of aberrant functional connectivity patterns associated with schizophrenia. They first computed Pearson's correlation coefficients between pairs of 116 regions on the basis of their regional mean blood oxygenation level–dependent (BOLD) signals. After performing Fisher's r-to-z transformation on the coefficients and Gaussian normalization sequentially, they fed the pseudo-z-scored levels into their SAE. More recently, Suk et al. (33) proposed a novel framework of fusing deep learning with a hidden Markov model (HMM) for functional dynamics estimation in resting-state fMRI and successfully used this framework for the diagnosis of mild cognitive impairment (MCI). Specifically, they devised a deep auto-encoder (DAE) by stacking multiple RBMs in order to discover hierarchical nonlinear functional relations among brain regions. Figure 10 shows examples of the learned connection weights in the form of functional networks. This DAE was used to transform the regional mean BOLD signals into an embedding space, whose bases were understood as complex functional networks. After embedding functional signals, Suk et al. then used the HMM to estimate the dynamic characteristics of functional networks inherent in resting-state fMRI via internal states, which could be inferred statistically from observations. By building a generative model with an HMM, they estimated the likelihood of the input features of resting-state fMRI as belonging to the corresponding status (i.e., MCI or normal healthy control), then used this information to determine the clinical label of a test subject.
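
The connectivity features described for Kim et al. amount to the preprocessing sketched below; the final standardization step is a simplified stand-in for the Gaussian normalization they applied, and the exact ordering of operations is our assumption.

```python
import numpy as np

def connectivity_features(bold, eps=1e-6):
    """Build a functional-connectivity feature vector from regional mean BOLD signals.

    bold : array of shape (n_timepoints, n_regions), e.g., 116 regions
    Returns the Fisher z-transformed pairwise correlations (upper triangle only),
    standardized as a simple stand-in for Gaussian normalization.
    """
    r = np.corrcoef(bold, rowvar=False)                 # Pearson correlations between regions
    iu = np.triu_indices_from(r, k=1)                   # each region pair counted once
    z = np.arctanh(np.clip(r[iu], -1 + eps, 1 - eps))   # Fisher r-to-z transform
    return (z - z.mean()) / (z.std() + eps)             # standardized connectivity vector
```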

Figure 10

Other studies have used CNNs to diagnose brain disease. Brosch et al. (47) performed manifold learning from downsampled MR images by using a deep generative model composed of three convolutional RBMs and two RBM layers. To speed up the calculation of convolutions, the computational bottleneck of the training algorithm, they performed training in the frequency domain. By generating volume samples from their deep generative model, they validated the effectiveness of deep learning for manifold embedding with no explicitly defined similarity measure or proximity graph. Li et al. (44) constructed a three-layer CNN with two convolutional layers and one fully connected layer. They proposed to use CNNs to integrate multimodal neuroimaging data by designing a 3D CNN architecture that took a volumetric MRI patch as input and predicted the corresponding volumetric PET patch as output. When trained end to end on subjects with both data modalities, the network captured the nonlinear relationship between the two modalities. These experiments demonstrated that PET data could be estimated from the input MRI data alone, and the authors quantitatively evaluated the proposed data completion method by comparing the classification results obtained with the predicted versus the actual PET images.
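A minimal PyTorch sketch of such a cross-modal 3D CNN, with two convolutional layers followed by one fully connected layer that maps an MRI patch to the corresponding PET patch, is given below. The patch size, filter counts, kernel sizes, and loss function are illustrative assumptions rather than the exact architecture of reference 44.

```python
import torch
import torch.nn as nn

class MRItoPET(nn.Module):
    """Two 3D convolutional layers plus one fully connected layer:
    input MRI patch -> predicted PET patch of the same size."""
    def __init__(self, patch=15):
        super().__init__()
        self.patch = patch
        self.features = nn.Sequential(
            nn.Conv3d(1, 10, kernel_size=4), nn.ReLU(),   # first convolutional layer
            nn.Conv3d(10, 10, kernel_size=4), nn.ReLU(),  # second convolutional layer
        )
        side = patch - 2 * (4 - 1)                        # spatial size after two valid 3D convs
        self.fc = nn.Linear(10 * side ** 3, patch ** 3)   # fully connected output layer

    def forward(self, x):                                 # x: (batch, 1, patch, patch, patch)
        h = self.features(x).flatten(1)
        return self.fc(h).view(-1, 1, self.patch, self.patch, self.patch)

model = MRItoPET()
mri = torch.randn(8, 1, 15, 15, 15)                       # hypothetical mini-batch of MRI patches
pet_pred = model(mri)                                      # predicted PET patches, same shape
loss = nn.functional.mse_loss(pet_pred, torch.randn_like(pet_pred))  # end-to-end regression loss
loss.backward()
```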

4. CONCLUSION

Computational modeling for medical image analysis has had a significant impact on both clinical applications and scientific research. Recent progress in deep learning has shed new light on medical image analysis by enabling the discovery of morphological and/or textural patterns in images solely from data. Deep learning methods have achieved state-of-the-art performance across different medical applications; however, there is still room for improvement. First, as witnessed in computer vision, in which breakthrough improvements were achieved through the use of large amounts of training data [e.g., more than one million annotated images in ImageNet (24)], a large, publicly available data set of medical images from which deep models can learn more generalizable features would lead to improved performance. Second, although data-driven feature representations, especially those learned in an unsupervised manner, have helped enhance accuracy, it would be desirable to devise new methodological architectures that incorporate domain-specific knowledge. Third, algorithmic techniques are needed to efficiently handle images acquired with different scanning protocols, so that separate modality-specific deep models need not be trained. Finally, because of the black box–like characteristics of deep models, it remains challenging to intuitively understand and interpret what the models have learned when deep learning is used to investigate underlying patterns in images such as fMRI.

disclosure statement

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

acknowledgments

D.S. and H.-I.S. contributed equally to the writing of this review, and both are first and corresponding authors. The writing of this review was supported by an Institute for Information and Communications Technology Promotion grant funded by the Korean government [B0101-16-0307, Basic Software Research in Human-Level Lifelong Machine Learning (Machine Learning Center)]; by the Basic Science Research Program of the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2015R1C1A1A01052216); and by the US National Institutes of Health (grants EB006733, EB008374, EB009634, MH100217, MH108914, AG041721, AG049371, AG042599, and DE022676).

literature cited

  • 1. 
    Brody H. 2013. Medical imaging. Nature 502:S81
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 2. 
    Shao Y, Gao Y, Guo Y, Shi Y, Yang X, Shen D. 2014. Hierarchical lung field segmentation with joint shape and appearance sparse learning. IEEE Trans. Med. Imaging 33:1761–80
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 3. 
    Wang L, Chen KC, Gao Y, Shi F, Liao S, et al. 2014. Automated bone segmentation from dental CBCT images using patch-based sparse representation and convex optimization. Med. Phys. 41:043503
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 4. 
    Yap PH, Zhang Y, Shen D. 2016. Multi-tissue decomposition of diffusion MRI signals via L0 sparse-group estimation. IEEE Trans. Image Process. 25:4340–53
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 5. 
    Suk HI, Lee SW, Shen D. 2016. Deep sparse multi-task learning for feature selection in Alzheimer's disease diagnosis. Brain Struct. Funct. 221:2569–87
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 6. 
    Chen Y, Juttukonda M, Su Y, Benzinger T, Rubin BG, et al. 2015. Probabilistic air segmentation and sparse regression estimated pseudo CT for PET/MR attenuation correction. Radiology 275:562–69
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
  • 7. 
    Schmidhuber J. 2015. Deep learning in neural networks: an overview. Neural Netw. 61:85–117
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Location
    More AR articles citing this reference

    • AI in Measurement Science

      Chao Liu1,2 and Jiashu Sun1,21CAS Key Laboratory of Standardization and Measurement for Nanotechnology, CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology, Beijing 100190, China; email: [email protected]2University of Chinese Academy of Sciences, Beijing 100049, China
      Annual Review of Analytical Chemistry Vol. 14: 1 - 19
      • ...support vector machine (SVM) (30), random forest (RF) (31), and artificial neural network (ANN) (32). ...
      • ...and the convolutional neural network (CNN) is widely used in image processing for diagnosis (32)....
    • Animal-in-the-Loop: Using Interactive Robotic Conspecifics to Study Social Behavior in Animal Groups

      Tim Landgraf,1 Gregor H.W. Gebhardt,1,2 David Bierbach,3,4 Pawel Romanczuk,5 Lea Musiolek,6 Verena V. Hafner,6 and Jens Krause3,41Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; email: [email protected]2Computational Systems Neuroscience, Institute of Zoology, University of Cologne, 50674 Cologne, Germany3Department of Biology and Ecology of Fishes, Leibniz-Institute of Freshwater Ecology and Inland Fisheries, 12587 Berlin, Germany4Faculty of Life Sciences, Albrecht Daniel Thaer-Institute of Agricultural and Horticultural Sciences, Humboldt-Universität zu Berlin, 10099 Berlin, Germany5Institute for Theoretical Biology, Department of Biology, Humboldt-Universität zu Berlin, 10115 Berlin, Germany6Department of Computer Science, Humboldt-Universität zu Berlin, 10099 Berlin, Germany
      Annual Review of Control, Robotics, and Autonomous Systems Vol. 4: 487 - 507
      • ...Recent reinforcement learning approaches often use recurrent neural networks to model such adaptation mechanisms (97, 98)....
    • Identifying Regulatory Elements via Deep Learning

      Mira Barshai,1, Eitamar Tripto,2, and Yaron Orenstein11School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel; email: [email protected]2Department of Biomedical Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
      Annual Review of Biomedical Data Science Vol. 3: 315 - 338
      • ...RNNs have been shown to outperform CNNs and other deep neural networks on sequential data (53)....
    • Data-Driven Approaches to Understanding Visual Neuron Activity

      Daniel A. ButtsDepartment of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, Maryland 20742, USA; email: [email protected]
      Annual Review of Vision Science Vol. 5: 451 - 477
      • ...refer to solving tasks such as object and face recognition and have played a crucial role in driving the development of DNNs (LeCun et al. 2015, Schmidhuber 2015, Serre 2019)....
    • The Challenge of Big Data and Data Science

      Henry E. BradyDepartment of Political Science and Goldman School of Public Policy, University of California, Berkeley, California 94720, USA; email: [email protected]
      Annual Review of Political Science Vol. 22: 297 - 323
      • ...which involves multilayer classifiers that use stacks of logistic or similar regressions (Sarle 1994, Schmidhuber 2015) where the inputs are features of the items that are to be classified....
    • Deep Learning and Its Application to LHC Physics

      Dan Guest,1 Kyle Cranmer,2 and Daniel Whiteson11Department of Physics and Astronomy, University of California, Irvine, California 92697, USA2Physics Department, New York University, New York, NY 10003, USA
      Annual Review of Nuclear and Particle Science Vol. 68: 161 - 181
      • ...when a convergence of techniques enabled training of very large neural networks that greatly outperformed the previous state of the art (2...
      • ...when a convergence of techniques enabled training of very large neural networks that greatly outperformed the previous state of the art (2–5)....
    • Deep Learning in Biomedical Data Science

      Pierre BaldiDepartment of Computer Science, Institute for Genomics and Bioinformatics, and Center for Machine Learning and Intelligent Systems, University of California, Irvine, California 92697, USA; email: [email protected]
      Annual Review of Biomedical Data Science Vol. 1: 181 - 205
      • ...natural language processing, and games, to name just a few (3)....
    • Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data

      Pavel Sinitcyn, Jan Daniel Rudolph, and Jürgen CoxComputational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; email: [email protected]
      Annual Review of Biomedical Data Science Vol. 1: 207 - 234
      • ...Deep learning (145, 146) is gaining traction in proteomics (75) and will likely find more applications in the future....
    • Personal Sensing: Understanding Mental Health Using Ubiquitous Sensors and Machine Learning

      David C. Mohr,1 Mi Zhang,2 and Stephen M. Schueller11Center for Behavioral Intervention Technologies and Department of Preventive Medicine, Northwestern University, Chicago, Illinois 60611; email: [email protected], [email protected]2Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824; email: [email protected]
      Annual Review of Clinical Psychology Vol. 13: 23 - 47
      • ...deep learning, a new trend in machine learning, has emerged (Schmidhuber 2015)....
    • Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing

      Nikolaus KriegeskorteMedical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, United Kingdom; email: [email protected]
      Annual Review of Vision Science Vol. 1: 417 - 446
      • ...neural network research has an unbroken history (Schmidhuber 2015) in theoretical neuroscience and in computer science....

  • 8. 
    Bengio Y. 2009. Learning Deep Architectures for AI: Foundations and Trends in Machine Learning. Boston: Now. 127 pp.
    • Google Scholar
    Article Locations:
    • Article Location
    • Article Location
  • 9. 
    LeCun Y, Bengio Y, Hinton G. 2015. Deep learning. Nature 521:436–44
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Locations:
    • Article Location
    • Article Location
    More AR articles citing this reference

    • Semantic Structure in Deep Learning

      Ellie PavlickDepartment of Computer Science, Brown University, Providence, Rhode Island, USA; email: [email protected]
      Annual Review of Linguistics Vol. 8: 447 - 471
      • ...traditional DSMs have fallen to the wayside in favor of linguistic representations derived from deep learning (LeCun et al. 2015)....
    • Machine Learning for the Study of Plankton and Marine Snow from Images

      Jean-Olivier Irisson,1 Sakina-Dorothée Ayata,1 Dhugal J. Lindsay,2 Lee Karp-Boss,3 and Lars Stemmann11Laboratoire d'Océanographie de Villefranche, Sorbonne Université, CNRS, F-06230 Villefranche-sur-Mer, France; email: [email protected], [email protected], [email protected]2Advanced Science-Technology Research (ASTER) Program, Institute for Extra-Cutting-Edge Science and Technology Avant-Garde Research (X-STAR), Japan Agency for Marine-Earth Science and Technology, Yokosuka, Kanagawa 237-0021, Japan; email: [email protected]3School of Marine Sciences, University of Maine, Orono, Maine 04469, USA; email: [email protected]
      Annual Review of Marine Science Vol. 14: 277 - 301
      • ...We now tend to separate classic machine learning from deep learning (LeCun et al. 2015)....
    • Small Steps with Big Data: Using Machine Learning in Energy and Environmental Economics

      Matthew C. Harding1 and Carlos Lamarche21Department of Economics and Department of Statistics, University of California, Irvine, California 92697; email: [email protected]2Department of Economics, Gatton College of Business and Economics, University of Kentucky, Lexington, Kentucky 40506
      Annual Review of Resource Economics Vol. 13: 469 - 488
      • ...the literature has concentrated its attention on multilayer networks and generalizations to Deep Learning (LeCun et al. 2015, Farrell et al. 2018)....
    • Face Recognition by Humans and Machines: Three Fundamental Advances from Deep Learning

      Alice J. O'Toole1 and Carlos D. Castillo21School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas 75080, USA; email: [email protected]2Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA; email: [email protected]
      Annual Review of Vision Science Vol. 7: 543 - 570
      • ...DCNNs also emulate computational aspects of the ventral visual system (Fukushima 1988, Krizhevsky et al. 2012, LeCun et al. 2015) and support surprisingly direct, ...
    • Spatial Integration in Normal Face Processing and Its Breakdown in Congenital Prosopagnosia

      Galia Avidan1 and Marlene Behrmann21Department of Psychology and Department of Cognitive and Brain Sciences, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel; email: [email protected]2Department of Psychology and Neuroscience Institute, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
      Annual Review of Vision Science Vol. 7: 301 - 321
      • ...focusing on a certain location of the stimulus enables the processing of information at that location but also the generation of a prediction of the next location to be processed (Ji et al. 2013, LeCun et al. 2015) (for an illustration of a model dCNN as a coarse analogy to ventral pathway function, ...
    • Optical Coherence Tomography and Glaucoma

      Alexi Geevarghese,1 Gadi Wollstein,1,2,3 Hiroshi Ishikawa,1,2 and Joel S. Schuman1,2,3,41Department of Ophthalmology, NYU Langone Health, NYU Grossman School of Medicine, New York, NY 10016, USA; email: [email protected]2Department of Biomedical Engineering, NYU Tandon School of Engineering, Brooklyn, New York 11201, USA3Center for Neural Science, NYU College of Arts and Sciences, New York, NY 10003, USA4Department of Physiology and Neuroscience, NYU Langone Health, NYU Grossman School of Medicine, New York, NY 10016, USA
      Annual Review of Vision Science Vol. 7: 693 - 726
      • ...Importance is applied to each node on the basis of an iterative training process that determines the optimal weights that yield the smallest classification error (LeCun et al. 2015, Zheng et al. 2019). ...
    • Quantitative Molecular Positron Emission Tomography Imaging Using Advanced Deep Learning Techniques

      Habib Zaidi1,2,3,4 and Issam El Naqa5,6,71Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, 1211 Geneva, Switzerland; email: [email protected]2Geneva Neuroscience Centre, University of Geneva, 1205 Geneva, Switzerland3Department of Nuclear Medicine and Molecular Imaging, University of Groningen, 9700 RB Groningen, Netherlands4Department of Nuclear Medicine, University of Southern Denmark, DK-5000 Odense, Denmark5Department of Machine Learning, Moffitt Cancer Center, Tampa, Florida 33612, USA6Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan 48109, USA7Department of Oncology, McGill University, Montreal, Quebec H3A 1G5, Canada
      Annual Review of Biomedical Engineering Vol. 23: 249 - 276
      • ...but recent studies have shown that it is most effective with deep neural network (DNN) methods due to their universal approximation nature (42, 43)....
    • Extension of Plant Phenotypes by the Foliar Microbiome

      Christine V. Hawkes,1 Rasmus Kjøller,2 Jos M. Raaijmakers,3 Leise Riber,4 Svend Christensen,4 Simon Rasmussen,5 Jan H. Christensen,4 Anders Bjorholm Dahl,6 Jesper Cairo Westergaard,4 Mads Nielsen,7 Gina Brown-Guedira,8 and Lars Hestbjerg Hansen41Department of Plant and Microbial Biology, North Carolina State University, Raleigh, North Carolina 27695, USA; email: [email protected]2Department of Biology, University of Copenhagen, 2100 Copenhagen Ø, Denmark; email: [email protected]3Department of Microbial Ecology, Netherlands Institute of Ecology, 6708 PB Wageningen, The Netherlands; email: [email protected]4Department of Plant and Environmental Sciences, University of Copenhagen, 1871 Frederiksberg C, Denmark; email: [email protected], [email protected], [email protected], [email protected], [email protected]5Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen N, Denmark; email: [email protected]6Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Lyngby, Denmark; email: [email protected]7Department of Computer Science, University of Copenhagen, 2100 Copenhagen Ø, Denmark; email: [email protected]8Plant Science Research Unit, USDA Agricultural Research Service and Department of Crop and Soil Sciences, North Carolina State University, Raleigh, North Carolina 27695, USA; email: [email protected]
      Annual Review of Plant Biology Vol. 72: 823 - 846
      • ...and neural networks have been applied in a wide array of fields within the last 50 years (for an overview, see 75)....
    • Applications of Machine and Deep Learning in Adaptive Immunity

      Margarita Pertseva,1,2 Beichen Gao,1 Daniel Neumeier,1 Alexander Yermanos,1,3,4 and Sai T. Reddy11Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland; email: [email protected]2Life Science Zurich Graduate School, ETH Zurich and University of Zurich, 8006 Zurich, Switzerland3Department of Pathology and Immunology, University of Geneva, 1205 Geneva, Switzerland4Department of Biology, Institute of Microbiology and Immunology, ETH Zurich, 8093 Zurich, Switzerland
      Annual Review of Chemical and Biomolecular Engineering Vol. 12: 39 - 62
      • ...one of its major limitations is that the feature extraction step can be tedious and often requires domain-specific knowledge (50)....
      • ...DL uses a class of algorithms that find a relevant set of features required to perform a particular task in a more automated manner (50)....
      • ...the computed result of one layer acts as an input to the next layer, resulting in an increasingly abstract data representation (50)....
      • ...Full coverage of DL models is outside the scope of this review; the interested reader could refer to several additional resources (50, 51, 55)....
    • Syntactic Structure from Deep Learning

      Tal Linzen1 and Marco Baroni2,3,41Department of Linguistics and Center for Data Science, New York University, New York, NY 10003, USA; email: [email protected]2Facebook AI Research, Paris 75002, France; email: [email protected]3Catalan Institute for Research and Advanced Studies, Barcelona 08010, Spain4Departament de Traducció i Ciències del Llenguatge, Universitat Pompeu Fabra, Barcelona 08018, Spain
      Annual Review of Linguistics Vol. 7: 195 - 212
      • ...which have been rebranded as deep learning (LeCun et al. 2015), ...
    • Toward Realizing the Promise of Educational Neuroscience: Improving Experimental Design in Developmental Cognitive Neuroscience Studies

      Usha GoswamiCentre for Neuroscience in Education, University of Cambridge, Cambridge CB2 3EB, United Kingdom; email: [email protected]
      Annual Review of Developmental Psychology Vol. 2: 133 - 155
      • ...clusters of real medical symptoms) and then acquire expertise that can exceed that of human operators (for example, in medical diagnosis; see LeCun et al. 2015)....
      • ...“a giraffe is standing in the forest with trees in the background”; LeCun et al. 2015)....
    • Spatial Metabolomics and Imaging Mass Spectrometry in the Age of Artificial Intelligence

      Theodore Alexandrov1,21Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany; email: [email protected]2Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093, USA
      Annual Review of Biomedical Data Science Vol. 3: 61 - 87
      • ...a method that has transformed machine learning by outperforming other methods, first for computer vision and later for other problems (49)....
    • Identifying Regulatory Elements via Deep Learning

      Mira Barshai,1, Eitamar Tripto,2, and Yaron Orenstein11School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel; email: [email protected]2Department of Biomedical Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
      Annual Review of Biomedical Data Science Vol. 3: 315 - 338
      • ...termed deep learning, has been revolutionizing the data science world (45)....
      • ...Prediction accuracy has been improving tremendously for image and text processing tasks (45)....
    • Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome

      Chengsheng Zhu,1 Maximilian Miller,1 Zishuo Zeng,1 Yanran Wang,1 Yannick Mahlich,1 Ariel Aptekmann,1 and Yana Bromberg1,21Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA; email: [email protected], [email protected]2Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA
      Annual Review of Biomedical Data Science Vol. 3: 411 - 432
      • ...a class of machine learning algorithms well suited to processing high-dimensional data, provide new means for this type of analysis (185)....
    • Synaptic Plasticity Forms and Functions

      Jeffrey C. Magee and Christine GrienbergerDepartment of Neuroscience and Howard Hughes Medical Institute, Baylor College of Medicine, Houston, Texas 77030, USA; email: [email protected]
      Annual Review of Neuroscience Vol. 43: 95 - 117
      • ...the learning rules used are essentially the same (Woodrow & Hoff 1960, Rumelhart et al. 1986, LeCun et al. 2015) (Figure 2c–e)....
      • ...While there are relatively straightforward methods to accomplish this in ANNs (Rumelhart et al. 1986, LeCun et al. 2015), ...
    • Opportunities and Challenges for Machine Learning in Materials Science

      Dane Morgan and Ryan JacobsDepartment of Materials Science and Engineering, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA; email: [email protected], [email protected]
      Annual Review of Materials Research Vol. 50: 71 - 103
      • ...The large number of ML models and their many technical details are well covered in many texts and reviews (41–43, 151), ...
    • Machine Learning in Materials Discovery: Confirmed Predictions and Their Underlying Approaches

      James E. Saal,1 Anton O. Oliynyk,2 and Bryce Meredig11Citrine Informatics, Redwood City, California 94063, USA; email: [email protected]2Department of Chemistry and Biochemistry, Manhattan College, Riverdale, New York 10471, USA
      Annual Review of Materials Research Vol. 50: 49 - 69
      • ... and (deep) neural network (NN) (61, 62) algorithms are illustrated conceptually in Figure 5. ...
    • Machine Learning for Molecular Simulation

      Frank Noé,1,2,3 Alexandre Tkatchenko,4 Klaus-Robert Müller,5,6,7 and Cecilia Clementi1,3,81Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany; email: [email protected]2Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany3Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA; email: [email protected]4Physics and Materials Science Research Unit, University of Luxembourg, 1511 Luxembourg, Luxembourg; email: [email protected]5Department of Computer Science, Technical University Berlin, 10587 Berlin, Germany; email: [email protected]6Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany7Department of Brain and Cognitive Engineering, Korea University, Seoul 136-713, South Korea8Department of Physics, Rice University, Houston, Texas 77005, USA
      Annual Review of Physical Chemistry Vol. 71: 361 - 390
      • ...and we refer to the literature for an introduction to statistical learning theory (3, 4) and deep learning (5, 6)....
    • Machine-Learning Quantum States in the NISQ Era

      Giacomo Torlai1 and Roger G. Melko2,31Center for Computational Quantum Physics, Flatiron Institute, New York, NY 10010, USA; email: [email protected]2Department of Physics and Astronomy, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada; email: [email protected]3Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada
      Annual Review of Condensed Matter Physics Vol. 11: 325 - 344
      • ...Artificial neural networks, the bedrock of modern machine learning and artificial intelligence (8), ...
    • Statistical Mechanics of Deep Learning

      Yasaman Bahri,1 Jonathan Kadmon,2 Jeffrey Pennington,1 Sam S. Schoenholz,1 Jascha Sohl-Dickstein,1 and Surya Ganguli1,21Google Brain, Google Inc., Mountain View, California 94043, USA2Department of Applied Physics, Stanford University, Stanford, California 94035, USA; email: [email protected]
      Annual Review of Condensed Matter Physics Vol. 11: 501 - 528
      • ...Deep neural networks, with multiple hidden layers (1), have achieved remarkable success across many fields, ...
    • Use of Mechanistic Nutrition Models to Identify Sustainable Food Animal Production

      Mark D. Hanigan1 and Veridiana L. Daley1,21Department of Dairy Science, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, USA; email: [email protected], [email protected]2National Animal Nutrition Program (NANP), Department of Animal & Food Sciences, University of Kentucky, Lexington, Kentucky 40546, USA
      Annual Review of Animal Biosciences Vol. 8: 355 - 376
      • ...outputs, and detection of patterns in the input variables (121); thus, ...
    • Distributional Semantics and Linguistic Theory

      Gemma Boleda1,21Department of Translation and Language Sciences, Universitat Pompeu Fabra, Barcelona 08018, Spain; email: [email protected]2Catalan Institution for Research and Advanced Studies (ICREA), Barcelona 08010, Spain
      Annual Review of Linguistics Vol. 6: 213 - 234
      • ...Neural networks are a type of machine learning algorithm, recently revamped as deep learning (LeCun et al. 2015), ...
    • Big Data and Artificial Intelligence Modeling for Drug Discovery

      Hao ZhuDepartment of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA; email: [email protected]
      Annual Review of Pharmacology and Toxicology Vol. 60: 573 - 589
      • ...The milestone paper of deep learning was published at almost the same time (103), ...
    • Machine Learning for Fluid Mechanics

      Steven L. Brunton,1 Bernd R. Noack,2,3 and Petros Koumoutsakos41Department of Mechanical Engineering, University of Washington, Seattle, Washington 98195, USA2LIMSI (Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur), CNRS UPR 3251, Université Paris-Saclay, F-91403 Orsay, France3Institut für Strömungsmechanik und Technische Akustik, Technische Universität Berlin, D-10634 Berlin, Germany4Computational Science and Engineering Laboratory, ETH Zurich, CH-8092 Zurich, Switzerland; email: [email protected]
      Annual Review of Fluid Mechanics Vol. 52: 477 - 508
      • ...which sparked the current movement in deep learning (LeCun et al. 2015)....
    • Concepts and Compositionality: In Search of the Brain's Language of Thought

      Steven M. Frankland1 and Joshua D. Greene21Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 08544, USA; email: [email protected]2Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138, USA; email: [email protected]
      Annual Review of Psychology Vol. 71: 273 - 303
      • ...proponents of the LoT hypothesis suspect that human comprehension depends on complex semantic representations with internal representations that are far more structurally constrained. LeCun et al. (2015, ...
    • Data-Driven Approaches to Understanding Visual Neuron Activity

      Daniel A. ButtsDepartment of Biology and Program in Neuroscience and Cognitive Science, University of Maryland, College Park, Maryland 20742, USA; email: [email protected]
      Annual Review of Vision Science Vol. 5: 451 - 477
      • ...the recent machine learning–driven successes in computer vision (Kriegeskorte 2015, Krizhevsky et al. 2012, LeCun et al. 2015, Serre 2019) suggest a new range of possible approaches, ...
      • ...making such methods broadly accessible for using DNNs to fit a larger variety of data (LeCun et al. 2015)....
      • ...refer to solving tasks such as object and face recognition and have played a crucial role in driving the development of DNNs (LeCun et al. 2015, Schmidhuber 2015, Serre 2019)....
    • Machine Learning Methods That Economists Should Know About

      Susan Athey1,2,3 and Guido W. Imbens1,2,3,41Graduate School of Business, Stanford University, Stanford, California 94305, USA; email: [email protected], [email protected]2Stanford Institute for Economic Policy Research, Stanford University, Stanford, California 94305, USA3National Bureau of Economic Research, Cambridge, Massachusetts 02138, USA4Department of Economics, Stanford University, Stanford, California 94305, USA
      Annual Review of Economics Vol. 11: 685 - 725
      • ...This suggests that using a deep model expresses a useful preference over the space of functions the model can learn. (LeCun et al. 2015, ...
    • Computational and Informatic Advances for Reproducible Data Analysis in Neuroimaging

      Russell A. Poldrack,1 Krzysztof J. Gorgolewski,1 and Gaël Varoquaux21Department of Psychology, Stanford University, Stanford, California 94305, USA; email: [email protected]2Parietal Team, Inria and NeuroSpin/CEA (Atomic Energy Commission), 91191 Gif/-sur-Yvette, France
      Annual Review of Biomedical Data Science Vol. 2: 119 - 138
      • ...Machine learning has opened new alleys in extracting information from texts, images, genomes, etc. (42), ...
    • Scientific Discovery Games for Biomedical Research

      Rhiju Das,1 Benjamin Keep,2Peter Washington,3 and Ingmar H. Riedel-Kruse31Department of Biochemistry and Department of Physics, Stanford University, Stanford, California 94305, USA; email: [email protected]2Department of Learning Sciences, Stanford University, Stanford, California 94305, USA3Department of Bioengineering, Stanford University, Stanford, California 94305, USA; email: [email protected]
      Annual Review of Biomedical Data Science Vol. 2: 253 - 279
      • ...it will be important to compare results to more recent algorithmic methods for the same visual tasks, which have been improving at an impressive pace (85)....
    • The Challenge of Big Data and Data Science

      Henry E. BradyDepartment of Political Science and Goldman School of Public Policy, University of California, Berkeley, California 94720, USA; email: [email protected]
      Annual Review of Political Science Vol. 22: 297 - 323
      • ...has succeeded at difficult pattern recognition tasks such as speech and image recognition, natural language processing, and bioinformatics (LeCun et al. 2015)....
    • System Identification: A Machine Learning Perspective

      A. Chiuso and G. PillonettoDepartment of Information Engineering, University of Padova, 35131 Padova, Italy; email: [email protected]
      Annual Review of Control, Robotics, and Autonomous Systems Vol. 2: 281 - 304
      • ...which have recently garnered renewed interest thanks to deep networks’ success in classification and pattern recognition (7)....
    • Deep Learning and Its Application to LHC Physics

      Dan Guest,1 Kyle Cranmer,2 and Daniel Whiteson11Department of Physics and Astronomy, University of California, Irvine, California 92697, USA2Physics Department, New York University, New York, NY 10003, USA
      Annual Review of Nuclear and Particle Science Vol. 68: 161 - 181
      • ...when a convergence of techniques enabled training of very large neural networks that greatly outperformed the previous state of the art (2...
    • Invariant Recognition Shapes Neural Representations of Visual Input

      Andrea Tacchetti, Leyla Isik, and Tomaso A. PoggioCenter for Brains, Minds and Machines, MIT, Cambridge, Massachusetts 02139, USA; email: [email protected], [email protected], [email protected]
      Annual Review of Vision Science Vol. 4: 403 - 422
      • ... and the availability of powerful computational models (Serre et al. 2007a, Kriegeskorte 2015, LeCun et al. 2015), ...
      • ...specific instances of this class of models achieved human-level performance on a number of perceptual tasks (Kriegeskorte 2015, LeCun et al. 2015), ...
      • ...and one model with convolutional templates learned by optimizing performance on an action recognition task (LeCun et al. 2015)....
    • Hyperspectral Sensors and Imaging Technologies in Phytopathology: State of the Art

      A.-K. Mahlein,1 M.T. Kuska,2 J. Behmann,2 G. Polder,3 and A. Walter41Institute of Sugar Beet Research (IfZ), 37079 Göttingen, Germany; email: [email protected]2Institute of Crop Science and Resource Conservation (INRES)–Plant Diseases and Plant Protection, University of Bonn, 53115 Bonn, Germany3Greenhouse Horticulture, Wageningen University and Research, 6708PB Wageningen, Netherlands4Institute of Agricultural Sciences, ETH Zürich, 8092 Zürich, Switzerland
      Annual Review of Phytopathology Vol. 56: 535 - 558
      • ...Recently, deep learning arose for data analysis from machine learning (86)....
      • ...the general trend to use deep learning approaches has changed the way of data interpretation in many application fields (86)....
    • Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data

      Pavel Sinitcyn, Jan Daniel Rudolph, and Jürgen CoxComputational Systems Biochemistry Research Group, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; email: [email protected]
      Annual Review of Biomedical Data Science Vol. 1: 207 - 234
      • ...Deep learning (145, 146) is gaining traction in proteomics (75) and will likely find more applications in the future....
    • Defining Phenotypes from Clinical Data to Drive Genomic Research

      Jamie R. Robinson,1,2 Wei-Qi Wei,1 Dan M. Roden,1,3,4 and Joshua C. Denny1,31Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA; email: [email protected]2Department of General Surgery, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA3Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA4Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
      Annual Review of Biomedical Data Science Vol. 1: 69 - 92
      • ...The key aspect of deep learning is that these layers of features are learned from the data rather than designed by domain experts (86)....
    • Toward an Integrative Theory of Thalamic Function

      Rajeev V. Rikhye,1,2 Ralf D. Wimmer,1,3 and Michael M. Halassa1,2,31Department of Brain and Cognitive Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; email: [email protected]2McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA3Stanley Center for Psychiatric Genetics, Broad Institute, Cambridge, Massachusetts 02139, USA
      Annual Review of Neuroscience Vol. 41: 163 - 183
      • ...and recent advances in coupling artificial HCNNs with more efficient learning algorithms have given rise to the revolution of machines that are almost on par with humans in their ability to recognize objects (Hassabis et al. 2017, LeCun et al. 2015). ...
    • Computational Principles of Supervised Learning in the Cerebellum

      Jennifer L. Raymond1 and Javier F. Medina21Department of Neurobiology, Stanford University School of Medicine, Stanford, California 94305, USA; email: [email protected]2Department of Neuroscience, Baylor College of Medicine, Houston, Texas 77030, USA; email: [email protected]
      Annual Review of Neuroscience Vol. 41: 233 - 253
      • ...the process of finding a suitable representation of the input data is called feature engineering and is a critical step that often determines whether the algorithm will succeed or fail (Bengio et al. 2013, LeCun et al. 2015). (c) Instructive signals compose the third element....
    • Machine Learning Approaches for Clinical Psychology and Psychiatry

      Dominic B. Dwyer, Peter Falkai, and Nikolaos KoutsoulerisDepartment of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich 80638, Germany; email: [email protected], [email protected].uni-muenchen.de, [email protected]
      Annual Review of Clinical Psychology Vol. 14: 91 - 118
      • ...The idea of meta-learning is an important concept in fields such as deep learning (LeCun et al. 2015), ...
    • Big Data in Public Health: Terminology, Machine Learning, and Privacy

      Stephen J. Mooney1 and Vikas Pejaver21Harborview Injury Prevention and Research Center, University of Washington, Seattle, Washington 98122, USA; email: [email protected]2Department of Biomedical Informatics and Medical Education and the eScience Institute, University of Washington, Seattle, Washington 98109, USA; email: [email protected]
      Annual Review of Public Health Vol. 39: 95 - 112
      • ...have been used extensively in image classification and natural language processing (68)....
    • Computational Neuroscience: Mathematical and Statistical Perspectives

      Robert E. Kass,1 Shun-Ichi Amari,2 Kensuke Arai,3 Emery N. Brown,4,5 Casey O. Diekman,6 Markus Diesmann,7,8 Brent Doiron,9 Uri T. Eden,3 Adrienne L. Fairhall,10 Grant M. Fiddyment,3 Tomoki Fukai,2 Sonja Grün,7,8 Matthew T. Harrison,11 Moritz Helias,7,8 Hiroyuki Nakahara,2 Jun-nosuke Teramae,12 Peter J. Thomas,13 Mark Reimers,14 Jordan Rodu,15 Horacio G. Rotstein,16,17 Eric Shea-Brown,10 Hideaki Shimazaki,18,19 Shigeru Shinomoto,19 Byron M. Yu,20 and Mark A. Kramer31Department of Statistics, Machine Learning Department, and Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA; email: [email protected]2Mathematical Neuroscience Laboratory, RIKEN Brain Science Institute, Wako, Saitama Prefecture 351-0198, Japan3Department of Mathematics and Statistics, Boston University, Boston, Massachusetts 02215, USA4Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA5Department of Anesthesia, Harvard Medical School, Boston, Massachusetts 02115, USA6Department of Mathematical Sciences, New Jersey Institute of Technology, Newark, New Jersey 07102, USA7Institute of Neuroscience and Medicine, Jülich Research Centre, 52428 Jülich, Germany8Department of Theoretical Systems Neurobiology, Institute of Biology, RWTH Aachen University, 52062 Aachen, Germany9Department of Mathematics, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA10Department of Physiology and Biophysics, University of Washington, Seattle, Washington 98105, USA11Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912, USA12Department of Integrated Theoretical Neuroscience, Osaka University, Suita, Osaka Prefecture 565-0871, Japan13Department of Mathematics, Applied Mathematics, and Statistics, Case Western Reserve University, Cleveland, Ohio 44106, USA14Department of Neuroscience, Michigan State University, East Lansing, Michigan 48824, USA15Department of Statistics, University of Virginia, Charlottesville, Virginia 22904, USA16Federated Department of Biological Sciences, Rutgers University/New Jersey Institute of Technology, Newark, New Jersey 07102, USA17Institute for Brain and Neuroscience Research, New Jersey Institute of Technology, Newark, New Jersey 07102, USA18Honda Research Institute Japan, Wako, Saitama Prefecture 351-0188, Japan19Department of Physics, Kyoto University, Kyoto, Kyoto Prefecture 606-8502, Japan20Department of Electrical and Computer Engineering and Department of Biomedical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
      Annual Review of Statistics and Its Application Vol. 5: 183 - 214
      • ...3.4.5. Deep learning.Deep learning (le Cun et al. 2015) is an outgrowth of PDP modeling (see Section 1.4)....
      • ...receptive fields (le Cun et al. 2015) identify a very specific input pattern, ...
    • Neural Circuitry of Reward Prediction Error

      Mitsuko Watabe-Uchida,1, Neir Eshel,1,2, and Naoshige Uchida11Department of Molecular and Cellular Biology, Center for Brain Science, Harvard University, Cambridge, Massachusetts 02138; email: [email protected], [email protected]2Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, California 94305; email: [email protected]
      Annual Review of Neuroscience Vol. 40: 373 - 394
      • ...versus simple box-and-arrow computations? As is the case in modern artificial neural networks (LeCun et al. 2015), ...
    • Toward a Rational and Mechanistic Account of Mental Effort

      Amitai Shenhav,1,2 Sebastian Musslick,3 Falk Lieder,4 Wouter Kool,5 Thomas L. Griffiths,6 Jonathan D. Cohen,3,7 and Matthew M. Botvinick8,91Department of Cognitive, Linguistic and Psychological Sciences, Brown University, Providence, Rhode Island 02912; email: [email protected]2Brown Institute for Brain Science, Brown University, Providence, Rhode Island 029123Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey 085444Helen Wills Neuroscience Institute, University of California, Berkeley, California 947205Department of Psychology, Harvard University, Cambridge, Massachusetts 021386Department of Psychology, University of California, Berkeley, California 947207Department of Psychology, Princeton University, Princeton, New Jersey 085408Google DeepMind, London M1C 4AG, United Kingdom9Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, United Kingdom
      Annual Review of Neuroscience Vol. 40: 99 - 124
      • ... and is driving the current explosion of interest in deep learning networks within the machine learning community (Bengio et al. 2013, Caruana 1998, LeCun et al. 2015)....
    • The Role of Variability in Motor Learning

      Ashesh K. Dhawale,1,2 Maurice A. Smith,2,3 and Bence P. Ölveczky1,21Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138; email: [email protected]2Center for Brain Science, Harvard University, Cambridge, Massachusetts 021383John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138
      Annual Review of Neuroscience Vol. 40: 479 - 498
      • ...been due to the use of convolutional network architectures that reduce dramatically the dimensionality of the solution space by enforcing highly symmetric patterns in the weights to be learned (LeCun et al. 1998, 2015...
      • ...Another key to the success of deep learning networks has been the use of unsupervised methods to pretrain networks based on the statistics of the input data (Hinton et al. 2006, LeCun et al. 2015, Lee et al. 2009)....
    • Personal Sensing: Understanding Mental Health Using Ubiquitous Sensors and Machine Learning

      David C. Mohr,1 Mi Zhang,2 and Stephen M. Schueller11Center for Behavioral Intervention Technologies and Department of Preventive Medicine, Northwestern University, Chicago, Illinois 60611; email: [email protected], [email protected]2Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824; email: [email protected]
      Annual Review of Clinical Psychology Vol. 13: 23 - 47
      • ...they do not generalize well to challenging problems involving large-scale datasets (LeCun et al. 2015)....
    • Visual Object Recognition: Do We (Finally) Know More Now Than We Did?

      Isabel Gauthier1 and Michael J. Tarr21Department of Psychology, Vanderbilt University, Nashville, Tennessee 37240-7817; email: [email protected]2Department of Psychology, Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213
      Annual Review of Vision Science Vol. 2: 377 - 396
      • ...most typically embodied—as illustrated in Figure 3—in convolutional neural networks (CNNs) (LeCun et al. 2015). (By one estimate, ...
      • ...Figure and caption adapted, with permission, from LeCun et al. (2015)....
    • Early Visual Cortex as a Multiscale Cognitive Blackboard

      Pieter R. Roelfsema1,2,3 and Floris P. de Lange41Netherlands Institute for Neuroscience, 1105 BA Amsterdam, The Netherlands; email: [email protected]2Department of Integrative Neurophysiology, VU University Amsterdam, 1081 HV Amsterdam, The Netherlands3Psychiatry Department, Academic Medical Center, 1105 AZ Amsterdam, The Netherlands4Donders Institute for Brain, Cognition and Behavior, Radboud University, 6525 EN Nijmegen, The Netherlands
      Annual Review of Vision Science Vol. 2: 131 - 151
      • ...Recent progress in deep learning has been made in the recognition of semantic categories in photographs by using neural networks consisting of many layers (LeCun et al. 2015)....
    • Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing

      Nikolaus KriegeskorteMedical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, United Kingdom; email: [email protected]
      Annual Review of Vision Science Vol. 1: 417 - 446
      • ...I argue that recent advances in neural network models (LeCun et al. 2015) will usher in a new era of computational neuroscience, ...

  • 10. 
    Hinton GE, Salakhutdinov RR. 2006. Reducing the dimensionality of data with neural networks. Science 313:504–7
    • Crossref
    • Medline
    • Web of Science ®
    • Google Scholar
    Article Locations:
    • Article Location
    • Article Location
    • Article Location
    • Article Location
    • Article Location
    More AR articles citing this reference

    • Quantitative Molecular Positron Emission Tomography Imaging Using Advanced Deep Learning Techniques

      Habib Zaidi1,2,3,4 and Issam El Naqa5,6,71Division of Nuclear Medicine and Molecular Imaging, Geneva University Hospital, 1211 Geneva, Switzerland; email: [email protected]2Geneva Neuroscience Centre, University of Geneva, 1205 Geneva, Switzerland3Department of Nuclear Medicine and Molecular Imaging, University of Groningen, 9700 RB Groningen, Netherlands4Department of Nuclear Medicine, University of Southern Denmark, DK-5000 Odense, Denmark5Department of Machine Learning, Moffitt Cancer Center, Tampa, Florida 33612, USA6Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan 48109, USA7Department of Oncology, McGill University, Montreal, Quebec H3A 1G5, Canada
      Annual Review of Biomedical Engineering Vol. 23: 249 - 276
      • ...but recent studies have shown that it is most effective with deep neural network (DNN) methods due to their universal approximation nature (42, 43)....
    • Learning in Infancy Is Active, Endogenously Motivated, and Depends on the Prefrontal Cortices

      Gal Raz1 and Rebecca Saxe1,21Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA; email: [email protected]2McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
      Annual Review of Developmental Psychology Vol. 2: 247 - 268
      • ...The unsupervised pretraining phase allows the network to discover statistical structure in the data before ever being exposed to the true labels and so can prevent overfitting and improve generalization (Hinton 2006)....
    • Computational Approaches for Unraveling the Effects of Variation in the Human Genome and Microbiome

      Chengsheng Zhu,1 Maximilian Miller,1 Zishuo Zeng,1 Yanran Wang,1 Yannick Mahlich,1 Ariel Aptekmann,1 and Yana Bromberg1,21Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey 08873, USA; email: [email protected], [email protected]2Department of Genetics, Rutgers University, Piscataway, New Jersey 08854, USA
      Annual Review of Biomedical Data Science Vol. 3: 411 - 432
      • ...Other implementations such as autoencoders [unsupervised artificial neural networks used to learn efficient data encoding (186)] allow researchers to first compress the input dimensionality and train the network in a lower-dimensional space....
    • Invariant Recognition Shapes Neural Representations of Visual Input

      Andrea Tacchetti, Leyla Isik, and Tomaso A. PoggioCenter for Brains, Minds and Machines, MIT, Cambridge, Massachusetts 02139, USA; email: [email protected], [email protected], [email protected]
      Annual Review of Vision Science Vol. 4: 403 - 422
      • ...template dictionaries for convolutional layers are either learned by optimizing performance on supervised (LeCun et al. 1989) or unsupervised (Hinton & Salakhutdinov 2006, Mutch & Lowe 2008) tasks....
    • Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing

      Nikolaus KriegeskorteMedical Research Council Cognition and Brain Sciences Unit, University of Cambridge, Cambridge CB2 7EF, United Kingdom; email: [email protected]
      Annual Review of Vision Science Vol. 1: 417 - 446
      • ...An instructive example of unsupervised learning is provided by autoencoders (Hinton & Salakhutdinov 2006)....
    • Learning Deep Generative Models

      Ruslan SalakhutdinovDepartments of Computer Science and Statistical Sciences, University of Toronto, Toronto M5S 3G4, Canada; email: [email protected]
      Annual Review of Statistics and Its Application Vol. 2: 361 - 385
      • ...speech recognition (Hinton et al. 2012, Mohamed et al. 2012), dimensionality reduction (Hinton & Salakhutdinov 2006, Salakhutdinov & Hinton 2007), ...
      • ...one can easily extend RBMs to the Gaussian–Bernoulli variant (Hinton & Salakhutdinov 2006)....
      • ...predetermined value for σ2 (Nair & Hinton 2009, Hinton & Salakhutdinov 2006)....
      • ...this is the actual algorithm commonly used in practice (Hinton & Salakhutdinov 2006, Taylor et al. 2006, Torralba et al. 2008, Bengio 2009)....
      • ...such as principal component analysis (PCA) and singular value decomposition (SVD) (Hinton & Salakhutdinov 2006)....
      • ...achieved by the DBN described in articles by Hinton & Salakhutdinov (2006)...

  • 11. 
    Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA. 2010. Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J. Mach. Learn. Res. 11:3371–408
    • Web of Science ®
    • Google Scholar
    Article Locations:
    • Article Location
    • Article Location
    More AR articles citing this reference

    • Personal Sensing: Understanding Mental Health Using Ubiquitous Sensors and Machine Learning

      David C. Mohr,1 Mi Zhang,2 and Stephen M. Schueller11Center for Behavioral Intervention Technologies and Department of Preventive Medicine, Northwestern University, Chicago, Illinois 60611; email: [email protected], [email protected]2Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824; email: [email protected]
      Annual Review of Clinical Psychology Vol. 13: 23 - 47
      • ...such as slow feature analysis and stacked autoencoders (Vincent et al. 2010, Wiskott & Sejnowski 2002), ...

  • 12. 
    Nair V, Hinton GE. 2010. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning, pp. 807–14. New York: ACM
    • Google Scholar
    Article Locations:
    • Article Location
    • Article Location
  • 13. 
    Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15:1929–58
  • 14. Ioffe S, Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning, pp. 448–56. New York: ACM
  • 15. Bishop CM. 1995. Neural Networks for Pattern Recognition. Oxford, UK: Oxford Univ. Press
  • 16. Collobert R, Weston J. 2008. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proceedings of the 25th International Conference on Machine Learning, pp. 160–67. New York: ACM
  • 17. Sutskever I, Martens J, Hinton GE. 2011. Generating text with recurrent neural networks. In Proceedings of the 28th International Conference on Machine Learning, pp. 1017–24. New York: ACM
  • 18. Hinton GE, Deng L, Yu D, Dahl GE, Mohamed A, et al. 2012. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Proc. Mag. 29:82–97
  • 19. Szegedy C, Toshev A, Erhan D. 2013. Deep neural networks for object detection. In Proceedings of the 26th Neural Information Processing Systems Conference (NIPS 2013), ed. CJC Burges, L Bottou, M Welling, Z Ghahramani, KQ Weinberger, pp. 2553–61. https://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection
  • 20. Taigman Y, Yang M, Ranzato M, Wolf L. 2014. DeepFace: closing the gap to human-level performance in face verification. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–8. Washington, DC: IEEE
  • 21. Zhang J, Zong C. 2015. Deep neural networks in machine translation: an overview. IEEE Intell. Syst. 30:16–25
  • 22. Karpathy A, Li F. 2015. Deep visual–semantic alignments for generating image descriptions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 3128–37. Washington, DC: IEEE
  • 23. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529:484–89
  • 24. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, et al. 2015. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115:211–52
  • 25. Everingham M, Van Gool L, Williams CKI, Winn J, Zisserman A. 2012. The PASCAL Visual Object Classes Challenge 2012 (VOC2012) results. http://host.robots.ox.ac.uk/pascal/VOC/voc2012/
  • 26. Zhang W, Li R, Deng H, Wang L, Lin W, et al. 2015. Deep convolutional neural networks for multi-modality isointense infant brain image segmentation. NeuroImage 108:214–24
  • 27. Kleesiek J, Urban G, Hubert A, Schwarz D, Maier-Hein K, et al. 2016. Deep MRI brain extraction: a 3D convolutional neural network for skull stripping. NeuroImage 129:460–69
  • 28. Wu G, Kim M, Wang Q, Munsell BC, Shen D. 2016. Scalable high-performance image registration framework by unsupervised deep feature representations learning. IEEE Trans. Biomed. Eng. 63:1505–16
  • 29. Suk HI, Lee SW, Shen D. 2014. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage 101:569–82
  • 30. Shin H, Roberts K, Lu L, Demner-Fushman D, Yao J, Summers RM. 2016. Learning to read chest X-rays: recurrent neural cascade model for automated image annotation. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2497–506. Washington, DC: IEEE
  • 31. Suk HI, Lee SW, Shen D. 2015. Latent feature representation with stacked auto-encoder for AD/MCI diagnosis. Brain Struct. Funct. 220:841–59
  • 32. Suk HI, Shen D. 2015. Deep learning in diagnosis of brain disorders. In Recent Progress in Brain and Cognitive Engineering, ed. SW Lee, HH Bülthoff, KR Müller, pp. 203–13. Berlin: Springer
  • 33. Suk HI, Wee CY, Lee SW, Shen D. 2016. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. NeuroImage 129:292–307
  • 34. Pereira S, Pinto A, Alves V, Silva CA. 2016. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Trans. Med. Imaging 35:1240–51
  • 35. van Tulder G, de Bruijne M. 2016. Combining generative and discriminative representation learning for lung CT analysis with convolutional restricted Boltzmann machines. IEEE Trans. Med. Imaging 35:1262–72
  • 36. Dou Q, Chen H, Yu L, Zhao L, Qin J, et al. 2016. Automatic detection of cerebral microbleeds from MR images via 3D convolutional neural networks. IEEE Trans. Med. Imaging 35:1182–95
  • 37. Cireşan DC, Giusti A, Gambardella LM, Schmidhuber J. 2013. Mitosis detection in breast cancer histological images with deep neural networks. In Proceedings of the 2013 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 411–18. Berlin: Springer
  • 38. Chen H, Dou Q, Wang X, Qin J, Heng PA. 2016. Mitosis detection in breast cancer histology images via deep cascaded networks. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, pp. 1167–73. Palo Alto, CA: AAAI
  • 39. Cheng JZ, Ni D, Chou YH, Qin J, Tiu CM, et al. 2016. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci. Rep. 6:24454
  • 40. Roth HR, Lu L, Liu J, Yao J, Seff A, et al. 2016. Improving computer-aided detection using convolutional neural networks and random view aggregation. IEEE Trans. Med. Imaging 35:1170–81
  • 41. Shen W, Zhou M, Yang F, Yang C, Tian J. 2015. Multi-scale convolutional neural networks for lung nodule classification. In Lecture Notes in Computer Science, vol. 9123: Information Processing in Medical Imaging, pp. 588–99. Berlin: Springer
  • 42. Setio AAA, Ciompi F, Litjens G, Gerke P, Jacobs C, et al. 2016. Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks. IEEE Trans. Med. Imaging 35:1160–69
  • 43. Ciompi F, de Hoop B, van Riel SJ, Chung K, Scholten ET, et al. 2015. Automatic classification of pulmonary peri-fissural nodules in computed tomography using an ensemble of 2D views and a convolutional neural network out-of-the-box. Med. Image Anal. 26:195–202
  • 44. Li R, Zhang W, Suk HI, Wang L, Li J, et al. 2014. Deep learning based imaging data completion for improved brain disease diagnosis. In Proceedings of the 2014 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 305–12. Berlin: Springer
  • 45. Shin HC, Roth HR, Gao M, Lu L, Xu Z, et al. 2016. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans. Med. Imaging 35:1285–98
  • 46. Gupta A, Ayhan M, Maida A. 2013. Natural image bases to represent neuroimaging data. In Proceedings of the 30th International Conference on Machine Learning, pp. 987–94. New York: ACM
  • 47. Brosch T, Tam R. 2013. Manifold learning of brain MRIs by deep learning. In Proceedings of the 2013 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 633–40. Berlin: Springer
  • 48. Nie D, Wang L, Gao Y, Shen D. 2016. Fully convolutional networks for multi-modality isointense infant brain image segmentation. In Proceedings of the 13th IEEE International Symposium on Biomedical Imaging, pp. 1342–45. Washington, DC: IEEE
  • 49. Brosch T, Tang LYW, Yoo Y, Li DKB, Traboulsee A, Tam R. 2016. Deep 3D convolutional encoder networks with shortcuts for multiscale feature integration applied to multiple sclerosis lesion segmentation. IEEE Trans. Med. Imaging 35:1229–39
  • 50. Chen H, Dou Q, Wang X, Qin J, Heng P. 2016. Mitosis detection in breast cancer histological images via deep cascaded networks. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, pp. 1160–66. Palo Alto, CA: AAAI
  • 51. Shin HC, Orton MR, Collins DJ, Doran SJ, Leach MO. 2013. Stacked autoencoders for unsupervised feature learning and multiple organ detection in a pilot study using 4D patient data. IEEE Trans. Pattern Anal. Mach. Intell. 35:1930–43
  • 52. Wu G, Kim M, Wang Q, Gao Y, Liao S, Shen D. 2013. Unsupervised deep feature learning for deformable registration of MR brain images. In Proceedings of the 2013 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 649–56. Berlin: Springer
  • 53. Su H, Xing F, Kong X, Xie Y, Zhang S, Yang L. 2015. Robust cell detection and segmentation in histopathological images using sparse reconstruction and stacked denoising autoencoders. In Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 383–90. Berlin: Springer
  • 54. Xu J, Xiang L, Liu Q, Gilmore H, Wu J, et al. 2016. Stacked sparse autoencoder (SSAE) for nuclei detection on breast cancer histopathology images. IEEE Trans. Med. Imaging 35:119–30
  • 55. Salakhutdinov R. 2015. Learning deep generative models. Annu. Rev. Stat. Appl. 2:361–85
  • 56. Munsell BC, Wee CY, Keller SS, Weber B, Elger C, et al. 2015. Evaluation of machine learning algorithms for treatment outcome prediction in patients with epilepsy based on structural connectome data. NeuroImage 118:219–30
  • 57. Maier O, Schröder C, Forkert ND, Martinetz T, Handels H. 2015. Classifiers for ischemic stroke lesion segmentation: a comparison study. PLOS ONE 10:1–16
  • 58. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, et al. 2017. Brain tumor segmentation with deep neural networks. Med. Image Anal. 35:18–31
  • 59. Ronneberger O, Fischer P, Brox T. 2015. U-net: convolutional networks for biomedical image segmentation. In Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 234–41. Berlin: Springer
  • 60. Fakhry A, Peng H, Ji S. 2016. Deep models for brain EM image segmentation: novel insights and improved performance. Bioinformatics 32:2352–58
  • 61. Farag A, Lu L, Roth HR, Liu J, Turkbey E, Summers RM. 2015. A bottom-up approach for pancreas segmentation using cascaded superpixels and (deep) image patch labeling. arXiv:1505.06236 [cs.CV]
  • 62. Ghesu FC, Krubasik E, Georgescu B, Singh V, Zheng Y, et al. 2016. Marginal space deep learning: efficient architecture for volumetric image parsing. IEEE Trans. Med. Imaging 35:1217–28
  • 63. Wang CW, Huang CT, Lee JH, Li CH, Chang SW, et al. 2016. A benchmark for comparison of dental radiography analysis algorithms. Med. Image Anal. 31:63–76
  • 64. Rosenblatt F. 1958. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65:386–408
  • 65. Rumelhart DE, Hinton GE, Williams RJ. 1986. Learning representations by back-propagating errors. Nature 323:533–36
  • 66. Le QV, Ngiam J, Coates A, Lahiri A, Prochnow B, Ng AY. 2011. On optimization methods for deep learning. In Proceedings of the 28th International Conference on Machine Learning, pp. 265–72. New York: ACM
  • 67. Hornik K. 1991. Approximation capabilities of multilayer feedforward networks. Neural Netw. 4:251–57
  • 68. Schwarz G. 1978. Estimating the dimension of a model. Ann. Stat. 6:461–64
  • 69. Bourlard H, Kamp Y. 1988. Auto-association by multilayer perceptrons and singular value decomposition. Biol. Cybern. 59:291–94
  • 70. Bengio Y, Lamblin P, Popovici D, Larochelle H. 2007. Greedy layer-wise training of deep networks. In Proceedings of the 19th Conference on Neural Information Processing Systems (NIPS 2006), ed. B Schölkopf, JC Platt, T Hoffmann, pp. 153–60. https://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks
  • 71. Larochelle H, Bengio Y, Louradour J, Lamblin P. 2009. Exploring strategies for training deep neural networks. J. Mach. Learn. Res. 10:1–40
  • 72. Hinton GE, Osindero S, Teh YW. 2006. A fast learning algorithm for deep belief nets. Neural Comput. 18:1527–54
    • 73. 
      Smolensky P. 1986. Information processing in dynamical systems: foundations of harmony theory. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, pp. 194–281. Cambridge, MA: MIT Press
    • 74. 
      Hinton GE. 2002. Training products of experts by minimizing contrastive divergence. Neural Comput. 14:1771–800
    • 75. 
      Hinton G, Dayan P, Frey B, Neal R. 1995. The “wake–sleep” algorithm for unsupervised neural networks. Science 268:1158–61
    • 76. 
      LeCun Y, Bottou L, Bengio Y, Haffner P. 1998. Gradient-based learning applied to document recognition. Proc. IEEE 86:2278–324
    • 77. 
      Glorot X, Bengio Y. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, pp. 249–56. Brookline, MA: Microtome
    • 78. 
      Sutskever I, Martens J, Dahl GE, Hinton GE. 2013. On the importance of initialization and momentum in deep learning. In Proceedings of the 30th International Conference on Machine Learning, pp. 1139–47. New York: ACM
    • 79. 
      Glorot X, Bordes A, Bengio Y. 2011. Deep sparse rectifier neural networks. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics, ed. G Gordon, D Dunson, M Dudik, pp. 315–23. Brookline, MA: Microtome
    • 80. 
      Maas AL, Hannun AY, Ng AY. 2013. Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the 30th International Conference on Machine Learning, Workshop on Deep Learning for Audio, Speech, and Language Processing, p. 192. New York: ACM
    • 81. 
      Wan L, Zeiler MD, Zhang S, LeCun Y, Fergus R. 2013. Regularization of neural networks using DropConnect. In Proceedings of the 30th International Conference on Machine Learning, pp. 1056–66. New York: ACM
    • 82. 
      Cho ZH, Kim YB, Han JY, Min HK, Kim KN, et al. 2008. New brain atlas—mapping the human brain in vivo with 7.0 T MRI and comparison with postmortem histology: Will these images change modern medicine? Int. J. Imaging Syst. Technol. 18:2–8
    • 83. 
      Wu G, Qi F, Shen D. 2006. Learning-based deformable registration of MR brain images. IEEE Trans. Med. Imaging 25:1145–57
    • 84. 
      Ou Y, Sotiras A, Paragios N, Davatzikos C. 2011. DRAMMS: deformable registration via attribute matching and mutual-saliency weighting. Med. Image Anal. 15:622–39
    • 85. 
      Sotiras A, Davatzikos C, Paragios N. 2013. Deformable medical image registration: a survey. IEEE Trans. Med. Imaging 32:1153–90
    • 86. 
      Lowe DG. 1999. Object recognition from local scale-invariant features. In Proceedings of the IEEE International Conference on Computer Vision. 8 pp. http://www.cs.ubc.ca/∼lowe/papers/iccv99.pdf
    • 87. 
      Vercauteren T, Pennec X, Perchant A, Ayache N. 2009. Diffeomorphic demons: efficient non-parametric image registration. NeuroImage 45:S61–72
    • 88. 
      Wu G, Kim M, Wang Q, Shen D. 2014. S-HAMMER: hierarchical attribute-guided, symmetric diffeomorphic registration for MR brain images. Hum. Brain Mapp. 35:1044–60
    • 89. 
      Liao S, Gao Y, Oto A, Shen D. 2013. Representation learning: a unified deep learning framework for automatic prostate MR segmentation. In Proceedings of the 2013 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 254–61. Berlin: Springer
    • 90. 
      Guo Y, Gao Y, Shen D. 2016. Deformable MR prostate segmentation via deep feature learning and sparse patch matching. IEEE Trans. Med. Imaging 35:1077–89
    • 91. 
      Liao S, Gao Y, Shi Y, Yousuf A, Karademir I, et al. 2013. Automatic prostate MR image segmentation with sparse label propagation and domain-specific manifold regularization. Inf. Proc. Med. Imaging 23:511–23
    • 92. 
      Kim M, Wu G, Shen D. 2013. Unsupervised deep learning for hippocampus segmentation in 7.0 Tesla MR images. In Lecture Notes in Computer Science, vol. 8184: Machine Learning in Medical Imaging, pp. 1–8. Berlin: Springer
    • 93. 
      Roth HR, Lee CT, Shin HC, Seff A, Kim L, et al. 2015. Anatomy-specific classification of medical images using deep convolutional nets. In Proceedings of the IEEE 12th International Symposium on Biomedical Imaging, pp. 293–303. Washington, DC: IEEE
    • 94. 
      Yan Z, Zhan Y, Peng Z, Liao S, Shinagawa Y, et al. 2015. Bodypart recognition using multi-stage deep learning. In Proceedings of the 24th Conference on Information Processing in Medical Imaging, pp. 449–61. New York: ACM
    • 95. 
      Yan Z, Zhan Y, Peng Z, Liao S, Shinagawa Y, et al. 2016. Multi-instance deep learning: Discover discriminative local anatomies for bodypart recognition. IEEE Trans. Med. Imaging 35:1332–43
    • 96. 
      Maron O, Lozano-Pérez T. 1998. A framework for multiple-instance learning. In Proceedings of Neural Information Processing Systems (NIPS 1998), pp. 570–76. https://papers.nips.cc/paper/1346-a-framework-for-multiple-instance-learning
    • 97. 
      Liu F, Yang L. 2015. A novel cell detection method using deep convolutional neural network and maximum-weight independent set. In Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 349–57. Berlin: Springer
    • 98. 
      Xie Y, Xing F, Kong X, Su H, Yang L. 2015. Beyond classification: structured regression for robust cell detection using convolutional neural network. In Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 358–65. Berlin: Springer
    • 99. 
      Xie Y, Kong X, Xing F, Liu F, Su H, Yang L. 2015. Deep voting: a robust approach toward nucleus localization in microscopy images. In Proceedings of the 2015 Medical Image Computing and Computer-Assisted Intervention Conference, pp. 374–82. Berlin: Springer
    • 100. 
      Sirinukunwattana K, Raza SEA, Tsang YW, Snead DRJ, Cree IA, Rajpoot NM. 2016. Locality sensitive deep learning for detection and classification of nuclei in routine colon cancer histology images. IEEE Trans. Med. Imaging 35:1196–206
    • 101. 
      Long J, Shelhamer E, Darrell T. 2015. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 371–80. Washington, DC: IEEE
    • 102. 
      Moeskops P, Viergever MA, Mendrik AM, de Vries LS, Benders MJNL, Iśgum I. 2016. Automatic segmentation of MR brain images with a convolutional neural network. IEEE Trans. Med. Imaging 35:1252–61
    • 103. 
      Weisenfeld NI, Warfield SK. 2009. Automatic segmentation of newborn brain MRI. NeuroImage 47:564–72
    • 104. 
      Xue H, Srinivasan L, Jiang S, Rutherford M, Edwards AD, et al. 2007. Automatic segmentation and reconstruction of the cortex from neonatal MRI. NeuroImage 38:461–77
    • 105. 
      Gui L, Lisowski R, Faundez T, Hüppi PS, Lazeyras F, Kocher M. 2012. Morphology-driven automatic segmentation of MR images of the neonatal brain. Med. Image Anal. 16:1565–79
    • 106. 
      Warfield S, Kaus M, Jolesz FA, Kikinis R. 2000. Adaptive, template moderated, spatially varying statistical classification. Med. Image Anal. 4:43–55
    • 107. 
      Prastawa M, Gilmore JH, Lin W, Gerig G. 2005. Automatic segmentation of MR images of the developing newborn brain. Med. Image Anal. 9:457–66
    • 108. 
      Wang L, Shi F, Lin W, Gilmore JH, Shen D. 2011. Automatic segmentation of neonatal images using convex optimization and coupled level sets. NeuroImage 58:805–17
    • 109. 
      Wang L, Shi F, Li G, Gao Y, Lin W, et al. 2014. Segmentation of neonatal brain MR images using patch-driven level sets. NeuroImage 84:141–58
    • 110. 
      Wang L, Gao Y, Shi F, Li G, Gilmore JH, et al. 2015. Links: learning-based multi-source integration framework for segmentation of infant brain images. NeuroImage 108:160–72
    • 111. 
      Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y. 2013. OverFeat: integrated recognition, localization and detection using convolutional networks. arXiv:1312.6229 [cs.CV]
    • 112. 
      Gao M, Bagci U, Lu L, Wu A, Buty M, et al. 2016. Holistic classification of CT attenuation patterns for interstitial lung diseases via deep convolutional neural networks. Comput. Methods Biomech. Biomed. Eng. 2016:1–6
    • 113. 
      Krizhevsky A, Sutskever I, Hinton GE. 2012. ImageNet classification with deep convolutional neural networks. In Proceedings of Neural Information Processing Systems (NIPS 2012), pp. 1097–105. https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
    • 114. 
      Krizhevsky A. 2009. Learning multiple layers of features from tiny images. Tech. rep., Dep. Comput. Sci., Univ. Toronto, Can.
    • 115. 
      Simonyan K, Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 [cs.CV]
    • 116. 
      Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, et al. 2015. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9. Washington, DC: IEEE
    • 117. 
      Lee CY, Xie S, Gallagher PW, Zhang Z, Tu Z. 2015. Deeply-supervised nets. In Proceedings of the 18th International Conference on Artificial Intelligence and Statistics, pp. 562–70. Brookline, MA: Microtome
    • 118. 
      Gönen M, Alpaydın E. 2011. Multiple kernel learning algorithms. J. Mach. Learn. Res. 12:2211–68
    • 119. 
      Larochelle H, Bengio Y. 2008. Classification using discriminative restricted Boltzmann machines. In Proceedings of the 25th International Conference on Machine Learning, pp. 536–43. New York: ACM
    • 120. 
      Plis SM, Hjelm D, Salakhutdinov R, Allen EA, Bockholt HJ, et al. 2014. Deep learning for neuroimaging: a validation study. Front. Neurosci. 8:229
    • 121. 
      Kim J, Calhoun VD, Shim E, Lee JH. 2016. Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: evidence from whole-brain resting-state functional connectivity patterns of schizophrenia. NeuroImage 124:127–46

    Footnotes:

    1. In general, the input layer is not counted.

    2.

    3. For simplicity, bias parameters are omitted.

    4. For details, refer to http://ludo17.free.fr/mitos_2012/index.html.

    5. For details, refer to http://mitos-atypia-14.grand-challenge.org/.

    6. D(A, B) = 2|A ∩ B|/(|A| + |B|), where ∩ denotes the set intersection; this is the Dice similarity coefficient.

    7. For details, refer to http://martinos.org/qtim/miccai2013/.
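
    Footnote 6 defines the overlap (Dice) measure used to report segmentation accuracy. As a minimal, self-contained illustration, the Python sketch below computes it for two binary masks; the toy arrays are invented for this example and are not data from the article.

        import numpy as np

        def dice_coefficient(a, b):
            """Dice similarity D(A, B) = 2|A ∩ B| / (|A| + |B|) for binary masks."""
            a = a.astype(bool)
            b = b.astype(bool)
            intersection = np.logical_and(a, b).sum()
            total = a.sum() + b.sum()
            return 2.0 * intersection / total if total > 0 else 1.0

        # Toy 2D masks: 2 overlapping voxels, 3 voxels per mask -> 2*2/(3+3) = 0.667
        prediction = np.array([[1, 1, 0], [0, 1, 0]])
        ground_truth = np.array([[1, 0, 0], [0, 1, 1]])
        print(dice_coefficient(prediction, ground_truth))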

    • Figures

    Figure 1  Architectures of two feed-forward neural networks.
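
    To make the two architectures concrete, the following NumPy sketch runs a forward pass through the two-layer network of Figure 1b (one hidden layer feeding multiple output units). Following footnotes 1 and 3, the input layer is not counted and bias terms are omitted; all sizes and names are illustrative, not taken from the article.

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def two_layer_forward(x, W1, W2):
            h = sigmoid(x @ W1)        # hidden-layer activations
            return sigmoid(h @ W2)     # output-unit activations

        rng = np.random.default_rng(0)
        x = rng.random((1, 4))                 # one 4-dimensional input vector
        W1 = rng.standard_normal((4, 5))       # input-to-hidden weights
        W2 = rng.standard_normal((5, 3))       # hidden-to-output weights (3 output units)
        print(two_layer_forward(x, W1, W2))    # 1x3 array of outputs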

    Figure 2  Three representative deep models with vectorized inputs for unsupervised feature learning. The red links, whether directed or undirected, denote the full connections of units in two consecutive layers but no connections among units in the same layer. Note the differences among models in directed/undirected connections and the directions of the connections that depict conditional relationships.
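
    The common building block of the DBN and DBM in Figure 2b,c is the restricted Boltzmann machine, which is usually trained with the contrastive-divergence approximation of Reference 74. Below is a minimal NumPy sketch of a single CD-1 update for a binary RBM; biases are omitted (footnote 3), and the layer sizes and learning rate are arbitrary choices for illustration only.

        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def cd1_update(v0, W, lr=0.1):
            # Positive phase: hidden probabilities and samples given the data.
            ph0 = sigmoid(v0 @ W)
            h0 = (rng.random(ph0.shape) < ph0).astype(float)
            # Negative phase: one step of block Gibbs sampling (visible, then hidden).
            pv1 = sigmoid(h0 @ W.T)
            ph1 = sigmoid(pv1 @ W)
            # Approximate gradient: data statistics minus model statistics.
            W += lr * (v0.T @ ph0 - pv1.T @ ph1) / v0.shape[0]
            return W

        # Toy usage: 6 visible units, 3 hidden units, a batch of 4 binary vectors.
        W = 0.01 * rng.standard_normal((6, 3))
        batch = rng.integers(0, 2, size=(4, 6)).astype(float)
        for _ in range(100):
            W = cd1_update(batch, W)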

    Figure 3  Three key mechanisms (i.e., local receptive field, weight sharing, and subsampling) in convolutional neural networks.
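
    These three mechanisms can be spelled out in a few lines of NumPy: one shared kernel is applied to every local receptive field of the input (weight sharing), and the resulting feature map is then subsampled. This is a single-channel toy sketch with arbitrary sizes, not any of the networks used in the reviewed studies.

        import numpy as np

        def shared_kernel_conv2d(image, kernel):
            # Slide one shared kernel over every local receptive field ("valid" mode;
            # the kernel is not flipped, as is conventional in CNNs).
            kh, kw = kernel.shape
            H, W = image.shape
            out = np.zeros((H - kh + 1, W - kw + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
            return out

        def subsample_2x2(feature_map):
            # Max-pooling over non-overlapping 2x2 regions (the subsampling mechanism).
            H, W = feature_map.shape
            H, W = H - H % 2, W - W % 2
            fm = feature_map[:H, :W]
            return fm.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

        rng = np.random.default_rng(0)
        image = rng.random((8, 8))
        kernel = rng.standard_normal((3, 3))        # one shared 3x3 filter
        fmap = shared_kernel_conv2d(image, kernel)  # 6x6 feature map
        pooled = subsample_2x2(fmap)                # 3x3 map after subsampling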

    Figure 4  Construction of a deep encoder–decoder via a stacked auto-encoder and visualization of the learned feature representations. The blue circles represent high-level feature representations. The yellow and purple circles indicate the correspondence between layers in the encoder and decoder.
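
    A construction of this kind can be sketched as follows: each auto-encoder is trained greedily on the codes produced by the layer below it, and the trained encoders are then unrolled into a mirrored encoder–decoder. The NumPy code below is a deliberately simplified illustration (tied weights, no biases, plain gradient descent on the squared reconstruction error), not the authors' implementation.

        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def train_autoencoder_layer(X, n_hidden, lr=0.1, epochs=200):
            # One tied-weight auto-encoder: sigmoid encoder, linear decoder.
            W = 0.01 * rng.standard_normal((X.shape[1], n_hidden))
            for _ in range(epochs):
                H = sigmoid(X @ W)                 # encode
                err = H @ W.T - X                  # reconstruction minus input
                grad = X.T @ (err @ W * H * (1 - H)) + err.T @ H
                W -= lr * grad / X.shape[0]
            return W

        # Greedy layer-wise stacking, then unrolling into an encoder-decoder.
        X = rng.random((32, 20))
        weights, codes = [], X
        for n_hidden in (10, 5):
            W = train_autoencoder_layer(codes, n_hidden)
            weights.append(W)
            codes = sigmoid(codes @ W)             # codes feed the next layer

        reconstruction = codes                     # decoder mirrors the encoder
        for W in reversed(weights):
            reconstruction = reconstruction @ W.T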

    Figure 5  Similarity maps identifying the correspondence for the point indicated by the red cross in the template (a) with regard to the subject (b) by hand-designed features (d,e) and by stacked auto-encoder (SAE) features learned through unsupervised deep learning (f). The registered subject image is shown in panel c. Clearly, inaccurate registration results might undermine supervised feature representation learning, which relies strongly on the correspondences across all training images. In panels d–f, the different colors of the voxels indicate their likelihood of being selected as correspondence for their respective locations. Abbreviation: SIFT, scale-invariant feature transform.

    Figure 6  Typical registration results on 7.0-T magnetic resonance images of the brain by (c) Demons (87), (d) HAMMER (88), and (e) HAMMER combined with stacked auto-encoder (SAE)-learned feature representations. The three rows represent three different slices of the template, subject, and registered subjects. The manually labeled hippocampus on the template image and the deformed subject's hippocampus by different registration methods are marked by red and blue contours, respectively.

    Figure 7  Typical prostate segmentation results of two different patients produced by three different feature representations. Red contours indicate manual ground-truth segmentations, and yellow contours indicate automatic segmentations. The second and fourth rows present a three-dimensional (3D) visualization of the segmentation results corresponding to the images above. For each 3D visualization, the red surfaces indicate the automatic segmentation results using different features, such as intensity, hand-designed features, and stacked auto-encoder (SAE)-learned features. The transparent gray surfaces represent ground-truth segmentations.

    Figure 8  The architecture of the fully convolutional network used for tissue segmentation in Reference 48.

    Figure 9  (a) Shared feature learning from patches of different modalities, such as magnetic resonance imaging (MRI) and positron emission tomography (PET), with a discriminative multimodal deep Boltzmann machine (DBM). The yellow circles represent the input patches, and the blue circles show joint feature representation. (b,c) Visualization of the learned weights in Gaussian restricted Boltzmann machines (RBMs) (bottom) and those of the first hidden layer (top) from MRI and PET pathways in a multimodal DBM (29). Each column, with 11 patches in the upper block and the lower block, composes a three-dimensional patch.
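
    As a rough feed-forward caricature of the joint representation in panel a (not the multimodal deep Boltzmann machine of Reference 29 itself), each modality is first passed through its own pathway, and the pathway outputs are then combined in a shared top layer. All patch and layer sizes below are invented for illustration.

        import numpy as np

        rng = np.random.default_rng(0)

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        patch_dim = 11 * 11 * 11                             # flattened 3D patch (hypothetical size)
        mri_patch = rng.random((1, patch_dim))
        pet_patch = rng.random((1, patch_dim))

        W_mri = 0.01 * rng.standard_normal((patch_dim, 64))  # MRI pathway, first hidden layer
        W_pet = 0.01 * rng.standard_normal((patch_dim, 64))  # PET pathway, first hidden layer
        W_joint = 0.01 * rng.standard_normal((128, 32))      # shared layer over both pathways

        h_mri = sigmoid(mri_patch @ W_mri)
        h_pet = sigmoid(pet_patch @ W_pet)
        h_joint = sigmoid(np.concatenate([h_mri, h_pet], axis=1) @ W_joint)  # joint features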

    Figure 10  Functional networks learned from the first hidden layer of the deep auto-encoder from Reference 33. The functional networks in the left column correspond to (from top to bottom) the default-mode network, executive attention network, visual network, subcortical regions, and cerebellum. The functional networks in the right column show the relations among regions of different networks, cortices, and cerebellum.
