
Uncertainty maps in segmentation from Evidential Deep Learning

Link to our work on the arXiv: To appear soon! 

This work has been selected to be featured in an oral session at the 66th Annual Meeting of the American Association of Physicists in Medicine (AAPM), July 2024.

Segmentation plays a crucial role in radiology-related tasks. For example, segmentation of various organs and target regions on CT, MRI and PET images is a vital step in the radiotherapy treatment planning pipeline. In this project, we examined the usefulness of a deep learning model designed to perform segmentation of MRI images while simultaneously being equipped with an uncertainty quantification framework.

The main uncertainty quantification framework we studied is originally due to Sensoy et al., who invoked elements of Bayesian theory and the Dempster-Shafer theory of evidence to enable neural networks designed for classification tasks to express their prediction uncertainties. Shortly after their seminal work was released, Amini et al. published a highly interesting preprint (Deep Evidential Regression) which can be interpreted as the regression analogue of Sensoy et al.’s work, the latter being devoted to classification contexts. The fundamental ingredient common to both works is the notion of a ‘second-order’ probability distribution which, in the language of Bayesian statistics, can be understood simply as the prior distribution for the original/first-order probabilistic output. In both the models of Sensoy et al. and Amini et al., the neural network’s outputs are the parameters of this prior/evidential/second-order distribution, and the model’s training is guided by a suitable loss function that generalizes the usual mean-squared-error or likelihood function by taking the prior distribution into account.
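
As a concrete illustration (the notation below is ours, chosen for this summary rather than drawn verbatim from those papers): for a binary prediction with first-order Bernoulli parameter \( p \), the second-order distribution can be taken to be a beta prior \( p \sim \mathrm{Beta}(\alpha, \beta) \), with \( \alpha \) and \( \beta \) produced by the network. The predicted probability and a natural measure of uncertainty then follow from the prior’s first two moments:
\[
\mathbb{E}[p] = \frac{\alpha}{\alpha + \beta}, \qquad \mathrm{Var}[p] = \frac{\alpha \beta}{(\alpha + \beta)^{2}(\alpha + \beta + 1)},
\]
so that a small total evidence \( \alpha + \beta \) gives a broad prior and a large variance, signalling that the network has accumulated little evidence for its prediction at that pixel.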

Segmentation of a specific organ/region in an MRI image can be formulated as a classification task, since each pixel can be defined as either belonging to that organ or not. This is a binary classification task; if more than one region is to be identified, the task extends to multi-class classification. A conventional neural network adapted for this purpose would have its output for each pixel mapped to a value between 0 and 1, indicating the probability \( p \) of the pixel being classified as lying within the organ segmentation mask. In evidential deep learning, the probabilistic output \( p \) is treated as the parameter of a Bernoulli distribution, the discrete probability distribution for a random variable that takes the value 1 (within the segmentation mask) with probability \( p \) and the value 0 (outside the mask) with probability \( 1-p \). Further, we specify a prior distribution for \( p \): typically the beta (or Dirichlet) distribution for binary (or multi-class) classification. The neural network’s final outputs are then the parameters of this beta (or Dirichlet) distribution. The training dataset consists, as usual, of the contoured images. The ground truth is specified for the class label of each pixel, but not for the prior distribution’s parameters. However, the loss function depends on both the ground-truth class labels and the prior distribution’s parameters. This is a crucial yet subtle difference between evidential deep learning and standard supervised learning.
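
To make this concrete, below is a minimal, illustrative PyTorch-style sketch of a per-pixel evidential binary segmentation head together with a Bayes-risk (expected squared error) loss in the spirit of Sensoy et al. The module name EvidentialBinaryHead, the softplus evidence mapping and the exact loss form are our assumptions for illustration, not a verbatim description of our implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EvidentialBinaryHead(nn.Module):
    """Maps backbone features to per-pixel beta parameters (alpha, beta) >= 1."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 2, kernel_size=1)

    def forward(self, features: torch.Tensor):
        # Non-negative "evidence" per pixel; adding 1 keeps the beta parameters >= 1.
        evidence = F.softplus(self.conv(features))
        alpha = evidence[:, :1] + 1.0
        beta = evidence[:, 1:] + 1.0
        return alpha, beta


def evidential_bayes_risk(alpha, beta, target):
    """Expected squared error under the Beta(alpha, beta) prior on p.

    target: ground-truth mask in {0, 1}, same spatial shape as alpha/beta.
    Uses E[(y - p)^2] = (y - E[p])^2 + Var[p].
    """
    strength = alpha + beta                                      # total evidence
    p_mean = alpha / strength                                    # E[p]
    p_var = alpha * beta / (strength ** 2 * (strength + 1.0))    # Var[p]
    risk = (target - p_mean) ** 2 + p_var
    return risk.mean()
```

In practice, a KL-divergence regularizer toward a non-informative prior (as used by Sensoy et al.) is typically added to discourage spurious evidence on misclassified pixels; it is omitted above for brevity. A per-pixel uncertainty map can be read off from the trained prior, for instance via its variance or the inverse total evidence.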

In our work, datasets are drawn from the Cardiac and Prostate MRI images curated by the Medical Segmentation Decathlon. We refined the loss function originally presented in Sensoy et al. to improve segmentation accuracy. A main motivation for an uncertainty quantification framework is that it makes us aware of the potential regions of ignorance of the neural network. This works robustly only if there is a good, stable correlation between the uncertainty distribution and the model’s prediction error distribution. Hence, scrutinizing the strength of this correlation is a key focus of our work.

 

On the right is a set of uncertainty heatmaps (red representing higher values), together with their corresponding MRI images, obtained from a trained Deep Evidential Classification framework embedding a U-Net backbone model for segmentation. The input images are multimodal prostate MRI (T2, ADC) datasets from 48 patients, with contours provided by Radboud University, Nijmegen Medical Centre.

In each row, the left picture overlays the model’s contour prediction (closed red loop) on the ground-truth prostate mask in yellow. The right picture shows the uncertainty heatmap separately for clarity. Redder regions represent localized areas of elevated uncertainty.

In the graphs above, we show how the empirical PDFs and CDFs of the uncertainty distributions for falsely and correctly labeled regions were found to be visibly distinct, giving a holistic, quantitative description of how uncertainty estimates are associated with segmentation errors.

Partitioning each contoured axial MRI slice into (i) true positive, (ii) true negative, (iii) false positive and (iv) false negative regions, we graphed the empirical probability distribution functions and cumulative distribution functions of the uncertainty localized separately within each region. We found that falsely labeled regions (yellow and blue curves) exhibited an evidently wider spread of values (with bimodal-like PDFs), whereas correctly labeled regions (red and green curves) are characterized by sharply peaked, Gaussian-like distributions. To further quantify how well the model-derived uncertainty identifies errors, we computed the point-biserial correlation coefficient between the binary false/true label variable and the uncertainty, finding it to be about 0.5, a value that typically indicates moderate-to-strong correlation.
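
As a rough sketch of this analysis step (the function name and array names are hypothetical, assuming per-pixel boolean prediction and ground-truth masks together with a float uncertainty map):

```python
import numpy as np
from scipy.stats import pointbiserialr


def error_uncertainty_correlation(pred_mask, gt_mask, uncertainty):
    """Partition pixels into TP/TN/FP/FN and correlate mislabelling with uncertainty.

    pred_mask, gt_mask: boolean arrays of the same shape (predicted / ground-truth masks).
    uncertainty: float array of the same shape (per-pixel uncertainty).
    """
    tp = pred_mask & gt_mask
    tn = ~pred_mask & ~gt_mask
    fp = pred_mask & ~gt_mask
    fn = ~pred_mask & gt_mask

    # Binary variable: 1 for falsely labeled pixels (FP or FN), 0 for correct ones.
    mislabeled = (fp | fn).astype(int).ravel()
    r, p_value = pointbiserialr(mislabeled, uncertainty.ravel())

    regions = {"TP": uncertainty[tp], "TN": uncertainty[tn],
               "FP": uncertainty[fp], "FN": uncertainty[fn]}
    return r, p_value, regions
```

The empirical PDFs and CDFs for each region can then be obtained from the returned per-region uncertainty values, e.g. via numpy.histogram followed by a cumulative sum.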