
Learning to Segment From Scribbles Using Multi-Scale Adversarial Attention Gates



Abstract:

Large, fine-grained image segmentation datasets, annotated at pixel-level, are difficult to obtain, particularly in medical imaging, where annotations also require expert knowledge. Weakly-supervised learning can train models by relying on weaker forms of annotation, such as scribbles. Here, we learn to segment using scribble annotations in an adversarial game. With unpaired segmentation masks, we train a multi-scale GAN to generate realistic segmentation masks at multiple resolutions, while we use scribbles to learn their correct position in the image. Central to the model’s success is a novel attention gating mechanism, which we condition with adversarial signals to act as a shape prior, resulting in better object localization at multiple scales. Subject to adversarial conditioning, the segmentor learns attention maps that are semantic, suppress the noisy activations outside the objects, and reduce the vanishing gradient problem in the deeper layers of the segmentor. We evaluated our model on several medical (ACDC, LVSC, CHAOS) and non-medical (PPSS) datasets, and we report performance levels matching those achieved by models trained with fully annotated segmentation masks. We also demonstrate extensions in a variety of settings: semi-supervised learning; combining multiple scribble sources (a crowdsourcing scenario) and multi-task learning (combining scribble and mask supervision). We release expert-made scribble annotations for the ACDC dataset, and the code used for the experiments, at https://vios-s.github.io/multiscale-adversarial-attention-gates.
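The abstract describes attention gates whose attention maps are conditioned by adversarial signals at multiple decoder scales. Below is a minimal, illustrative sketch (not the authors' released implementation) of such a gate in PyTorch: the module name, parameters, and usage are assumptions made for clarity. The gate produces a single-channel attention map that multiplicatively suppresses activations outside the object, and returns the map so a scale-specific discriminator can provide the adversarial shape prior.

```python
# Illustrative sketch only, assuming PyTorch; names are hypothetical, not from the paper's code.
import torch
import torch.nn as nn


class AttentionGate(nn.Module):
    """Gates decoder features with a 1-channel attention map.

    The attention map is returned alongside the gated features so that a
    discriminator at this scale can judge it, conditioning the gate to behave
    like a semantic shape prior (as the abstract describes).
    """

    def __init__(self, in_channels: int):
        super().__init__()
        self.to_attention = nn.Sequential(
            nn.Conv2d(in_channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, features: torch.Tensor):
        attention = self.to_attention(features)   # B x 1 x H x W, values in [0, 1]
        gated = features * attention               # suppress activations outside the object
        return gated, attention                    # attention map also feeds the adversarial loss


if __name__ == "__main__":
    # Usage sketch: one gate per decoder scale; attention maps are collected
    # across scales for the multi-scale adversarial game (discriminator not shown).
    gate = AttentionGate(in_channels=64)
    feats = torch.randn(2, 64, 32, 32)
    gated, attn = gate(feats)
    print(gated.shape, attn.shape)  # (2, 64, 32, 32) and (2, 1, 32, 32)
```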
Published in: IEEE Transactions on Medical Imaging ( Volume: 40, Issue: 8, August 2021)
Page(s): 1990 - 2001
Date of Publication: 30 March 2021

PubMed ID: 33784616

I. Introduction

Convolutional Neural Networks (CNNs) have obtained impressive results in computer vision. However, their ability to generalize to new examples depends strongly on the amount of training data, which limits their applicability when annotations are scarce. There has been considerable effort to exploit semi-supervised and weakly-supervised strategies. For semantic segmentation, semi-supervised learning (SSL) aims to use unlabeled images, which are generally easier to collect, together with some fully annotated image-segmentation pairs [1], [2]. However, the information in unlabeled data can improve CNNs only under specific assumptions [1], and SSL still requires representative image-segmentation pairs to be available.
