MSU-Net: Multi-Scale U-Net for 2D Medical Image Segmentation - PubMed
MSU-Net: Multi-Scale U-Net for 2D Medical Image Segmentation
Run Su et al. Front Genet. 2021.
Abstract
To address two limitations of U-Net, the fixed receptive field of its convolution kernels and the lack of a prior for the optimal network width, we propose multi-scale U-Net (MSU-Net) for medical image segmentation. First, multiple convolution sequences are used to extract more semantic features from the images. Second, convolution kernels with different receptive fields are used to make the features more diverse, and the efficient integration of these kernels alleviates the problem of unknown network width. In addition, the multi-scale block is extended to other variants of the original U-Net to verify its universality. MSU-Net is evaluated on five medical image segmentation datasets covering a variety of imaging modalities, including electron microscopy, dermoscopy, and ultrasound. The Intersection over Union (IoU) of MSU-Net on the five datasets is 0.771, 0.867, 0.708, 0.900, and 0.702, respectively. Experimental results show that MSU-Net achieves the best performance on all datasets. Our implementation is available at https://github.com/CN-zdy/MSU_Net.
Keywords: U-net; convolution kernel; medical image segmentation; multi-scale block; receptive field.
Copyright © 2021 Su, Zhang, Liu and Cheng.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Figures
Figure 1
Detailed description of MSU-Net and the multi-scale block (37). Panel (A) is the architecture of MSU-Net; the overall architecture is similar to the original U-Net, and the numbers on the blocks indicate feature dimensions. Panel (B) is the architecture of the multi-scale block (37), which is embedded in the original U-Net to obtain MSU-Net.
Figure 2
The types of convolution kernels used in this article. By combining these seven convolution kernels, different types of multi-scale blocks are proposed.
Figure 3
An overview of 31 multi-scale blocks. m-s block denotes the multi-scale block. Different multi-scale blocks are designed from several commonly used convolution kernels. Richer and more diverse features can be extracted through this design, and the problem of unknown network width can be alleviated, which is conducive to dense prediction tasks that require detailed spatial information.
Figure 4
Detailed description of the multi-scale block. First, 3×3 and 7×7 convolution kernels are used to extract features in parallel. Second, the extracted feature maps are merged by channel concatenation. Finally, the fused features are reduced in dimensionality by a 1×1 convolution and output.
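The data flow in this caption (parallel 3×3 and 7×7 branches, concatenation, 1×1 reduction) can be sketched in plain Python. This is a minimal single-channel illustration, not the paper's implementation: the averaging kernels and the 0.5/0.5 fusion weights stand in for learned parameters, and the function names are ours.

```python
def conv2d(image, kernel):
    """'Same'-padded 2D convolution of a single-channel image (list of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:  # zero padding outside
                        s += image[y][x] * kernel[di][dj]
            out[i][j] = s
    return out

def multi_scale_block(image):
    """Sketch of the figure's data flow: 3x3 branch, 7x7 branch, cat, 1x1 conv."""
    # Averaging kernels stand in for learned 3x3 and 7x7 convolution weights.
    k3 = [[1.0 / 9.0] * 3 for _ in range(3)]
    k7 = [[1.0 / 49.0] * 7 for _ in range(7)]
    f3 = conv2d(image, k3)  # branch with a small receptive field
    f7 = conv2d(image, k7)  # branch with a large receptive field
    # A 1x1 convolution over the 2 concatenated channels is a per-pixel
    # weighted sum; the 0.5/0.5 weights here are illustrative only.
    h, w = len(image), len(image[0])
    return [[0.5 * f3[i][j] + 0.5 * f7[i][j] for j in range(w)]
            for i in range(h)]
```

With 'same' padding in both branches, the output keeps the input's spatial size, which is what allows the two branches to be concatenated channel-wise.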
Figure 5
Arrangement of convolution kernels with different receptive fields and dilated convolution. Panels (A,B) lay out the convolution kernels in different orders. In (C), the large convolution kernel in the multi-scale block (37) is replaced by a dilated convolution, which enlarges the receptive field without increasing the number of parameters.
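The parameter-free enlargement in panel (C) can be checked with simple arithmetic: a k×k kernel with dilation d covers an effective field of k + (k−1)(d−1) while keeping only k² weights. A minimal sketch (the function name is ours, not from the paper):

```python
def effective_kernel_size(k, d):
    """Effective receptive field of a k x k kernel with dilation d.

    E.g. a 3x3 kernel with dilation 3 covers the same 7x7 field as a
    dense 7x7 kernel, but with only 3*3 = 9 weights instead of 49.
    """
    return k + (k - 1) * (d - 1)
```

This is why a dilated 3×3 convolution can substitute for the 7×7 branch of the multi-scale block without adding parameters.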
Figure 6
Residual multi-scale block. Panel (A) is the first structure designed to incorporate residual connections; panel (B) is the second. Experimental results show that structure (A) performs better than (B). Panel (C) describes (A) in detail. The different multi-scale blocks are described in Figure 3. Through the residual connection, the input features are transmitted directly to the deeper layers of the network.
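The residual connection described here is the standard identity shortcut: the block's output is added element-wise to its input. A minimal sketch, assuming the block preserves the spatial size of its input (names are ours):

```python
def residual_multiscale(x, block):
    """Identity shortcut around a feature-extraction block.

    x: single-channel feature map (list of lists of floats)
    block: any function mapping x to a same-sized feature map,
           e.g. a multi-scale block
    """
    y = block(x)
    # Element-wise addition carries the input directly to deeper layers.
    return [[y[i][j] + x[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]
```

With an identity block, the output is simply twice the input, which makes the shortcut's gradient-flow benefit easy to see: the identity path always contributes 1 to the gradient.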
Figure 7
Qualitative comparison among SegNet, DeepLabV3+, U-Net, U-Net++, and MSU-Net, showing the segmentation results of the architectures on five different biomedical image datasets. The red arrows indicate areas of incorrect segmentation. SegNet could not be trained on the EM dataset; therefore, its result on the EM dataset is vacant. The ground truth is illustrated in the second column (from left to right).
Figure 8
ROC curves for different architectures. AUC is the area under the curve.
References
- Alom M. Z., Hasan M., Yakopcic C., Taha T. M., Asari V. K. (2018). Recurrent residual convolutional neural network based on U-Net (R2U-Net) for medical image segmentation. arXiv [preprint]. arXiv:1802.06955. DOI: 10.1109/NAECON.2018.8556686