205 / 2023-09-14 22:20:41
Research on building extraction based on CBAM VGG16-UNet semantic segmentation model
U-Net, VGG16, CBAM, building extraction, WHU building data set
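Full text pending review
Zhiguo Wu / Anhui University of Science and Technology
Xingwang Zhao / Anhui University of Science and Technology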
Accurate acquisition of the two-dimensional contours of buildings is of great significance for three-dimensional building reconstruction, urban change detection, and disaster emergency response. The number of high-resolution remote sensing satellites is steadily increasing, and the high-spatial-resolution images they provide express the texture differences between ground features more fully, which gives strong data support for building extraction. However, building extraction from remote sensing images often suffers from lost edge information, discontinuous contours, and "holes" in large buildings, and from missed and false detections of small buildings. To address these problems, this study proposes CBAM VGG16-UNet, a building extraction network that incorporates a dual-attention mechanism. The network is based on the U-Net architecture. In the downsampling path, the U-Net encoder is replaced with the first five convolutional blocks of VGG16 to increase network depth and reduce the number of parameters. In the upsampling path, the dual-attention mechanism CBAM is introduced at each feature fusion, and the transposed convolution of U-Net is replaced with bilinear interpolation, to improve the network's feature extraction ability. The model was validated on the WHU building dataset and a self-made Guiyang building dataset, with three networks commonly used for building extraction (Mobile-UNet, U-Net, and VGG16-UNet) as comparison models, giving eight sets of experimental results from the four networks on the two datasets.
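The dual-attention step named above can be illustrated with a minimal, dependency-free sketch of a CBAM-style block: channel attention followed by spatial attention, applied to a small feature map. This is not the authors' implementation. The function names, the reduction ratio of 2, the 7x7 kernel, the tensor sizes, and the random weights are all assumptions chosen for illustration, and the abstract does not specify how the block is wired into the decoder.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: [C][H][W]. w1: [C//r][C], w2: [C][C//r] (shared MLP, no bias).
    Returns one weight in (0, 1) per channel."""
    def mlp(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    flat = [[v for row in ch for v in row] for ch in feat]
    avg = [sum(ch) / len(ch) for ch in flat]   # global average pooling
    mx = [max(ch) for ch in flat]              # global max pooling
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_attention(feat, kernel):
    """kernel: [2][k][k], convolved over the channel-wise (mean, max) maps
    with zero padding. Returns an [H][W] map of weights in (0, 1)."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    k = len(kernel[0])
    pad = k // 2
    mean_map = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)]
                for i in range(H)]
    max_map = [[max(feat[c][i][j] for c in range(C)) for j in range(W)]
               for i in range(H)]
    maps = [mean_map, max_map]
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[m][di][dj] * maps[m][y][x]
            out[i][j] = sigmoid(s)
    return out


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to feat."""
    ca = channel_attention(feat, w1, w2)
    refined = [[[ca[c] * v for v in row] for row in ch]
               for c, ch in enumerate(feat)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * v for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in refined]


# Hypothetical sizes and random weights, for illustration only.
random.seed(0)
C, H, W, r, k = 4, 5, 5, 2, 7
feat = [[[random.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
        for _ in range(C)]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
kernel = [[[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
          for _ in range(2)]
out = cbam(feat, w1, w2, kernel)
```

Because both attention maps lie strictly between 0 and 1, the block only rescales features and leaves the tensor shape unchanged, which is why such a block can be dropped into a skip-connection fusion without altering the surrounding layer dimensions.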
Four common semantic segmentation evaluation metrics (Precision, Recall, F1-score, and IoU) were used to analyze the experimental results quantitatively, and visual interpretation was used to compare the extraction results of the four networks on five buildings of widely differing scale selected from each of the two datasets. The experiments show that CBAM VGG16-UNet achieves a precision, recall, F1-score, and IoU of 94.90%, 95.46%, 95.18%, and 90.80% on the WHU building dataset and of 77.53%, 84.46%, 80.85%, and 67.85% on the Guiyang building dataset, outperforming the three comparison models on both datasets. This study offers a new approach to these common problems in building extraction and has engineering application value.
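For reference, the four metrics can be computed from a predicted and a ground-truth binary mask as in the sketch below. It uses the standard pixel-wise definitions; the function names and the toy masks are hypothetical and are not taken from the study. The helper `f1_from` also shows that the reported F1-scores follow from the reported precision and recall values.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise Precision, Recall, F1-score and IoU for binary masks
    (nested lists of 0/1, where 1 marks a building pixel)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1          # building pixel correctly detected
            elif p and not t:
                fp += 1          # false detection
            elif t and not p:
                fn += 1          # missed detection
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, f1, iou


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Toy masks, for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
precision, recall, f1, iou = segmentation_metrics(pred, truth)

# Consistency check against the figures reported in the abstract.
whu_f1 = round(f1_from(94.90, 95.46), 2)       # 95.18
guiyang_f1 = round(f1_from(77.53, 84.46), 2)   # 80.85
```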
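Important dates
  • Conference dates: October 26 to 29, 2023
  • Abstract submission deadline: October 15, 2023
  • First draft submission deadline: October 15, 2023
  • Registration deadline: November 13, 2023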

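Sponsors
International Society for Mine Surveying
China Coal Society
Chinese Society for Geodesy, Photogrammetry and Cartography
Organizers
China University of Mining and Technology
China Coal Technology and Engineering Group Co., Ltd.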