Blind Lane Segmentation Method Based on Position-Enhanced and Dual-Path Edge-Enhanced Decoding
翁静梁,王龙业,曾晓莉,刘梦瑶,易婷
Abstract:
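To address the poor segmentation results caused by complex backgrounds, blurred boundaries, and diverse blind lane shapes, a blind lane segmentation method based on position enhancement and dual-path edge-enhanced decoding is proposed. A position-enhanced feature extraction module is designed, which integrates a backbone network built from MobileViTv3 into the downsampling process to strengthen the model's perception of blind lane features and to preserve contextual information. A fused channel-position enhanced attention module is proposed, which strengthens feature extraction along the channel and spatial dimensions respectively and improves the model's ability to separate blind lanes from the background in low-contrast scenes. A dual-path edge-enhanced decoding scheme decodes the region and boundary information of the blind lane, and a joint loss function further refines the handling of boundary details. In addition, because large-scale public blind lane datasets are currently lacking, a multi-scene blind lane dataset (MSBD) was built in-house to provide richer data for model training and experimental validation. Experimental results show that on the MSBD dataset the network achieves an mIoU, Precision, Recall, and F1-score of 96.82%, 96.84%, 96.48%, and 96.66%, respectively, all higher than those of SegFormer, DeepLabv3+, and the other compared networks. With an input image size of 512×512×3, the parameter count and computational cost are 1.73×10^6 and 1.93 GFLOPs, respectively, and the inference frame rate reaches 86 FPS, giving better overall performance than the compared networks. The network's overall metrics on the public blind lane and sidewalk dataset (BACD) and on the Cityscapes dataset are also better than those of the compared networks.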
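The four reported metrics can be illustrated with a minimal sketch. It assumes a two-class setting (background = 0, blind lane = 1), treats the blind lane as the positive class for Precision, Recall, and F1-score, and takes mIoU as the mean of the per-class IoU values. The abstract does not state how the authors average these metrics, so the function name `binary_segmentation_metrics` and the averaging choices are illustrative assumptions rather than the paper's evaluation code.

```python
from typing import Dict, Sequence


def binary_segmentation_metrics(pred: Sequence[int],
                                target: Sequence[int]) -> Dict[str, float]:
    """Pixel-level metrics for flattened 0/1 masks (hypothetical helper).

    Assumption: label 1 is the blind lane (positive class), label 0 is background.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")

    # Confusion-matrix counts with the blind lane as the positive class.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)

    # IoU per class, then the unweighted mean over the two classes.
    iou_lane = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    miou = (iou_lane + iou_background) / 2

    return {"mIoU": miou, "Precision": precision, "Recall": recall, "F1": f1}


# Toy example: 8 pixels, one false positive and one false negative.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = binary_segmentation_metrics(pred, target)
print(metrics)
```

In practice the counts would be accumulated over every image in the test set before dividing, so that small images do not dominate the average.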
Keywords: blind lane segmentation; position enhancement; attention module; multi-scene; boundary
Foundation: National Natural Science Foundation of China (62161047)