Enhanced U-Net model for accurate aerial road segmentation

Rayene Doghmane
Karima Boukari


Keywords: image analysis; image recognition; normalization techniques; batch group normalization; image processing; convolutional neural network; semantic segmentation; global one-dimensional pooling; BGN-UNet; aerial road detection

Abstract

In computer vision, Convolutional Neural Networks (CNNs) have become a foundation for image analysis. They excel in tasks such as object recognition, classification, and semantic segmentation. Applying normalization techniques within the network is crucial for improving accuracy and overall performance. This paper introduces an approach that incorporates Batch Group Normalization (BGN) into the widely used U-Net for binary semantic segmentation, with a particular focus on aerial road detection. Our research evaluates the performance of the resulting BGN-UNet against traditional normalization techniques, such as Batch Normalization (BN) and Group Normalization (GN). With a batch size of 2, BGN-UNet achieves a Mean IoU of 98.4% in aerial road segmentation, demonstrating its superior accuracy on this task.
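
To make the reported metric concrete, the following is a minimal sketch of how a Mean IoU (mean intersection over union) can be computed for a binary road mask. It is illustrative only: the function name `binary_mean_iou`, the toy masks, and the choice to average over the two classes (background and road) are assumptions, since the abstract does not state how the authors aggregate the score.

```python
from typing import List, Sequence


def binary_mean_iou(pred: Sequence[Sequence[int]],
                    target: Sequence[Sequence[int]]) -> float:
    """Mean IoU over the classes 0 (background) and 1 (road).

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Flatten both 2D masks into aligned lists of pixel labels.
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    ious: List[float] = []
    for cls in (0, 1):
        # Intersection: pixels labeled cls in both masks.
        inter = sum(1 for a, b in zip(p, t) if a == cls and b == cls)
        # Union: pixels labeled cls in at least one mask.
        union = sum(1 for a, b in zip(p, t) if a == cls or b == cls)
        if union > 0:  # skip a class that is absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("masks are empty")
    return sum(ious) / len(ious)


# Toy 2x4 masks (assumed data): the prediction misses one road pixel.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 1, 0]]
miou = binary_mean_iou(pred, target)
print(f"mean IoU = {miou:.3f}")  # road IoU 3/4, background IoU 4/5 -> 0.775
```

Averaging over both classes rewards correct background pixels as well as correct road pixels, so on road imagery, where background dominates, this score can be noticeably higher than the road-class IoU alone.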

How to Cite
Doghmane, R., & Boukari, K. (2024). Enhanced U-Net model for accurate aerial road segmentation. Machine Graphics and Vision, 33(3/4), 71–96. https://doi.org/10.22630/MGV.2024.33.3.4