In this paper we propose a novel approach to low-light image enhancement using a transformer-based Swin-Unet and a perceptually driven loss that incorporates Learned Perceptual Image Patch Similarity (LPIPS), a deep-feature distance aligned with human visual judgements. Specifically, our U-shaped Swin-Unet applies shifted-window self-attention across scales with skip connections and multi-scale fusion, mapping a low-light RGB image to its enhanced version in a single pass. Training uses a compact objective combining Smooth-L₁, LPIPS (AlexNet), MS-SSIM (detached), inverted PSNR, channel-wise colour consistency, and Sobel-gradient terms, with a small LPIPS weight chosen via ablation. This design addresses the limits of purely pixel-wise losses by integrating perceptual and structural components to produce visually superior results. Experiments on LOL-v1, LOL-v2, and SID show that, while our Swin-Unet does not surpass the current state of the art on standard metrics, the LPIPS-based loss significantly improves perceptual quality and visual fidelity. These results confirm the viability of transformer-based U-Net architectures for low-light enhancement, particularly in resource-constrained settings, and suggest exploring larger variants and further tuning of the loss weights in future work.
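The composite objective described above can be sketched as follows. This is an illustrative NumPy re-implementation of the directly computable terms only: the LPIPS and MS-SSIM components require pretrained networks and are omitted, and the weights in `total_loss` and the `beta` parameter are placeholder values for demonstration, not the ablated settings from the paper.

```python
import numpy as np

# 3x3 Sobel kernels for the gradient term
SOBEL_X = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
SOBEL_Y = SOBEL_X.T

def conv2(img, k):
    """Naive 'valid' 2-D correlation with a 3x3 kernel."""
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:H - 2 + i, j:W - 2 + j]
    return out

def smooth_l1(pred, target, beta=1.0):
    # quadratic below beta, linear above (Huber-style)
    d = np.abs(pred - target)
    return np.mean(np.where(d < beta, 0.5 * d**2 / beta, d - 0.5 * beta))

def inverted_psnr(pred, target, max_val=1.0):
    # higher PSNR (better reconstruction) -> smaller loss
    mse = max(np.mean((pred - target) ** 2), 1e-12)
    psnr = 10.0 * np.log10(max_val**2 / mse)
    return 1.0 / max(psnr, 1e-6)

def sobel_loss(pred, target):
    # L1 distance between Sobel gradient magnitudes, averaged over channels
    loss = 0.0
    for c in range(pred.shape[-1]):
        gp = np.hypot(conv2(pred[..., c], SOBEL_X), conv2(pred[..., c], SOBEL_Y))
        gt = np.hypot(conv2(target[..., c], SOBEL_X), conv2(target[..., c], SOBEL_Y))
        loss += np.mean(np.abs(gp - gt))
    return loss / pred.shape[-1]

def color_consistency(pred, target):
    # channel-wise mean colour of the output should match the reference
    return np.mean(np.abs(pred.mean(axis=(0, 1)) - target.mean(axis=(0, 1))))

def total_loss(pred, target, w=(1.0, 0.1, 0.05, 0.05)):
    # placeholder weights; the paper selects its LPIPS weight via ablation
    return (w[0] * smooth_l1(pred, target)
            + w[1] * inverted_psnr(pred, target)
            + w[2] * sobel_loss(pred, target)
            + w[3] * color_consistency(pred, target))

# toy example: a reference patch and a slightly noisy "enhanced" output
rng = np.random.default_rng(0)
target = rng.random((32, 32, 3))
pred = np.clip(target + 0.05 * rng.standard_normal(target.shape), 0.0, 1.0)
print(total_loss(pred, target))
```

In a real training loop each term would be expressed with differentiable tensor operations so gradients flow back through the network; the NumPy version above only illustrates how the individual terms combine into one scalar objective.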
References
A. Brateanu, R. Balmez, A. Avram, C. Orhei, and C. Ancuti. LYT-NET: Lightweight YUV transformer-based network for low-light image enhancement. IEEE Signal Processing Letters 32:2065-2069, 2025. https://doi.org/10.1109/LSP.2025.3563125. (Crossref)
Y. Cai, H. Bian, J. Lin, H. Wang, R. Timofte, et al. Retinexformer: One-stage Retinex-based transformer for low-light image enhancement. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 12504-12513, 2023. https://doi.org/10.1109/ICCV51070.2023.01149. (Crossref)
H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, et al. Swin-Unet: Unet-like pure transformer for medical image segmentation. In: Computer Vision - ECCV 2022 Workshops, vol. 13803 of Lecture Notes in Computer Science, 2023. https://doi.org/10.1007/978-3-031-25066-8_9. (Crossref)
C. Chen, Q. Chen, J. Xu, and V. Koltun. Learning to see in the dark. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), pp. 3291-3300, 2018. https://doi.org/10.1109/CVPR.2018.00347. (Crossref)
H. Chen, Y. Wang, T. Guo, C. Xu, Y. Deng, et al. Pre-trained image processing transformer. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), pp. 12299-12310, 2021. https://doi.org/10.1109/CVPR46437.2021.01212. (Crossref)
Z. Cui, K. Li, L. Gu, S. Su, P. Gao, et al. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. In: 33rd British Machine Vision Conference (BMVC 2022), 2022. https://bmvc2022.mpi-inf.mpg.de/238/.
C.-M. Fan, T.-J. Liu, and K.-H. Liu. Half wavelet attention on M-Net+ for low-light image enhancement. In: 2022 IEEE International Conference on Image Processing (ICIP 2022), pp. 3878-3882, 2022. https://doi.org/10.1109/ICIP46576.2022.9897503. (Crossref)
Y. Feng, C. Zhang, P. Wang, P. Wu, Q. Yan, et al. You only need one color space: An efficient network for low-light image enhancement. arXiv, arXiv:2402.05809, 2024. https://doi.org/10.48550/arXiv.2402.05809.
X. Fu, Y. Liao, D. Zeng, Y. Huang, X.-P. Zhang, et al. A probabilistic method for image enhancement with simultaneous illumination and reflectance estimation. IEEE Transactions on Image Processing 24(12):4965-4977, 2015. https://doi.org/10.1109/TIP.2015.2474701. (Crossref)
Z. Gu, F. Li, F. Fang, and G. Zhang. A novel Retinex-based fractional-order variational model for images with severely low light. IEEE Transactions on Image Processing 29:7233-7247, 2020. https://doi.org/10.1109/TIP.2019.2958144. (Crossref)
C. Guo, C. Li, J. Guo, C. C. Loy, J. Hou, et al. Zero-reference deep curve estimation for low-light image enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 1780-1789, 2020. https://doi.org/10.1109/CVPR42600.2020.00185. (Crossref)
X. Guo and Q. Hu. Low-light image enhancement via breaking down the darkness. International Journal of Computer Vision 131:48-66, 2023. https://doi.org/10.1007/s11263-022-01667-9. (Crossref)
H. Hou, Y. Hou, Y. Shi, B. Wei, and J. Xu. NLHD: A pixel-level non-local Retinex model for low-light image enhancement. arXiv, arXiv:2106.06971, 2021. https://doi.org/10.48550/arXiv.2106.06971.
J. H. Jang, Y. Bae, and J. B. Ra. Contrast-enhanced fusion of multisensor images using subband-decomposed multiscale Retinex. IEEE Transactions on Image Processing 21(8):3479-3490, 2012. https://doi.org/10.1109/TIP.2012.2197014. (Crossref)
Y. Jiang, X. Gong, D. Liu, Y. Cheng, C. Fang, et al. EnlightenGAN: Deep light enhancement without paired supervision. IEEE Transactions on Image Processing 30:2340-2349, 2021. https://doi.org/10.1109/TIP.2021.3051462. (Crossref)
A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25 (NIPS 2012), vol. 25, pp. 1097-1105, 2012. https://proceedings.neurips.cc/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html.
E. H. Land. The Retinex theory of color vision. Scientific American 237(6):108-128, 1977. https://doi.org/10.1038/scientificamerican1277-108. (Crossref)
J. Li, J. Li, F. Fang, F. Li, and G. Zhang. Luminance-aware pyramid network for low-light image enhancement. IEEE Transactions on Multimedia 23:3153-3165, 2021. https://doi.org/10.1109/TMM.2020.3021243. (Crossref)
R. Liu, L. Ma, J. Zhang, X. Fan, and Z. Luo. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement. In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2021), pp. 10561-10570, 2021. https://doi.org/10.1109/CVPR46437.2021.01042. (Crossref)
Y. Liu, T. Huang, W. Dong, F. Wu, X. Li, et al. Low-light image enhancement with multi-stage residue quantization and brightness-aware attention. In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV 2023), pp. 12106-12115, 2023. https://doi.org/10.1109/ICCV51070.2023.01115. (Crossref)
Z. Liu, H. Hu, Y. Lin, Z. Yao, Z. Xie, et al. Swin transformer V2: Scaling up capacity and resolution. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 11999-12009, 2022. https://doi.org/10.1109/CVPR52688.2022.01170. (Crossref)
A. Mittal, A. K. Moorthy, and A. C. Bovik. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing 21(12):4695-4708, 2012. https://doi.org/10.1109/TIP.2012.2214050. (Crossref)
A. Mittal, R. Soundararajan, and A. C. Bovik. Making a "completely blind" image quality analyzer. IEEE Signal Processing Letters 20(3):209-212, 2013. https://doi.org/10.1109/LSP.2012.2227726. (Crossref)
S. Moran, P. Marza, S. McDonagh, S. Parisot, and G. Slabaugh. DeepLPF: Deep Local Parametric Filters for image enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 12823-12832, 2020. https://doi.org/10.1109/CVPR42600.2020.01284. (Crossref)
M. Noyan, A. R. Gostipathy, R. Wightman, and P. Cuenca. timm PyTorch Image Models. In: Hugging Face, 2025. https://huggingface.co/timm.
NVIDIA Corporation. NVIDIA cuDNN. In: NVIDIA DEVELOPER, 2025. https://developer.nvidia.com/cudnn.
S. Park, S. Yu, B. Moon, S. Ko, and J.-I. Paik. Low-light image enhancement using variational optimization-based Retinex model. IEEE Transactions on Consumer Electronics 63(2):178-184, 2017. https://doi.org/10.1109/TCE.2017.014847. (Crossref)
PyTorch. Previous PyTorch Versions, 2025. https://pytorch.org/get-started/previous-versions/.
A. Rogozhnikov. Einops: Clear and reliable tensor manipulations with Einstein-like notation. In: International Conference on Learning Representations (ICLR 2022), 2022. https://openreview.net/forum?id=oapKSVM2bcj.
A. Rogozhnikov. einops, 2025. https://einops.rocks/.
H. Shakibania, S. Raoufi, and H. Khotanlou. CDAN: Convolutional dense attention-guided network for low-light image enhancement. Digital Signal Processing 156:104802, 2025. https://doi.org/10.1016/j.dsp.2024.104802. (Crossref)
A. Wang, Y. Li, J. Peng, Y. Ma, X. Wang, et al. Real-time image enhancer via learnable spatial-aware 3D lookup tables. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), pp. 2451-2460, 2021. https://doi.org/10.1109/ICCV48922.2021.00247. (Crossref)
R. Wang, Q. Zhang, C.-W. Fu, X. Shen, W.-S. Zheng, et al. Underexposed photo enhancement using deep illumination estimation. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), pp. 6842-6850, 2019. https://doi.org/10.1109/CVPR.2019.00701. (Crossref)
T. Wang, K. Zhang, T. Shen, W. Luo, B. Stenger, et al. Ultra-high-definition low-light image enhancement: A benchmark and transformer-based method. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2023), vol. 37 no. 3, pp. 2654-2662, 2023. https://doi.org/10.1609/aaai.v37i3.25364. (Crossref)
Y. Wang, R. Wan, W. Yang, H. Li, L.-P. Chau, et al. Low-light image enhancement with normalizing flow. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2022), vol. 36 no. 3, pp. 2604-2612, 2022. https://doi.org/10.1609/aaai.v36i3.20162. (Crossref)
Z. Wang, X. Cun, J. Bao, W. Zhou, J. Liu, et al. Uformer: A general U-shaped transformer for image restoration. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 17662-17672, 2022. https://doi.org/10.1109/CVPR52688.2022.01716. (Crossref)
C. Wei, W. Wang, W. Yang, and J. Liu. Deep Retinex decomposition for low-light enhancement. In: Proceedings of the British Machine Vision Conference (BMVC 2018), 2018. https://bmva-archive.org.uk/bmvc/2018/contents/papers/0451.pdf.
J. Wen, C. Wu, T. Zhang, Y. Yu, and P. Swierczynski. Self-reference deep adaptive curve estimation for low-light image enhancement. arXiv, arXiv:2308.08197, 2023. https://doi.org/10.48550/arXiv.2308.08197.
K. Xu, X. Yang, B. Yin, and R. W. H. Lau. Learning to restore low-light images via decomposition-and-enhancement. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2020), pp. 2278-2287, 2020. https://doi.org/10.1109/CVPR42600.2020.00235. (Crossref)
X. Xu, R. Wang, C.-W. Fu, and J. Jia. SNR-aware low-light image enhancement. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 17737-17747, 2022. https://doi.org/10.1109/CVPR52688.2022.01719. (Crossref)
W. Yang, S. Wang, Y. Fang, Y. Wang, and J. Liu. Band representation-based semi-supervised low-light image enhancement: Bridging the gap between signal fidelity and perceptual quality. IEEE Transactions on Image Processing 30:3461-3473, 2021. https://doi.org/10.1109/TIP.2021.3062184. (Crossref)
W. Yang, W. Wang, H. Huang, S. Wang, and J. Liu. Sparse gradient regularized deep Retinex network for robust low-light image enhancement. IEEE Transactions on Image Processing 30:2072-2086, 2021. https://doi.org/10.1109/TIP.2021.3050850. (Crossref)
X. Yi, H. Xu, H. Zhang, L. Tang, and J. Ma. Diff-Retinex: Rethinking low-light image enhancement with a generative diffusion model. In: IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6725-6735, 2023. https://doi.org/10.1109/ICCV51070.2023.01130. (Crossref)
D. You, J. Tao, Y. Zhang, and M. Zhang. Low-light image enhancement based on gray scale transformation and improved Retinex. Infrared Technology (Hongwai Jishu) 45(2):161-170, 2023.
S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, et al. Learning enriched features for real image restoration and enhancement. In: European Conference on Computer Vision (ECCV 2020), vol. 12370 of Lecture Notes in Computer Science, pp. 492-511, 2020. https://doi.org/10.1007/978-3-030-58595-2_30. (Crossref)
S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, et al. Restormer: Efficient transformer for high-resolution image restoration. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), pp. 5718-5729, 2022. https://doi.org/10.1109/CVPR52688.2022.00564. (Crossref)
H. Zeng, J. Cai, L. Li, Z. Cao, and L. Zhang. Learning image-adaptive 3D lookup tables for high performance photo enhancement in real-time. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(4):2058-2073, 2022. https://doi.org/10.1109/TPAMI.2020.3005590. (Crossref)
R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang. The unreasonable effectiveness of deep features as a perceptual metric. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018), pp. 586-595, 2018. https://doi.org/10.1109/CVPR.2018.00068. (Crossref)
Y. Zhang, X. Guo, J. Ma, W. Liu, and J. Zhang. Beyond brightening low-light images. International Journal of Computer Vision 129(4):1013-1037, 2021. https://doi.org/10.1007/s11263-020-01407-x. (Crossref)
Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu. Residual dense network for image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(7):2480-2495, 2021. https://doi.org/10.48550/arXiv.1812.10477.
Y. Zhang, J. Zhang, and X. Guo. Kindling the darkness: A practical low-light image enhancer. In: Proceedings of the ACM International Conference on Multimedia (ACM MM), pp. 1632-1640, 2019. https://doi.org/10.1145/3343031.3350926. (Crossref)
D. Zhou, Z. Yang, and Y. Yang. Pyramid diffusion models for low-light image enhancement. arXiv, arXiv:2305.10028, 2023. https://doi.org/10.48550/arXiv.2305.10028. (Crossref)
S. Zhou, C. Li, and C. C. Loy. LEDNet: Joint low-light enhancement and deblurring in the dark. In: European Conference on Computer Vision (ECCV), vol. 13666 of Lecture Notes in Computer Science, pp. 573-589, 2022. https://doi.org/10.1007/978-3-031-20068-7_33. (Crossref)