《電子技術(shù)應(yīng)用》
A Lightweight Image Super-Resolution Method Based on a Hybrid CNN-Transformer Architecture
網(wǎng)絡(luò)安全與數(shù)據(jù)治理 (Cyber Security and Data Governance)
Lin Chenghao, Wu Lijun
School of Physics and Information Engineering, Fuzhou University
Abstract: To address the high computational cost typically required by image super-resolution models built on hybrid architectures, this paper proposes STSR (Swin Transformer based Single Image Super Resolution), a lightweight image super-resolution network with a hybrid CNN-Transformer architecture. First, a Feature Enhancement Block (FEB) with parallel feature extraction is proposed, in which a Convolutional Neural Network (CNN) branch and a lightweight Transformer branch extract features from the input image in parallel and the extracted features are then fused. Second, a Dynamic Adjustment (DA) module is designed, which allows the network to adjust its output dynamically according to the input image and reduces its dependence on irrelevant information. Finally, the network is evaluated on benchmark datasets; the experimental results show that STSR maintains good reconstruction quality while reducing the number of model parameters.
CLC number: TP391    Document code: A    DOI: 10.19358/j.issn.2097-1788.2024.03.005
Citation: Lin Chenghao, Wu Lijun. A lightweight image super resolution method based on a hybrid CNN-Transformer architecture[J]. 網(wǎng)絡(luò)安全與數(shù)據(jù)治理, 2024, 43(3): 27-33.
Key words: image super-resolution; lightweight network; Convolutional Neural Network; Transformer
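
As a rough illustration of the parallel feature extraction described in the abstract, the following PyTorch sketch runs a CNN branch and a single self-attention layer (standing in for the lightweight Transformer branch) over the same feature map and fuses the two outputs with a 1×1 convolution. The class name, layer choices, channel count and fusion scheme are illustrative assumptions, not the paper's actual FEB definition.

```python
# Sketch of parallel CNN / Transformer feature extraction with fusion.
# All design choices here are assumptions for illustration only.
import torch
import torch.nn as nn


class FeatureEnhancementSketch(nn.Module):
    def __init__(self, channels: int = 64, heads: int = 4):
        super().__init__()
        # CNN branch: local feature extraction with stacked 3x3 convolutions.
        self.cnn_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # "Transformer" branch: one self-attention layer over flattened
        # spatial positions, a lightweight stand-in for Swin-style blocks.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # Fusion: concatenate both branches and mix with a 1x1 convolution.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.cnn_branch(x)

        tokens = x.flatten(2).transpose(1, 2)        # (B, H*W, C)
        tokens = self.norm(tokens)
        glob, _ = self.attn(tokens, tokens, tokens)  # global self-attention
        glob = glob.transpose(1, 2).reshape(b, c, h, w)

        # Residual fusion of local and global features.
        return self.fuse(torch.cat([local, glob], dim=1)) + x


if __name__ == "__main__":
    feat = torch.randn(1, 64, 48, 48)
    print(FeatureEnhancementSketch()(feat).shape)  # torch.Size([1, 64, 48, 48])
```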

Introduction

Image super-resolution (SR) is a computer vision task that has attracted wide attention; its goal is to reconstruct a high-quality high-resolution (HR) image from a low-resolution (LR) image [1]. Because reconstructing a high-quality HR image is an ill-posed problem, the task is highly challenging [2]. With the rise of deep learning, many methods based on convolutional neural networks (CNNs) have been introduced to image super-resolution [3-6]. SRCNN [3] was the first to apply a CNN to this task: it learns feature representations of the image and progressively extracts higher-level features by stacking convolutional layers, producing reconstructions of relatively high quality. In subsequent work, Kaiming He et al. proposed the residual structure ResNet [5], whose skip connections allow gradients to propagate across layers, alleviating the vanishing-gradient problem and keeping performance stable even in very deep networks. Bee Lim et al. also adopted residual structures in EDSR [6], which is essentially an improved version of SRResNet [7]: it removes the BN layers of the conventional residual network and uses the saved capacity to enlarge the model and strengthen its representational power. RCAN [8] proposed a deep residual network built on a Residual-in-Residual (RIR) structure and a channel attention (CA) mechanism. Although these models achieved good results at the time, they are all essentially CNN-based: the size of the convolution kernels limits the spatial range they can cover, so long-range dependencies cannot be captured. This means they extract only local features and cannot obtain global information, which hinders the recovery of texture details and degrades reconstruction quality [5].
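
To make the residual idea concrete, here is a minimal PyTorch sketch of an EDSR-style residual block: unlike the original ResNet block it contains no batch-normalization layers, and the identity skip connection lets gradients bypass the convolutions. The channel count and the residual scaling factor are illustrative assumptions, not values taken from the cited papers.

```python
# Minimal EDSR-style residual block without batch normalization.
# Channel count and residual scale are illustrative assumptions.
import torch
import torch.nn as nn


class ResidualBlockNoBN(nn.Module):
    def __init__(self, channels: int = 64, res_scale: float = 0.1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.res_scale = res_scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Identity skip connection: output = x + scaled residual.
        return x + self.res_scale * self.body(x)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(ResidualBlockNoBN()(x).shape)  # torch.Size([1, 64, 32, 32])
```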


The full text of this article can be downloaded at:

http://forexkbc.com/resource/share/2000005931


Author information:

Lin Chenghao, Wu Lijun

School of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, Fujian, China



This content is original to the AET website and may not be reproduced without authorization.