

Multi-layer Local Block Coordinate Descent Algorithm and Unfolding Classification and Reconstruction Networks

Wang Jin-Jia, Zhang Yu-Zhen, Xia Jing, Wang Feng-Pin

Citation: Wang Jin-Jia, Zhang Yu-Zhen, Xia Jing, Wang Feng-Pin. Multi-layer local block coordinate descent algorithm and unfolding classification and reconstruction networks. Acta Automatica Sinica, 2020, 46(12): 2647−2661. doi: 10.16383/j.aas.c190540


                    doi: 10.16383/j.aas.c190540
Author biographies:

Wang Jin-Jia: Professor at the School of Information Science and Engineering, Yanshan University. Main research interests: signal processing and pattern recognition. Corresponding author of this paper. E-mail: wjj@ysu.edu.cn

Zhang Yu-Zhen: Master's student at the School of Information Science and Engineering, Yanshan University. Main research interests: signal and information processing. E-mail: 13091375387@163.com

Xia Jing: Master's student at the School of Information Science and Engineering, Yanshan University. Main research interest: signal processing. E-mail: xiajing_527@sina.com

Wang Feng-Pin: Ph.D. candidate at the School of Information Science and Engineering, Yanshan University. Main research interest: pattern recognition. E-mail: landywang1105@163.com

                    Multi-layer Local Block Coordinate Descent Algorithm and Unfolding Classification and Reconstruction Networks

                    Funds: Supported by National Natural Science Foundation of China (61473339), The First Batch of “Top Young Talents in Hebei Province” ([2013]17), Basic Research Cooperation Projects of Beijing, Tianjin and Hebei (F2019203583)
• Abstract: Convolutional sparse coding (CSC) is widely used in signal and image processing, reconstruction, and classification. Motivated by deep learning, the multi-layer basis pursuit (ML-BP) and multi-layer dictionary learning problems of the multi-layer convolutional sparse coding (ML-CSC) model have become active research topics. However, Fourier-domain solvers based on the alternating direction method of multipliers (ADMM) and traditional patch-based spatial-domain basis pursuit algorithms do not extend easily to the multi-layer setting. Building on the idea of local slice-based processing, this paper proposes a new multi-layer basis pursuit algorithm: the multi-layer local block coordinate descent (ML-LoBCoD) algorithm. Inspired by the multi-layer iterative soft-thresholding algorithm (ML-ISTA) and its corresponding unfolded network ML-ISTA-Net, the corresponding unfolded network ML-LoBCoD-Net is proposed. ML-LoBCoD-Net performs representation learning, and the deepest-layer convolutional sparse code it outputs is used for classification. In addition, to obtain better signal reconstruction, a new multi-layer slice convolutional reconstruction network (ML-SCRN) is proposed, which reconstructs the signal from its sparse code. The two networks are first validated experimentally in isolation; ML-LoBCoD-Net and ML-SCRN are then cascaded into the combined ML-LoBCoD-SCRN network, which performs image classification and reconstruction simultaneously. Compared with the traditional approach of reconstructing images through fully connected layers, the proposed ML-LoBCoD-SCRN network requires fewer parameters, converges faster, and reconstructs more accurately. For comparison, ML-ISTA and the multi-layer fast iterative soft-thresholding algorithm (ML-FISTA) are built into ML-ISTA-SCRN and ML-FISTA-SCRN. Experiments provide initial evidence that the proposed ML-LoBCoD-SCRN classification and reconstruction network is effective on the MNIST, CIFAR10, and CIFAR100 datasets, with classification accuracy, loss, and signal reconstruction results all better than those of ML-ISTA-SCRN and ML-FISTA-SCRN.
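The soft-thresholding update underlying the ML-ISTA/ML-FISTA baselines mentioned in the abstract can be illustrated with a minimal single-layer sketch. This is plain NumPy for illustration only, not the authors' ML-LoBCoD implementation; the dictionary `D`, the penalty `lam`, and the step size are toy values chosen here:

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of the l1 norm: shrink each entry toward zero by lam.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista_step(gamma, D, x, lam, step):
    # One iterative soft-thresholding update for
    # min_gamma 0.5 * ||x - D @ gamma||^2 + lam * ||gamma||_1
    grad = D.T @ (D @ gamma - x)          # gradient of the data-fidelity term
    return soft_threshold(gamma - step * grad, step * lam)

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)            # unit-norm dictionary atoms
gamma_true = np.zeros(50)
gamma_true[[3, 17, 41]] = [1.0, -2.0, 0.5]
x = D @ gamma_true                        # synthetic signal with a sparse code

step = 1.0 / np.linalg.norm(D, 2) ** 2    # 1/L with L the Lipschitz constant of the gradient
gamma = np.zeros(50)
for _ in range(200):
    gamma = ista_step(gamma, D, x, lam=0.05, step=step)
print(np.linalg.norm(x - D @ gamma))      # residual shrinks as iterations proceed
```

Unfolded networks such as ML-ISTA-Net (and, analogously, ML-LoBCoD-Net) turn a fixed number of such iterations into network layers whose dictionaries and thresholds are learned by backpropagation.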
• Fig. 1 The unfolded ML-LoBCoD-Net based on three iterations of the optimization algorithm

  Fig. 2 Flowchart of one iteration of the j-th layer of the ML-LoBCoD algorithm

  Fig. 3 The j-th slice convolution layer

  Fig. 4 The multi-layer slice convolutional reconstruction network (ML-SCRN)

  Fig. 5 The ML-LoBCoD-SCRN classification and reconstruction network

  Fig. 6 Classification accuracy of ML-LoBCoD-SCRN and LoBCoD-SCRN on the MNIST dataset

  Fig. 7 Loss of ML-LoBCoD-SCRN and LoBCoD-SCRN on the MNIST dataset versus the number of iterations

  Fig. 8 Classification accuracy of ML-LoBCoD-SCRN on the MNIST dataset for different values of $ \rho $

  Fig. 9 Reconstruction results of ML-LoBCoD-SCRN for different values of $ \rho $

  Fig. 10 Loss of ML-LoBCoD-SCRN versus iterations for different values of $ \rho $

  Fig. 11 Classification accuracy of the three methods at $ \rho = 0 $

  Fig. 12 Reconstruction results of the ML-SCRN network

  Fig. 13 Classification comparison of the layer-by-layer relaxation model

  Fig. 14 Comparison of the reconstruction loss of the layer-by-layer relaxation model

  Fig. 15 Comparison of the reconstruction results of the layer-by-layer relaxed ML-LoBCoD-SCRN model

  Fig. 16 Reconstruction results of the two classification-reconstruction networks on the MNIST dataset

  Fig. 17 Loss of the two classification-reconstruction networks on the MNIST dataset

  Fig. 18 Reconstruction results of the three networks on the MNIST dataset

  Fig. 19 Reconstruction results of the three networks on the CIFAR10 dataset

  Fig. 20 Reconstruction results of the three networks on the CIFAR100 dataset

  Fig. 21 Classification accuracy of the three classification-reconstruction networks on the MNIST dataset

  Fig. 22 Classification accuracy of the three classification-reconstruction networks on the CIFAR10 dataset

  Fig. 23 Loss of the three classification-reconstruction networks on the MNIST dataset

  Fig. 24 Loss of the three classification-reconstruction networks on the CIFAR10 dataset

Table 1 Classification accuracy (%) of several classification networks after 100 iterations

  Model        ACC (MNIST)   ACC (CIFAR10)
  CNN          98.74         79.00
  ML-ISTA      99.11         82.93
  ML-FISTA     99.16         82.79
  ML-LISTA     98.80         82.68
  LBP          99.19         80.73
  ML-LoBCoD    99.15         85.53

Table 2 Comparison of the two classification-reconstruction networks after 100 iterations

  Metric                                 ML-LoBCoD-SCRN   ML-LoBCoD-FC
  Classification accuracy, ρ = 0 (%)     99.15            98.91
  Reconstruction error                   3.03×10−6        1.38×10−5
  Average PSNR (dB)                      30.77            22.79
  Training time                          1 h 47 min       2 h 34 min

Table 3 Comparison of the parameter counts of the two classification-reconstruction networks

  Layer       ML-LoBCoD-SCRN       ML-LoBCoD-FC
  1st layer   6×6×1×64+64          6×6×1×64+64
  2nd layer   6×6×64×128+128       6×6×64×128+128
  3rd layer   4×4×128×512+512      4×4×128×512+512
  4th layer   512×10+10            512×10+512×784+784
  Total       1 352 330            1 753 818
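The per-layer entries in Table 3 are standard convolution parameter counts: kernel height × kernel width × input channels × output channels, plus one bias per output channel. A quick check of the ML-LoBCoD-SCRN column (the helper name `conv_params` is ours, for illustration):

```python
def conv_params(kh, kw, cin, cout):
    # Convolution weights plus one bias per output channel.
    return kh * kw * cin * cout + cout

# The three convolutional layers from Table 3, then the 512x10 classifier.
convs = [(6, 6, 1, 64), (6, 6, 64, 128), (4, 4, 128, 512)]
fc = 512 * 10 + 10
total = sum(conv_params(*c) for c in convs) + fc
print(total)  # 1351626
```

This sums to 1 351 626; the published total of 1 352 330 is larger by 704 = 64 + 128 + 512, which would be consistent with one extra learned per-channel parameter (e.g. a soft-threshold) in each convolutional layer — that reading is our assumption, not stated in the table.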

Table 4 Comparison of the three networks after 100 iterations on the MNIST, CIFAR10, and CIFAR100 datasets

  Model            Classification accuracy (%)   Reconstruction error (×10−6)   Running time                     Average PSNR (dB)
                   MNIST   CIFAR10  CIFAR100     MNIST   CIFAR10  CIFAR100      MNIST     CIFAR10   CIFAR100     MNIST   CIFAR10  CIFAR100
  ML-LoBCoD-SCRN   98.90   84.40    83.41        3.03    1.44     3.14          1 h 47 m  1 h 12 m  0 h 57 m     30.77   32.46    29.97
  ML-ISTA-SCRN     98.65   82.62    81.26        3.87    5.28     6.75          1 h 54 m  1 h 20 m  1 h 00 m     26.51   28.21    27.00
  ML-FISTA-SCRN    98.41   83.48    80.34        3.42    6.52     8.95          1 h 56 m  1 h 25 m  1 h 05 m     29.75   27.63    25.14
Publication history
• Received: 2019-07-19
• Accepted: 2019-12-23
• Published online: 2020-01-17
• Published in issue: 2020-12-29
