Multi-Dimensional Feature Fusion Based Runtime Prediction Approach for Cloud Workflow Tasks

LI Hui-Fang, HUANG Jiang-Hang, XU Guang-Hao, XIA Yuan-Qing

Citation: Li Hui-Fang, Huang Jiang-Hang, Xu Guang-Hao, Xia Yuan-Qing. Multi-dimensional feature fusion based runtime prediction approach for cloud workflow tasks. Acta Automatica Sinica, 2021, x(x): 1-12

DOI: 10.16383/j.aas.c210123

Funds: Supported by the National Key Research and Development Program of China (2018YFB1003700) and the National Natural Science Foundation of China (61836001)

Author Bio:

LI Hui-Fang  Associate professor at the School of Automation, Beijing Institute of Technology. Her research interests include Petri nets, workflows, cloud computing, task scheduling, fault diagnosis, and applied deep learning. E-mail: huifang@bit.edu.cn

HUANG Jiang-Hang  Master's student at the School of Automation, Beijing Institute of Technology. His research interests include workflows, cloud computing, and task scheduling. E-mail: 3220190687@bit.edu.cn

XU Guang-Hao  Master's student at the School of Automation, Beijing Institute of Technology. His research interests include workflows, cloud computing, and task scheduling. E-mail: 3220200812@bit.edu.cn

XIA Yuan-Qing  Professor at the School of Automation, Beijing Institute of Technology. His research interests include cloud control, optimized scheduling and management of cloud data centers, intelligent transportation, model predictive control, active disturbance rejection control, flight control, and networked cooperative control for integrated space, air, and ground networks. Corresponding author of this paper. E-mail: xia_yuanqing@bit.edu.cn

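Abstract: Estimating task execution time is a prerequisite for workflow scheduling in cloud data centers. Existing methods for predicting workflow task execution time do not extract features effectively from categorical and numerical data. To address this, this paper proposes a prediction method based on multi-dimensional feature fusion. First, a stacked residual recurrent network with an attention mechanism is constructed to map categorical data from a high-dimensional sparse feature space to a low-dimensional dense one, which strengthens the parsing of categorical data and extracts categorical features effectively. Second, the extreme gradient boosting (XGBoost) algorithm is used to discretize and encode numerical data; sparsifying the dense input vectors in this way improves the nonlinear expressive power of the numerical features. On this basis, a multi-dimensional heterogeneous feature fusion strategy is designed that fuses the extracted categorical and numerical features with the original input features of each sample, and a prediction model built on the fused features predicts cloud workflow task execution time accurately. To verify the effectiveness and superiority of the method, simulation experiments were carried out on a dataset from a real cloud data center cluster. The results show that, compared with existing baseline algorithms, the proposed method achieves better prediction accuracy and can be used for big-data-driven prediction of cloud workflow task execution time.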
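As a rough illustration of the second and third steps described in the abstract, the sketch below encodes numerical features as one-hot tree-leaf indicators (the general idea behind tree-based discretization) and then fuses them with a dense categorical embedding and the raw inputs. Everything in it is hypothetical: the two hand-written trees stand in for a trained XGBoost model, the small embedding table stands in for the output of the attention-based stacked residual recurrent network, and plain concatenation is an assumed fusion operator, since the abstract does not say how the features are combined.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-ins for trained boosted trees: each function maps a
# numerical feature vector to a leaf index. Real trees would come from a
# fitted gradient boosting model, which is not reproduced here.
def tree_a(x: Sequence[float]) -> int:
    # Split on feature 0 first, then on feature 1 (3 leaves in total).
    if x[0] < 2.0:
        return 0
    return 1 if x[1] < 8.0 else 2


def tree_b(x: Sequence[float]) -> int:
    # A single split on feature 1 (2 leaves in total).
    return 0 if x[1] < 4.0 else 1


TREES = [(tree_a, 3), (tree_b, 2)]  # (tree function, number of leaves)

# Hypothetical dense embedding table for one categorical field.
EMBEDDING: Dict[str, List[float]] = {
    "map": [0.1, -0.3],
    "reduce": [0.7, 0.2],
}


def leaf_one_hot(x: Sequence[float]) -> List[float]:
    """Encode x as concatenated one-hot leaf indicators, one block per tree."""
    encoded: List[float] = []
    for tree, n_leaves in TREES:
        block = [0.0] * n_leaves
        block[tree(x)] = 1.0
        encoded.extend(block)
    return encoded


def fuse(category: str, x: Sequence[float]) -> List[float]:
    """Concatenate the categorical embedding, leaf encoding, and raw inputs."""
    return EMBEDDING[category] + leaf_one_hot(x) + list(x)


fused = fuse("reduce", [4.0, 6.0])
print(fused)
```

The fused vector would then be the input to a downstream regressor. The leaf encoding is sparse and piecewise constant, which is what lets a simple downstream model capture nonlinear effects of the numerical inputs.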
Fig. 1  Multi-dimensional feature fusion based runtime prediction model for cloud workflow tasks

Fig. 2  The SARR-based categorical feature extractor

Fig. 3  The XGB-based numerical feature extractor

Fig. 4  MAE comparisons among different methods

Fig. 5  RMSE comparisons among different methods

Fig. 6  RMSLE comparisons among different methods

Fig. 7  R2 comparisons among different methods

Table 1  Differences in prediction performance

Method i  $\delta_i^{MAE}$    $\delta_i^{RMSE}$   $\delta_i^{RMSLE}$  $\delta_i^{R2}$
DIN       1.439               1.825               0.679               0.006
DCN       0.286               4.043               0.048               0.014
DeepFM    0.373               1.811               0.043               0.009
W&D       0.810               3.576               0.141               0.012
TSA       0.942               6.408               0.030               0.025
GBDT+LR   1.257               2.143               0.117               0.007

Table 2  Proportion of performance improvement (%)

Method i  $\eta_i^{MAE}$    $\eta_i^{RMSE}$   $\eta_i^{RMSLE}$  $\eta_i^{R2}$
DIN       36.94             22.06             82.60             0.61
DCN       10.43             36.95             16.49             1.43
DeepFM    13.18             18.85             15.03             0.92
W&D       24.80             34.14             36.72             1.22
TSA       27.72             48.16             10.99             2.59
GBDT+LR   33.85             23.70             32.50             0.71
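For readers unfamiliar with the four metrics behind Tables 1 and 2, the following sketch computes MAE, RMSE, RMSLE, and R2 from their standard textbook definitions, together with a difference and a percentage improvement between a baseline and a proposed method. This page does not define the quantities $\delta$ and $\eta$, so the `improvement` function is only an assumed reading (difference = baseline error minus proposed error, percentage = difference relative to the baseline). The `log(1 + y)` form of RMSLE is a common convention and is likewise assumed. All sample values are invented for illustration and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def mae(y: Sequence[float], p: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)


def rmse(y: Sequence[float], p: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))


def rmsle(y: Sequence[float], p: Sequence[float]) -> float:
    """Root mean squared logarithmic error, using log(1 + value)."""
    sq = sum((math.log1p(b) - math.log1p(a)) ** 2 for a, b in zip(y, p))
    return math.sqrt(sq / len(y))


def r2(y: Sequence[float], p: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def improvement(baseline_err: float, proposed_err: float) -> Tuple[float, float]:
    """Assumed definitions: absolute difference and percentage improvement.

    Intended for error metrics where lower is better. For R2, where higher
    is better, the sign of the difference would need to be flipped.
    """
    delta = baseline_err - proposed_err
    eta = 100.0 * delta / baseline_err
    return delta, eta


# Invented runtimes (seconds) and predictions, for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
baseline_pred = [12.0, 18.0, 33.0, 36.0]
proposed_pred = [11.0, 19.0, 31.0, 39.0]

delta_mae, eta_mae = improvement(mae(y_true, baseline_pred),
                                 mae(y_true, proposed_pred))
print(delta_mae, eta_mae)
print(rmse(y_true, proposed_pred), rmsle(y_true, proposed_pred),
      r2(y_true, proposed_pred))
```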
Publication history
Received: 2021-02-25
Accepted: 2021-06-24
Published online: 2021-08-12
