References:
[1] Zhang Tong.Solving large scale linear prediction problems using stochastic gradient descent algorithms[EB/OL].[2020-08-11].https://dl.acm.org/doi/abs/10.1145/1015330.1015332.
[2] Rakhlin A,Shamir O,Sridharan K.Making gradient descent optimal for strongly convex stochastic optimization[EB/OL].[2020-08-11].https://arxiv.org/abs/1109.5647.
[3] Shamir O,Zhang Tong.Stochastic gradient descent for non-smooth optimization:convergence results and optimal averaging schemes[EB/OL].[2020-08-11].http://adsabs.harvard.edu/abs/2012arXiv1212.1824S.
[4] Duchi J,Singer Y.Efficient online and batch learning using forward backward splitting[J].Journal of Machine Learning Research,2009,10:2899-2934.
[5] Luo Zhiquan,Tseng P.On the convergence of the coordinate descent method for convex differentiable minimization[J].Journal of Optimization Theory and Applications,1992,72(1):7-35.
[6] Mangasarian O L,Musicant D R.Successive overrelaxation for support vector machines[J].IEEE Transactions on Neural Networks,1999,10(5):1032-1037.
[7] Hsieh C J,Chang Kaiwei,Lin C J,et al.A dual coordinate descent method for large-scale linear SVM[EB/OL].[2020-08-11].http://dl.acm.org/citation.cfm?id=1390208.
[8] Shalev-Shwartz S,Tewari A.Stochastic methods for l1-regularized loss minimization[J].Journal of Machine Learning Research,2011,12:1865-1892.
[9] Lacoste-Julien S,Jaggi M,Schmidt M,et al.Stochastic block-coordinate Frank-Wolfe optimization for structural SVMs[EB/OL].[2020-08-11].http://arxiv.org/pdf/1207.4747v1.pdf.
[10] Nesterov Y.Efficiency of coordinate descent methods on huge-scale optimization problems[J].SIAM Journal on Optimization,2012,22(2):341-362.
[11] Shalev-Shwartz S,Zhang Tong.Proximal stochastic dual coordinate ascent[EB/OL].[2020-08-11].https://arxiv.org/abs/1211.2717.
[12] Shalev-Shwartz S,Zhang Tong.Stochastic dual coordinate ascent methods for regularized loss minimization[J].Journal of Machine Learning Research,2013,14:567-599.
[13] Zhao Peilin,Zhang Tong.Stochastic optimization with importance sampling for regularized loss minimization[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.5555/3045118.3045120.
[14] Zhang Lingang,Yan Guangle,Lu Xiaowei.Nested partitions method:a new parallel stochastic optimization algorithm[J].Application Research of Computers,2007,24(6):79-81.
[15] Mi Yongqiang,Gao Yuelin.Improved particle swarm optimization algorithm for solving constrained optimization problems[J].Journal of Jiangxi Normal University(Natural Science Edition),2015,39(1):59-63.
[16] Xia Hongwei,Wen Chuanjun.Trust region method for general nonlinear constrained optimization problems[J].Journal of Jiangxi Normal University(Natural Science Edition),2012,36(3):253-256.
[17] Robbins H,Monro S.A stochastic approximation method[J].The Annals of Mathematical Statistics,1951,22(3):400-407.
[18] Shalev-Shwartz S,Zhang Tong.Accelerated mini-batch stochastic dual coordinate ascent[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.5555/2999611.2999654.
[19] Defazio A,Bach F,Lacoste-Julien S.SAGA:a fast incremental gradient method with support for non-strongly convex composite objectives[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.5555/2968826.2969010.
[20] Li Mu,Zhang Tong,Chen Yuqiang,et al.Efficient mini-batch training for stochastic optimization[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.1145/2623330.2623612.
[21] Luo Zhiquan,Tseng P.On the convergence of the coordinate descent method for convex differentiable minimization[J].Journal of Optimization Theory and Applications,1992,72(1):7-35.
[22] Tseng P.Convergence of a block coordinate descent method for nondifferentiable minimization[J].Journal of Optimization Theory and Applications,2001,109(3):475-494.
[23] Saha A,Tewari A.On the nonasymptotic convergence of cyclic coordinate descent methods[J].SIAM Journal on Optimization,2013,23(1):576-601.
[24] Wright S J.Coordinate descent algorithms[J].Mathematical Programming,2015,151(1):3-34.
[25] Gurbuzbalaban M,Ozdaglar A,Parrilo P A,et al.When cyclic coordinate descent outperforms randomized coordinate descent[EB/OL].[2020-08-11].http://mert.lids.mit.edu/w-content/uploads/sites/10/2017/11/CCDvsRCD.pdf.
[26] Nutini J,Schmidt M,Laradji I H,et al.Coordinate descent converges faster with the Gauss-Southwell rule than random selection[EB/OL].[2020-08-11].http://arxiv.org/abs/1506.00552.
[27] Qu Zheng,Richtárik P.Coordinate descent with arbitrary sampling I:algorithms and complexity[J].Optimization Methods and Software,2016,31(5):829-857.
[28] Stich S U,Raj A,Jaggi M.Approximate steepest coordinate descent[C]∥Precup D,Teh Y W.Proceedings of the 34th International Conference on Machine Learning,Sydney,Australia,Aug 6-11,2017.Sydney:PMLR,2017,70:3251-3259.
[29] Nesterov Y.Efficiency of coordinate descent methods on huge-scale optimization problems[J].SIAM Journal on Optimization,2012,22(2):341-362.
[30] Richtárik P,Takáč M.Distributed coordinate descent method for learning with big data[J].Journal of Machine Learning Research,2016,17(75):1-25.
[31] Richtárik P,Takáč M.On optimal probabilities in stochastic coordinate descent methods[J].Optimization Letters,2016,10(6):1233-1243.
[32] Richtárik P,Takáč M.Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function[J].Mathematical Programming,2014,144(1/2):1-38.
[33] Lin Qihang,Lu Zhaosong,Xiao Lin.An accelerated proximal coordinate gradient method[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.5555/2969033.2969168.
[34] Byrd R H,Chin G M,Nocedal J,et al.Sample size selection in optimization methods for machine learning[J].Mathematical Programming,2012,134(1):127-155.
[35] Yang Jieming,Yan Xin,Qu Zhaoyang,et al.Research on under-sampling method based on data density distribution[J].Application Research of Computers,2016,33(10):2997-3000.
[36] Shapiro A,Homem-de-Mello T.On the rate of convergence of optimal solutions of Monte Carlo approximations of stochastic programs[J].SIAM Journal on Optimization,2000,11(1):70-86.
[37] Shapiro A,Wardi Y.Convergence analysis of stochastic algorithms[J].Mathematics of Operations Research,1996,21(3):615-628.
[38] Kleywegt A J,Shapiro A,Homem-de-Mello T.The sample average approximation method for stochastic discrete optimization[J].SIAM Journal on Optimization,2001,12(2):479-502.
[39] Shapiro A,Homem-de-Mello T.A simulation-based approach to two-stage stochastic programming with recourse[J].Mathematical Programming,1998,81(3):301-325.
[40] Homem-de-Mello T.Variable-sample methods for stochastic optimization[J].ACM Transactions on Modeling and Computer Simulation,2003,13(2):108-133.
[41] Bastin F,Cirillo C,Toint P L.An adaptive Monte Carlo algorithm for computing mixed logit estimators[J].Computational Management Science,2006,3(1):55-79.
[42] Sutskever I,Martens J,Dahl G,et al.On the importance of initialization and momentum in deep learning[EB/OL].[2020-08-11].https://dl.acm.org/doi/10.5555/3042817.3043064.
[43] Qian Ning.On the momentum term in gradient descent learning algorithms[J].Neural Networks,1999,12(1):145-151.
[44] Kingma D P,Ba J L.Adam:a method for stochastic optimization[EB/OL].[2020-08-11].https://arxiv.org/abs/1412.6980.