[1] LIU Li-yue, HUANG Zhao-hua, LIU Zun-xiong. The Research on Dimensionality Reduction for High-Dimensional Data Classification [J]. Journal of Jiangxi Normal University (Natural Science Edition), 2012, (02): 131-134.

The Research on Dimensionality Reduction for High-Dimensional Data Classification
Journal of Jiangxi Normal University (Natural Science Edition) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
Issue:
2012, No. 02
Pages:
131-134
Section:
Publication date:
2012-03-01

Article Info

Title:
The Research on Dimensionality Reduction for High-Dimensional Data Classification
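Author(s):
LIU Li-yue (刘立月); HUANG Zhao-hua (黄兆华); LIU Zun-xiong (刘遵雄)
Affiliation(s):
School of Software, East China Jiaotong University, Nanchang, Jiangxi 330013; School of Information Engineering, East China Jiaotong University, Nanchang, Jiangxi 330013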
Keywords:
high-dimensional data; dimension reduction; feature extraction; sparse regularization
CLC number:
TP181
Document code:
A
Abstract:
With high-dimensional classification as the goal, this paper discusses the need for dimensionality reduction from the perspectives of classification accuracy and model interpretability, and analyzes the characteristics of the two families of methods, feature selection and feature extraction. It reviews commonly used feature extraction methods, including principal component analysis (PCA), partial least squares (PLS), and non-negative matrix factorization (NMF). Because the reduced data produced by these methods lack sparsity and interpretability, a feature extraction model based on sparse regularization is proposed, offering a new approach to dimensionality reduction for high-dimensional features.
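
The abstract does not spell out the proposed model, so the following is only an illustrative sketch of what sparse-regularized feature extraction can look like, not the authors' method. It assumes a rank-one, soft-thresholded low-rank approximation in the spirit of sparse PCA via regularized low-rank matrix approximation (Shen and Huang, 2008). The function names `soft_threshold` and `sparse_loading`, the penalty weight `lam`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam (the L1 proximal step)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def sparse_loading(X, lam, n_iter=200, tol=1e-10):
    """Rank-one sparse loading vector via alternating soft-thresholding.

    lam is a hypothetical penalty weight; lam = 0 reproduces the ordinary
    first principal-component loading.
    """
    Xc = X - X.mean(axis=0)                      # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    u, v = U[:, 0], s[0] * Vt[0]                 # start from plain PCA
    for _ in range(n_iter):
        v_new = soft_threshold(Xc.T @ u, lam)    # sparse update of the loadings
        scores = Xc @ v_new
        norm = np.linalg.norm(scores)
        if norm == 0.0:                          # penalty removed every feature
            return np.zeros_like(v_new)
        u = scores / norm                        # unit-length score vector
        converged = np.linalg.norm(v_new - v) <= tol * np.linalg.norm(v)
        v = v_new
        if converged:
            break
    return v / np.linalg.norm(v)                 # unit-length loading vector


# Synthetic data (an assumption for illustration): 20 features, 3 informative.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
X = 0.1 * rng.normal(size=(200, 20))             # small noise on every feature
X[:, :3] += 3.0 * latent[:, None]                # only features 0-2 carry signal

dense = sparse_loading(X, lam=0.0)               # ordinary PCA loading
sparse = sparse_loading(X, lam=1.0)              # penalized loading
```

On data of this kind, the unpenalized loading puts a nonzero weight on every feature, while the penalized loading would be expected to keep only the informative ones, which is the sparsity and interpretability gain the abstract refers to. The value of `lam` here was picked by hand; in practice it would need to be tuned, for example by cross-validation.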

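Memo:
Supported by the National Natural Science Foundation of China (61065003; 61165004) and the Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ12308).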