[1] CHEN Chen, WANG Houfeng. The Chinese Cross-Document Personal Coreference Resolution [J]. Journal of Jiangxi Normal University (Natural Science Edition), 2015, (02): 111-116.

The Chinese Cross-Document Personal Coreference Resolution

Journal of Jiangxi Normal University (Natural Science Edition) [ISSN: 1006-6977 / CN: 61-1281/TN]

Volume:
Issue:
2015, No. 02
Pages:
111-116
Section:
Publication date:
2015-04-10

Article Info

Title:
The Chinese Cross-Document Personal Coreference Resolution
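Author(s):
CHEN Chen; WANG Houfeng
Office of Informatization Construction and Management, Peking University; Key Laboratory of Computational Linguistics (Ministry of Education), Peking University, Beijing 100871, China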
Keywords:
personal coreference resolution; agglomerative clustering; feature selection; cluster-stopping measure
CLC number:
TP 391
Document code:
A
Abstract:
Cross-document named entity coreference occurs when identical names appearing in different texts refer to the same object; coreference resolution is the process of determining whether they do. It plays an important role in multi-document applications such as multi-document summarization and information fusion. This paper focuses on personal names, the most typical kind of named entity in Chinese, and applies agglomerative (hierarchical) clustering to cross-document personal name coreference resolution. It examines two key issues in detail: feature selection, and the stopping condition for clustering, which determines the estimated number of entities.
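To make the two issues named in the abstract concrete, the following is a minimal sketch of agglomerative clustering with an explicit stopping condition. It is not the authors' implementation. The bag-of-words context features, the average-link cosine similarity, the function name `cluster_mentions`, and the `stop_threshold` parameter are all assumptions chosen for illustration; the paper's actual features and stopping measures are not given on this page.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_link(c1, c2, vectors):
    """Average pairwise similarity between the members of two clusters."""
    total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_mentions(docs, stop_threshold=0.3):
    """Agglomerative clustering of documents that mention the same name.

    docs: list of token lists (hypothetical, already segmented context words).
    stop_threshold: assumed stopping condition; merging halts once the most
    similar pair of clusters falls below this value.
    Returns a list of clusters, each a sorted list of document indices.
    """
    vectors = [Counter(tokens) for tokens in docs]   # feature selection step
    clusters = [[i] for i in range(len(docs))]       # one cluster per document
    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = average_link(clusters[x], clusters[y], vectors)
                if sim > best_sim:
                    best_sim, best_pair = sim, (x, y)
        if best_sim < stop_threshold:                # stopping condition
            break
        x, y = best_pair
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)
```

Each remaining cluster is read as one real-world person, so the number of clusters at the stopping point is the estimated number of entities. A fixed similarity threshold is only the simplest possible stopping rule; it stands in here for whatever stopping measure is actually used.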


Memo:
National Natural Science Foundation of China (61370117, 61333018); Major Project of the National Social Science Fund of China (12&ZD227)