Article ID: 1003-0077(2011)00-0000-00

Semantic Word-formation Based Chinese Word Similarity Computing*

Kang Sichen1, Liu Yang2
(1. Department of Chinese Language and Literature, Peking University, Beijing 100871; 2. Institute of Computational Linguistics, Peking University, Beijing 100871)

Abstract: Computing the semantic similarity of Chinese words plays a crucial role in many applications of Chinese information processing. Starting from the character-oriented view of Chinese, we use Chinese semantic word-formation knowledge, namely word class (part of speech), word-formation structure, and morpheme meaning, and compute the semantic similarity of Chinese words on the basis of "morpheme concepts". This representation of lexical-semantic knowledge is simple, intuitive, and easy to extend. The computational model is concise and easy to understand, and it uses as few features and parameters as possible. Experiments show that the method performs well on typical sampled word pairs, that its similarity values agree more closely with human intuition, and that the values are reasonably distributed over the global data.

Key words: Chinese word similarity computing; Chinese semantic word-formation; lexical knowledge representation; morpheme concept

CLC number: TP391    Document code: A

* Funding: General Project of the National Social Science Fund of China (16BYY137); National Key Basic Research Program of China (2014CB340504); Major Project of the National Social Science Fund of China (12&ZD119).
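The abstract names three kinds of word-formation knowledge (word class, word-formation structure, morpheme concepts) but does not state the formula that combines them. The sketch below is therefore only a toy illustration of how such features could feed a similarity score. It is not the authors' model. The `WordEntry` type, the concept labels, the position-aligned overlap measure, and the weights 0.6/0.2/0.2 are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    """Hypothetical lexicon entry; the field names are assumptions."""
    pos: str          # word class, e.g. "n" or "v"
    structure: str    # word-formation structure label
    concepts: tuple   # one morpheme-concept label per character


def morpheme_overlap(a: WordEntry, b: WordEntry) -> float:
    """Share of position-aligned morphemes that carry the same concept."""
    if not a.concepts or not b.concepts:
        return 0.0
    matches = sum(1 for x, y in zip(a.concepts, b.concepts) if x == y)
    return matches / max(len(a.concepts), len(b.concepts))


def similarity(a: WordEntry, b: WordEntry,
               w_concept: float = 0.6,
               w_pos: float = 0.2,
               w_struct: float = 0.2) -> float:
    """Weighted sum of the three features; the weights are illustrative."""
    score = w_concept * morpheme_overlap(a, b)
    score += w_pos * float(a.pos == b.pos)
    score += w_struct * float(a.structure == b.structure)
    return score


# Toy lexicon with made-up concept labels (not taken from the paper).
LEXICON = {
    "牛肉": WordEntry("n", "modifier-head", ("cattle", "meat")),
    "羊肉": WordEntry("n", "modifier-head", ("sheep", "meat")),
    "跑步": WordEntry("v", "verb-object", ("run", "step")),
}

if __name__ == "__main__":
    print(similarity(LEXICON["牛肉"], LEXICON["羊肉"]))
    print(similarity(LEXICON["牛肉"], LEXICON["跑步"]))
```

In this toy setting, two words that share a head morpheme concept, a word class, and a structure score higher than an unrelated pair, which is the kind of behaviour the abstract describes as matching human intuition. A real system would need a morpheme-concept inventory and an empirically chosen combination rule.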