[…]N Classification of Harmful Information in Chinese Text

Journal of Jimei University (Natural Science), Vol. 25, No. 5
CHEN Deyi, ZHANG Hongyi, LIU Cailing, ZHANG Guangbin
(1. College of Optoelectronics and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China;
2. Xiamen Meiya Pico Information Co., Ltd., Xiamen 361005, China)
Abstract:
The rapid development of Internet and big data technology has greatly facilitated people's access
to Chinese text information of all kinds, but it has also greatly increased the risk that harmful information
in Chinese text will spread. Traditional text processing methods based on vector representation are mainly
designed for English text. To address these problems, a novel Chinese text classification framework was proposed.
In this framework, a word vector model based on Word2Vec was constructed first. Keywords with strong
category-discriminating ability were then selected using segmentation term frequency-document frequency (STF-DF).
Meanwhile, a suitable convolutional neural network (CNN) was built for Chinese text classification.
The experimental results show that the accuracy of this framework on the THUCNews and Fudan University
Chinese text data sets is 94.51% and 95.04%, respectively, an
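
The keyword selection step can be illustrated with a minimal sketch. This excerpt does not give the STF-DF formula, so the scoring rule below (in-category document frequency weighted by the share of a term's documents that fall in that category) is only an assumption chosen for illustration, not the authors' method. The function name `select_keywords`, the parameter `top_k`, and the toy documents are likewise hypothetical, and the text is assumed to be already segmented into words.

```python
from collections import Counter, defaultdict


def select_keywords(docs, labels, top_k=2):
    """Pick the top_k most category-specific terms for each category.

    docs   : list of token lists (text already segmented into words)
    labels : list of category names, one per document
    Score (an illustrative assumption, NOT the paper's STF-DF formula):
        df_in_category * (df_in_category / df_total)
    """
    df_total = Counter()              # documents containing the term, all categories
    df_by_cat = defaultdict(Counter)  # documents containing the term, per category
    for tokens, label in zip(docs, labels):
        for term in set(tokens):      # count each term at most once per document
            df_total[term] += 1
            df_by_cat[label][term] += 1

    keywords = {}
    for label, df_cat in df_by_cat.items():
        scored = {t: df_cat[t] * df_cat[t] / df_total[t] for t in df_cat}
        # Highest score first; ties are broken by the term itself for determinism.
        ranked = sorted(scored, key=lambda t: (-scored[t], t))
        keywords[label] = ranked[:top_k]
    return keywords


# Toy, hand-made corpus (hypothetical data, not from THUCNews or the Fudan set).
docs = [
    ["赌博", "网站", "注册"],
    ["赌博", "投注", "网站"],
    ["天气", "预报", "网站"],
    ["天气", "晴朗", "出行"],
]
labels = ["harmful", "harmful", "normal", "normal"]
keywords = select_keywords(docs, labels)
print(keywords)
```

In this toy corpus, "网站" appears in both categories and so scores lower than "赌博", which appears only in the harmful documents. In a pipeline like the one the abstract describes, the selected keywords would presumably be mapped to Word2Vec vectors before being passed to the CNN.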
