
N的中文文本有害信息分类.pdf


Journal of Jimei University (Natural Science), Vol. 25, No. 5
CHEN Deyi, ZHANG Hongyi, LIU Cailing, ZHANG Guangbin
(1. College of Optoelectronics and Communication Engineering, Xiamen University of Technology, Xiamen 361024, China;
2. Xiamen Meiya Pico Information Co., Ltd., Xiamen 361005, China)
Abstract:
The rapid development of internet and big data technology has made a wide range of Chinese text
information far easier to access, but it has also greatly increased the risk of harmful information
spreading through Chinese text. Traditional text processing methods based on vector representation
are mainly designed for English text. To address these problems, a novel Chinese text classification
framework was proposed. In this framework, a word vector model based on Word2Vec was constructed
first. Keywords with category-distinguishing ability were then selected using word document frequency
(segmentation term frequency-document frequency, STF-DF). Meanwhile, a suitable convolutional neural
network (CNN) was built for Chinese text classification. The experimental results show that the accuracy
of this framework on the THUCNews and Fudan University Chinese text data sets is 94.51% and 95.04%,
respectively, an
