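Graduation Thesis (Design)
Title: Research and Implementation of Projection-Based Data Mining Algorithms
Student: Guo Kai    Student ID: 041842020
Department: Department of Mathematics
Major and class: Information and Computing Science, Class 043
Advisor: Zhou Tao
Completed at: Data Mining Laboratory, Department of Mathematics
June 9, 2008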
Research and Implementation of Projection-Based Data Mining Algorithms
Author: Guo Kai
(Grade 04, Class 03, Information and Computing Science, Department of Mathematics, Shaanxi University of Technology, Hanzhong 723000, Shaanxi)
Advisor: Zhou Tao
[Abstract]: The discovery of sequential patterns is an active research branch of data mining: the task is to find all frequent subsequences in a sequence database. This paper first introduces some basic concepts of sequential pattern mining, and then describes in detail FreeSpan and PrefixSpan, two important pattern-growth algorithms based on projection and divide-and-conquer. In the projection-based approach, the sequence database is first projected into many small projected databases, and each small projected database is then mined recursively. FreeSpan divides the database into several subspaces and projects recursively within each subspace; for each item, and for the sequential patterns it forms together with the preceding item, it performs projection mining and finally obtains the frequent subsequences. PrefixSpan first finds the sequential patterns of length 1, builds a projection with each such pattern as the prefix, and continues to project recursively within the projected database, finally obtaining the frequent subsequences. The paper also works through examples to describe the procedures of the two algorithms more clearly and in more detail.
[Keywords]: data mining; FreeSpan algorithm; PrefixSpan algorithm
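The prefix-projection recursion summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `prefixspan`, its parameters, and the simplifying assumption that every sequence element is a single item (rather than an itemset, as in the general formulation of PrefixSpan) are all introduced here for illustration only.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for all frequent sequential patterns.

    Assumption (illustrative only): each sequence is a list of single
    items, not a list of itemsets.
    """
    patterns = {}

    def mine(prefix, projected):
        # `projected` is the projected database for `prefix`, stored as
        # (sequence_index, start_offset) pointers instead of copied suffixes.
        occurrences = defaultdict(list)
        for seq_idx, start in projected:
            seq = sequences[seq_idx]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    # Count each item once per sequence; the first
                    # occurrence leaves the longest remaining suffix.
                    seen.add(item)
                    occurrences[item].append((seq_idx, pos + 1))
        for item, suffixes in sorted(occurrences.items()):
            if len(suffixes) >= min_support:
                new_prefix = prefix + (item,)
                patterns[new_prefix] = len(suffixes)
                # Recurse into the database projected on the longer prefix.
                mine(new_prefix, suffixes)

    # The empty prefix projects onto the whole database.
    mine((), [(i, 0) for i in range(len(sequences))])
    return patterns
```

For example, with the four toy sequences `abc`, `acb`, `ab`, `bc` (each passed as a list of characters) and `min_support=2`, the sketch reports the pattern `("a", "b")` with support 3, since `a` is followed by `b` in the first three sequences. Storing offsets rather than copying suffixes is one common way to keep the projected databases small; it is a design choice of this sketch, not something stated in the source.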