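Shandong University of Science and Technology
Undergraduate Graduation Project (Thesis)
Title: Big Data and Data Mining Methods
College: College of Mathematics and Systems Science
Major and Class: Statistics 10
Student Name: Zhou Guangjun (周广军)
Student ID: 201001051633
Supervisor: Gao Jinggui (高井贵)
June 2014
Big Data and Data Mining Methods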
Abstract
With advances in computer technology and the rapid growth of the Internet and new media, daily life has entered an era of high-speed information. Everyday activities generate large volumes of data, so the speed and scale of data acquisition keep growing, and the data continuously written to storage media accumulate into massive data sets. The storage, application, and mining of massive data have become an important research topic.
Data mining is the process of extracting implicit, previously unknown, but potentially useful information and knowledge from large amounts of incomplete, noisy, fuzzy, and random data held in databases, data warehouses, or other information repositories. The results take the form of rules, concepts, regularities, and patterns. Data mining is a broad interdisciplinary field that brings together database technology, artificial intelligence, statistics, and other areas from a new perspective in order to uncover, at a deeper level, patterns within the data that are novel, valid, potentially useful, and ultimately understandable. In data mining, data are divided into training data, test data, and application data. The key is to discover facts in the training data, use the test data as the basis for checking and revising the theory, and then apply the resulting knowledge to the data.
This thesis first explains the concept of big data and its rise and development, and then introduces the mainstream methods of data analysis and mining.
Keywords: big data; data mining; data analysis methods
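The training, test, and application roles of data described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the toy data set, the one-feature threshold rule, and the names `split_data`, `fit_threshold`, `predict`, and `accuracy` are all hypothetical, chosen only to show the three stages with the Python standard library.

```python
import random


def split_data(records, test_fraction=0.25, seed=0):
    """Shuffle labelled records and split them into (training, test) lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Learn a simple rule from training data: the midpoint of the class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    mean0 = sum(xs0) / len(xs0)
    mean1 = sum(xs1) / len(xs1)
    return (mean0 + mean1) / 2


def predict(threshold, x):
    """Apply the learned rule: label 1 when x reaches the threshold, else 0."""
    return 1 if x >= threshold else 0


def accuracy(threshold, data):
    """Fraction of labelled records that the rule classifies correctly."""
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)


# Hypothetical toy data: (feature value, label) pairs with two separated classes.
labelled = [(x, 0) for x in range(0, 10)] + [(x, 1) for x in range(20, 30)]

# Stage 1: discover a pattern in the training data.
train, test = split_data(labelled)
threshold = fit_threshold(train)

# Stage 2: check the pattern against held-out test data.
test_accuracy = accuracy(threshold, test)

# Stage 3: apply the learned knowledge to new, unlabelled application data.
application_data = [3, 25]
predictions = [predict(threshold, x) for x in application_data]

print(f"threshold={threshold:.2f} test_accuracy={test_accuracy:.2f}")
print(f"predictions={predictions}")
```

Holding the test records out of the fitting step is what lets them serve as an independent check: if the rule performed poorly on them, it would be revised before being applied to new data.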