Implementing Random Forest Classification and Regression in R
The 5th China R Conference, Beijing, 2012. Li Xinhai
Applications of Random Forest using R
Classification and Regression
Li Xinhai
Institute of Zoology, Chinese Academy of Sciences
Introduction to Random Forest
Random Forest
Random Forest is an ensemble classifier that consists of many decision trees.
It outputs the class that is the mode of the classes output by the individual trees (Breiman 2001).
It deals with "small n, large p" problems, high-order interactions, and correlated predictor variables.
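The "mode of the classes" rule can be illustrated with a minimal base R sketch. The helper name `majority_vote` and the vote matrix are made up for illustration and are not from the talk; ties are broken arbitrarily by taking the first label in `table()` order.

```r
# Hypothetical helper: return the most frequent class label in a vector of votes.
# Ties go to the first label in table() order (an arbitrary choice for this sketch).
majority_vote <- function(votes) {
  counts <- table(votes)
  names(counts)[which.max(counts)]
}

# Made-up votes: rows are observations, columns are individual trees.
tree_votes <- matrix(
  c("A", "A", "B",
    "B", "B", "B",
    "A", "B", "B"),
  nrow = 3, byrow = TRUE
)

# The forest's prediction for each observation is the mode of its row.
forest_pred <- apply(tree_votes, 1, majority_vote)
print(forest_pred)
```

In a real forest the columns would hold the predictions of trees grown on resampled data with randomly selected features; here they are typed in by hand only to show the voting step.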
Breiman, L. 2001. Random forests. Machine Learning 45:5-32. Cited 6,500 times as of 2012.
History
The algorithm for inducing a random forest was developed by Leo Breiman (2001) and Adele Cutler, and "Random Forests" is their trademark.
The term came from random decision forests, first proposed by Tin Kam Ho of Bell Labs in 1995.
The method combines Breiman's "bagging" idea and the random selection of features, introduced independently by Ho (1995) and Amit and Geman (1997), in order to construct a collection of decision trees with controlled ...
Tree models
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i$
Classification tree
Regression tree
Figures: Crawley 2007, The R Book, p. 691; Crawley 2007, The R Book, p. 694
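To make the contrast between a linear model with three predictors and the two kinds of tree concrete, here is a small sketch in R. It assumes the `rpart` package (a recommended package in standard R installations) as the tree fitter; the talk does not say which function was used, and the simulated data, variable names, and threshold effect are all made up for illustration.

```r
library(rpart)  # assumption: rpart is used here only as a convenient tree fitter

set.seed(1)
n <- 200
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))

# Made-up response: a step in x1 plus a linear effect of x2; x3 is pure noise.
d$y <- ifelse(d$x1 > 0.5, 3, 0) + 2 * d$x2 + rnorm(n, sd = 0.3)

# Linear model: one intercept and one slope per predictor.
fit_lm <- lm(y ~ x1 + x2 + x3, data = d)

# Regression tree: numeric response, fitted by recursive binary splits.
fit_rtree <- rpart(y ~ x1 + x2 + x3, data = d, method = "anova")

# Classification tree: the same predictors with a two-level factor response.
d$cls <- factor(ifelse(d$y > median(d$y), "high", "low"))
fit_ctree <- rpart(cls ~ x1 + x2 + x3, data = d, method = "class")

print(coef(fit_lm))
print(fit_rtree)
pred_cls <- predict(fit_ctree, d, type = "class")
```

The linear model summarises the data with four coefficients, while each tree summarises it with a sequence of split rules; printing `fit_rtree` shows those rules.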
The community uses irrelevant theory, questionable conclusions?
David R. Cox, Emanuel Parzen, Bruce Hoadley, Brad Efron
NO YES
Ensemble classifiers
Tree models are simple, but often produce noisy (bushy) or weak (stunted) classifiers.
Bagging (Breiman, 1996): Fit many large trees to bootstrap-resampled versions of the training data, and classify by majority vote.
Boosting (Freund & Schapire, 1996): Fit many large or small trees to reweighted versions of the training data. Classif...
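The bagging recipe (bootstrap resamples, large trees, majority vote) can be sketched in a few lines of R. This is only an illustration under stated assumptions: it uses `rpart` for the individual trees and the built-in `iris` data, and the number of trees, the train/test split, and the control settings are arbitrary choices, not values from the talk. It does not cover boosting.

```r
library(rpart)  # assumption: rpart supplies the individual trees

set.seed(42)
n_trees <- 25                          # assumed number of bootstrap trees
train_idx <- sample(nrow(iris), 100)   # arbitrary 100/50 train/test split
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit one large (unpruned) tree per bootstrap resample of the training data.
trees <- lapply(seq_len(n_trees), function(b) {
  boot <- train[sample(nrow(train), replace = TRUE), ]
  rpart(Species ~ ., data = boot, method = "class",
        control = rpart.control(cp = 0, minsplit = 2))
})

# Each column holds one tree's predicted class for every test row.
votes <- sapply(trees, function(tr) {
  as.character(predict(tr, test, type = "class"))
})

# Classify each test row by majority vote across the trees.
bagged_pred <- apply(votes, 1, function(v) names(which.max(table(v))))

accuracy <- mean(bagged_pred == test$Species)
print(accuracy)
```

Random forest adds one step to this sketch: at each split, only a random subset of the predictors is considered, which makes the trees less correlated with each other.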