a tutorial on computer science

Standard random forest and ensemble of extremely randomized trees

Standard random forest: pick K random features, enumerate every candidate split value for each of them, and keep the best split.

LINKS:https://en.wikipedia.org/wiki/Random_forest

Extremely randomized trees: pick K random features, draw one random split value for each of them, and keep the best of those splits.
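
To make the difference between the two split rules concrete, here is a minimal sketch in plain Python. It assumes a regression setting with squared error as the split cost; the function names (`random_forest_split`, `extra_trees_split`, `split_cost`) are made up for this note and do not come from any library.

```python
import random

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def split_cost(X, y, feature, threshold):
    """Total SSE of the two children produced by X[i][feature] <= threshold."""
    left = [t for row, t in zip(X, y) if row[feature] <= threshold]
    right = [t for row, t in zip(X, y) if row[feature] > threshold]
    if not left or not right:
        return float("inf")  # reject splits that leave one side empty
    return sse(left) + sse(right)

def random_forest_split(X, y, k, rng):
    """Random-forest style: K random features, every observed value is tried."""
    features = rng.sample(range(len(X[0])), k)
    candidates = [(f, row[f]) for f in features for row in X]
    return min(candidates, key=lambda c: split_cost(X, y, c[0], c[1]))

def extra_trees_split(X, y, k, rng):
    """Extra-trees style: K random features, one random threshold per feature."""
    features = rng.sample(range(len(X[0])), k)
    candidates = []
    for f in features:
        lo = min(row[f] for row in X)
        hi = max(row[f] for row in X)
        candidates.append((f, rng.uniform(lo, hi)))
    return min(candidates, key=lambda c: split_cost(X, y, c[0], c[1]))

X = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
y = [0.0, 0.0, 1.0, 1.0]
print(random_forest_split(X, y, 2, random.Random(0)))  # (feature, threshold)
print(extra_trees_split(X, y, 2, random.Random(0)))    # (feature, threshold)
```

Both functions return a `(feature, threshold)` pair. The only difference is how the candidate thresholds are produced: exhaustively for the random forest, one random draw per feature for extra-trees.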

Ensemble of extremely randomized trees: every tree is trained on all of the data.

LINKS:http://docs.opencv.org/2.4/modules/ml/doc/ertrees.html

Extremely randomized trees don’t apply the bagging procedure to construct a set of the training samples for each tree. The same input training set is used to train all trees.
Extremely randomized trees pick a node split very extremely (both a variable index and variable splitting value are chosen randomly), whereas Random Forest finds the best split (optimal one by variable index and variable splitting value) among random subset of variables.

In short: extremely randomized trees use all of the samples as the training set, and each node is split on a randomly chosen feature and a randomly chosen value.
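
The sampling difference described above can be sketched as follows. This is only an illustration of which row indices each tree would see; `bootstrap_sample` and `full_sample` are hypothetical helper names, not OpenCV functions.

```python
import random

def bootstrap_sample(n_samples, rng):
    """Bagging (random forest): draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def full_sample(n_samples):
    """Extremely randomized trees as described above: every tree sees all rows."""
    return list(range(n_samples))

rng = random.Random(0)
n_samples, n_trees = 10, 3

rf_sets = [bootstrap_sample(n_samples, rng) for _ in range(n_trees)]
et_sets = [full_sample(n_samples) for _ in range(n_trees)]

for i in range(n_trees):
    print("random forest tree", i, "rows:", sorted(rf_sets[i]))
    print("extra-trees tree  ", i, "rows:", et_sets[i])
```

With bagging, each tree typically sees some rows more than once and misses others. Without it, the only source of diversity between trees is the randomness of the splits.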

LINKS:http://scikit-learn.org/stable/modules/generated/sklearn.tree.ExtraTreeRegressor.html#sklearn.tree.ExtraTreeRegressor

This class implements a meta estimator that fits a number of randomized decision trees (a.k.a. extra-trees) on various sub-samples of the dataset and use averaging to improve the predictive accuracy and control over-fitting.

Extra-trees differ from classic decision trees in the way they are built. When looking for the best split to separate the samples of a node into two groups, random splits are drawn for each of the max_features randomly selected features and the best split among those is chosen. When max_features is set 1, this amounts to building a totally random decision tree.
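
A small sketch of how the `max_features` setting from the quoted documentation might be tried out, assuming scikit-learn is installed. The dataset is synthetic and the parameter values (200 samples, 8 features, 50 trees) are arbitrary choices for this note.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import ExtraTreesRegressor

X, y = make_regression(n_samples=200, n_features=8, noise=0.1, random_state=0)

# max_features=1 (an int, not the float 1.0): one random feature with one
# random threshold per node, i.e. the "totally random" tree from the docs.
totally_random = ExtraTreesRegressor(
    n_estimators=50, max_features=1, random_state=0
).fit(X, y)

# max_features=4: a random threshold is drawn for each of 4 randomly
# selected features and the best of those candidate splits is kept.
partly_random = ExtraTreesRegressor(
    n_estimators=50, max_features=4, random_state=0
).fit(X, y)

print(totally_random.bootstrap)           # False: no bagging unless requested
print(totally_random.predict(X[:3]))
print(partly_random.predict(X[:3]))
```

Passing `bootstrap=True` would switch on bagging for the ensemble; it is off by default.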

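By this description, the extra-trees ensemble uses bagging (sub-samples of the dataset), then selects several features and, for each feature, picks a random value as the split criterion to build the tree. Note that in scikit-learn the ensemble's bootstrap parameter defaults to False, so sub-sampling only happens when it is switched on.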

One possible implementation:
Bag the samples, draw n random features and k random split values, take the best split, and build the tree.
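
The recipe above could be sketched like this in plain Python. Several details are assumptions of this note rather than part of the recipe: regression with squared error as the cost, k thresholds drawn per selected feature (the recipe does not say whether k is per feature or in total), trees grown until their leaves are pure, and all function names.

```python
import random

def sse(values):
    """Sum of squared errors around the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def choose_split(X, y, n_features, k_values, rng):
    """Try k_values random thresholds on each of n_features random features."""
    best = None  # (cost, feature, threshold)
    for f in rng.sample(range(len(X[0])), n_features):
        lo = min(row[f] for row in X)
        hi = max(row[f] for row in X)
        for _ in range(k_values):
            t = rng.uniform(lo, hi)
            left = [v for row, v in zip(X, y) if row[f] <= t]
            right = [v for row, v in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # skip splits that leave one side empty
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, t)
    return best

def build_tree(X, y, n_features, k_values, rng):
    """Grow a regression tree; a leaf is a float holding the mean target."""
    if len(y) < 2 or sse(y) == 0.0:
        return sum(y) / len(y)
    split = choose_split(X, y, n_features, k_values, rng)
    if split is None:  # no usable split found, e.g. all rows identical
        return sum(y) / len(y)
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       n_features, k_values, rng),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       n_features, k_values, rng))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees, n_features, k_values, seed=0):
    """Bag the samples, then grow one randomized tree per bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bagging
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, k_values, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

X = [[float(i), float(i % 3)] for i in range(20)]
y = [0.0 if i < 10 else 1.0 for i in range(20)]
forest = fit_forest(X, y, n_trees=25, n_features=2, k_values=5)
print(predict_forest(forest, X[2]), predict_forest(forest, X[17]))
```

Setting k_values to 1 gives the extremely-randomized split rule, while a large k_values moves the search closer to the exhaustive one used by a standard random forest.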

posted on 2016-02-28 21:01 by bigrabbit
