﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-bookheart</title><link>http://www.cppblog.com/bookheart/</link><description /><language>zh-cn</language><lastBuildDate>Tue, 14 Apr 2026 23:08:09 GMT</lastBuildDate><pubDate>Tue, 14 Apr 2026 23:08:09 GMT</pubDate><ttl>60</ttl><item><title>使用autotool在linux编译c++工程</title><link>http://www.cppblog.com/bookheart/archive/2013/04/01/199014.html</link><dc:creator>菜鸟想学飞</dc:creator><author>菜鸟想学飞</author><pubDate>Mon, 01 Apr 2013 07:53:00 GMT</pubDate><guid>http://www.cppblog.com/bookheart/archive/2013/04/01/199014.html</guid><wfw:comment>http://www.cppblog.com/bookheart/comments/199014.html</wfw:comment><comments>http://www.cppblog.com/bookheart/archive/2013/04/01/199014.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/bookheart/comments/commentRss/199014.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/bookheart/services/trackbacks/199014.html</trackback:ping><description><![CDATA[1.建立一个目录.<br />&nbsp;mkidr test<br />2.建立一个测试文件<br />vi main.cpp<br /><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->&nbsp;&nbsp;1&nbsp;#include&nbsp;&lt;iostream&gt;<br />&nbsp;&nbsp;2&nbsp;<br />&nbsp;&nbsp;3&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;main(<span style="color: #0000FF; ">int</span>&nbsp;argc,&nbsp;<span style="color: #0000FF; ">char</span>*&nbsp;agrv[])<br />&nbsp;&nbsp;4&nbsp;{<br />&nbsp;&nbsp;5&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::cout&nbsp;&lt;&lt;&nbsp;"hello!"&nbsp;&lt;&lt;&nbsp;std::endl;<br />&nbsp;&nbsp;6&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;0;<br />&nbsp;&nbsp;7&nbsp;}</div>3.创建configure.ac文件<br />autoscan<br />提示configure.ac不存在，但是会生成一个configure.scan<br />mv configure.scan configure.ac<br />vi&nbsp;configure.ac修改为如下<br /><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->&nbsp;&nbsp;1&nbsp;#&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;-*-&nbsp;Autoconf&nbsp;-*-<br />&nbsp;&nbsp;2&nbsp;#&nbsp;Process&nbsp;<span style="color: #0000FF; ">this</span>&nbsp;file&nbsp;with&nbsp;autoconf&nbsp;to&nbsp;produce&nbsp;a&nbsp;configure&nbsp;script.<br />&nbsp;&nbsp;3&nbsp;<br />&nbsp;&nbsp;4&nbsp;AC_PREREQ(2.59)<br />&nbsp;&nbsp;5&nbsp;AC_INIT([loki],&nbsp;[1.0],&nbsp;[])<br />&nbsp;&nbsp;6&nbsp;AM_INIT_AUTOMAKE<br />&nbsp;&nbsp;7&nbsp;AC_CONFIG_SRCDIR([main.cpp])<br />&nbsp;&nbsp;8&nbsp;AC_CONFIG_HEADER([config.h])<br />&nbsp;&nbsp;9&nbsp;AC_CONFIG_FILES([Makefile])<br />&nbsp;10&nbsp;<br />&nbsp;11&nbsp;#&nbsp;Checks&nbsp;<span style="color: #0000FF; ">for</span>&nbsp;programs.<br />&nbsp;12&nbsp;AC_PROG_CXX<br />&nbsp;13&nbsp;<br />&nbsp;14&nbsp;#&nbsp;Checks&nbsp;<span style="color: #0000FF; ">for</span>&nbsp;libraries.<br />&nbsp;15&nbsp;<br />&nbsp;16&nbsp;#&nbsp;Checks&nbsp;<span style="color: #0000FF; ">for</span>&nbsp;header&nbsp;files.<br />&nbsp;17&nbsp;<br />&nbsp;18&nbsp;#&nbsp;Checks&nbsp;<span style="color: #0000FF; ">for</span>&nbsp;typedefs,&nbsp;structures,&nbsp;and&nbsp;compiler&nbsp;characteristics.<br />&nbsp;19&nbsp;<br />&nbsp;20&nbsp;#&nbsp;Checks&nbsp;<span style="color: #0000FF; ">for</span>&nbsp;library&nbsp;functions.<br />&nbsp;21&nbsp;AC_OUTPUT</div><div><br />4.创建相关文件<br />touch AUTHORS ChangeLog NEWS README<br /><br />5.编写Makefile.am<br />vi Makefile.am<br /><div style="background-color: #eeeeee; font-size: 13px; border: 1px solid #cccccc; padding: 4px 5px 4px 4px; width: 98%; word-break: break-all;"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->1&nbsp;noinst_PROGRAMS&nbsp;=&nbsp;loki<br />2&nbsp;loki_SOURCES&nbsp;=&nbsp;main.cpp</div><br />6.<br />autoreconf -i<br /><br />7.<br />./configure<br /><br />8. 配置完工，使用make即可看到编译出的执行文件loki<br />make<br />./loki<br /><br />要了解具体为什么要这样做，请看这篇教程http://www.freesoftwaremagazine.com/books/autotools_a_guide_to_autoconf_automake_libtool<div><div></div></div></div><img src ="http://www.cppblog.com/bookheart/aggbug/199014.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/bookheart/" target="_blank">菜鸟想学飞</a> 2013-04-01 15:53 <a href="http://www.cppblog.com/bookheart/archive/2013/04/01/199014.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>行为克隆创建代理</title><link>http://www.cppblog.com/bookheart/archive/2012/05/20/161869.html</link><dc:creator>菜鸟想学飞</dc:creator><author>菜鸟想学飞</author><pubDate>Sun, 20 May 2012 05:29:00 GMT</pubDate><guid>http://www.cppblog.com/bookheart/archive/2012/05/20/161869.html</guid><wfw:comment>http://www.cppblog.com/bookheart/comments/161869.html</wfw:comment><comments>http://www.cppblog.com/bookheart/archive/2012/05/20/161869.html#Feedback</comments><slash:comments>4</slash:comments><wfw:commentRss>http://www.cppblog.com/bookheart/comments/commentRss/161869.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/bookheart/services/trackbacks/161869.html</trackback:ping><description><![CDATA[<div><span style="font-size: 14px; line-height: 23px; "><p><font class="Apple-style-span" face="宋体">这个<span class="Apple-style-span" style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">《精粹</span>7<span style="font-family: 宋体; ">》里面一个AI设计的例子，方式是用行为克隆创建代理。《</span></span></font><font class="Apple-style-span" face="宋体">精粹</font><font class="Apple-style-span" face="'lucida Grande', Verdana">7</font><span style="font-family: 宋体; ">》上的描述实在是太粗略，很多东西都没提及。花了一星期的闲碎时间去看这个例子，感觉挺有收获的，整理了一下学习心得，希望能给跟我一样的菜鸟们带来一点帮助。这个例子代码不多，但是水挺深的。涉及了</span><font class="Apple-style-span" face="'lucida Grande', Verdana">lua,&nbsp;</font><span style="font-family: 宋体; ">仿函数，决策树，信息论</span><font class="Apple-style-span" face="'lucida Grande', Verdana">,&nbsp;</font><span style="font-family: 宋体; ">大堆循环递归等情况。</span><span class="Apple-style-span" style="font-family: 'lucida Grande', Verdana; ">&nbsp;</span></p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">这个</span>DEMO<span style="font-family: 宋体; ">实现了一个飞机对战射击小游戏，电脑控制那个飞机</span><span style="font-family: 宋体; ">会学习你的行为进行模仿。</span></p><h1 style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">项目运行</span></h1><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">首先，先把项目跑起来再说。我的</span>IDE<span style="font-family: 宋体; ">是</span>VS2008</p><p style="font-family: 'lucida Grande', Verdana; ">1.&nbsp;<span style="font-family: 宋体; ">配置好</span>openGL<span style="font-family: 宋体; ">和</span>&nbsp;lua<span style="font-family: 宋体; ">环境。这里不细说了，请参考网上资料</span></p><p style="font-family: 'lucida Grande', Verdana; ">2. 配置AI。<span style="font-family: 宋体; ">打开项目工程，设置</span>AIShoote<span style="font-family: 宋体; ">为启动项，在</span><a name="OLE_LINK1" style="outline-style: none; outline-width: initial; outline-color: initial; text-decoration: none; cursor: pointer; color: #4e5d80; ">AIShoote</a>r<span style="font-family: 宋体; ">工程</span>---<span style="font-family: 宋体; ">属性</span>---<span style="font-family: 宋体; ">配置属性</span>---<span style="font-family: 宋体; ">调试</span>---<span style="font-family: 宋体; ">命令参数中填上</span>&nbsp;agent.lua<span style="font-family: 宋体; ">。</span>(<span style="font-family: 宋体; ">这个是默认的决策树的脚本</span>),当然了这个文件要放置在你的项目工程里。</p><p style="font-family: 'lucida Grande', Verdana; ">3. <span style="font-family: 宋体; ">编译，运行。控制你的飞机跟对手打一架吧，经过数分钟的搏斗，你终于战胜了机器人。关掉窗口，后台会生成一个数据统计表叫</span>training_data_xxx.dat<span style="font-family: 宋体; ">文件。这个东东是用来后面生成决策树脚本用的。</span></p><p style="font-family: 'lucida Grande', Verdana; ">4.&nbsp;<span style="font-family: 宋体; ">现在开始进行生成决策树，这课所谓的树就存放一个lua文件中。设置</span>Learn<span style="font-family: 宋体; ">为启动项</span>,&nbsp;<span style="font-family: 宋体; ">在</span>Learn<span style="font-family: 宋体; ">工程</span>---<span style="font-family: 宋体; ">属性</span>---<span style="font-family: 宋体; ">配置属性</span>---<span style="font-family: 宋体; ">调试</span>---<span style="font-family: 宋体; ">命令参数中填上:<br /></span>-s 1&nbsp;<a name="OLE_LINK2" style="outline-style: none; outline-width: initial; outline-color: initial; text-decoration: none; cursor: pointer; color: #4e5d80; ">training_data_xxx.dat</a>&nbsp;2.lua, 说明： -s&nbsp;<span style="font-family: 宋体; ">是跳过次数，</span>&nbsp;dat<span style="font-family: 宋体; ">文件是上一步骤产出的数据，</span>2.lua<span style="font-family: 宋体; ">是lua文件名。设置好了运行，就等吧，等待的时间要几分钟。然后提示你lua文件生成完毕。</span></p><p><font class="Apple-style-span" face="'lucida Grande', Verdana">5.&nbsp;</font><span style="font-family: 宋体; ">回到</span><font class="Apple-style-span" face="'lucida Grande', Verdana">步骤2</font><span style="font-family: 宋体; ">，把参数改成</span><font class="Apple-style-span" face="'lucida Grande', Verdana">2.lua</font><span style="font-family: 宋体; ">，再编译，运行，这时候,仔细观察，是不是发现电脑控制的飞机行为跟你操控很像呀。嘻嘻，有点意思吧。</span></p><h1 style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">功能分析</span></h1><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">各个模块的功能</span></p><p style="font-family: 'lucida Grande', Verdana; ">AIShooter<span style="font-family: 宋体; ">：用</span>openGL<span style="font-family: 宋体; ">渲染战斗场景，玩家键盘输入检测</span>,&nbsp;<span style="font-family: 宋体; ">使用决策树脚本控制机器人行为</span>,&nbsp;<span style="font-family: 宋体; ">记录玩家行为数据。</span></p><p style="font-family: 'lucida Grande', Verdana; ">DecisionTree<span style="font-family: 宋体; ">：</span>&nbsp;<span style="font-family: 宋体; ">核心算法，根据玩家行为数据生成决策树。</span></p><p style="font-family: 'lucida Grande', Verdana; ">Learn:&nbsp;<span style="font-family: 宋体; ">使用</span>DecisionTree<span style="font-family: 宋体; ">提供的接口函数生成决策树</span>,<span style="font-family: 宋体; ">并把决策树生成脚本代码。</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><h1 style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">决策树的使用和实现</span></h1><h2 style="font-family: 'lucida Grande', Verdana; ">lua&nbsp;<span style="font-family: 黑体; ">与</span>&nbsp;c++<span style="font-family: 黑体; ">交互</span>:</h2><p align="left" style="text-align: left; font-family: 'lucida Grande', Verdana; ">C++<span style="font-family: 宋体; ">创建</span>lua<span style="font-family: 宋体; ">的表对象，并传递类指针和函数指针给它，这样</span>lua<span style="font-family: 宋体; ">就可以调用到</span>c++<span style="font-family: 宋体; ">的函数</span></p><p style="font-family: 'lucida Grande', Verdana; ">Agent<span style="font-family: 宋体; ">类在</span>init()<span style="font-family: 宋体; ">中</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;lua_newtable (mLua);//</span><span style="font-family: 宋体; ">创建表对象</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;lua_pushlightuserdata (mLua, this);//&nbsp;</span><span style="font-family: 宋体; ">将一个代表</span>C<span style="font-family: 宋体; ">指针的值放到栈</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;lua_pushcclosure (mLua, Agent::setup_addInterval, 1);//&nbsp;</span><span style="font-family: 宋体; ">把一个新的</span>&nbsp;C closure&nbsp;<span style="font-family: 宋体; ">压入堆栈。</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;lua_setfield (mLua, -2, "addInterval");</span><span style="font-family: 宋体; ">以</span>addInterval<span style="font-family: 宋体; ">为键值注册到表中</span></p><p style="text-indent: 21.75pt; font-family: 'lucida Grande', Verdana; ">lua_setglobal (mLua, "AIShooter");//<span style="font-family: 宋体; ">设置表名</span></p><p style="text-indent: 21.75pt; font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">1.<span style="font-family: 宋体; ">创建</span>lua<span style="font-family: 宋体; ">表</span>AIShootert&nbsp;<span style="font-family: 宋体; ">设置项</span>addInterval&lt;-&gt;setup_addInterval.</p><p style="font-family: 'lucida Grande', Verdana; ">2.<span style="font-family: 宋体; ">创建</span>lua<span style="font-family: 宋体; ">表对象</span>Agent,&nbsp;<span style="font-family: 宋体; ">设置项</span>accel&lt;-&gt;agent_accel turn&lt;-&gt;agent_turn fire&lt;-&gt;agent_fire</p><p style="font-family: 'lucida Grande', Verdana; ">3.<span style="font-family: 宋体; ">创建</span>lua<span style="font-family: 宋体; ">表对象</span>GameState&nbsp;<span style="font-family: 宋体; ">设置项在</span>act()<span style="font-family: 宋体; ">中动态设置</span></p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">处理过程</span>:</p><p style="margin-left: 18pt; text-indent: -18pt; font-family: 'lucida Grande', Verdana; "><span>1.<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span></span><span style="font-family: 宋体; ">一开始调用</span>lua<span style="font-family: 宋体; ">函数</span>setupFeatureMap&nbsp;<span style="font-family: 宋体; ">读取数据，然后通过</span>&nbsp;AIShooter.addInterval<span style="font-family: 宋体; ">回调</span></p><p style="margin-left: 18pt; text-indent: -18pt; font-family: 'lucida Grande', Verdana; "><span>1.</span></p><p style="font-family: 'lucida Grande', Verdana; ">C++<span style="font-family: 宋体; ">函数</span>setup_addInterval<span style="font-family: 宋体; ">进行数据传入</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">2.&nbsp;<span style="font-family: 宋体; ">游戏中的</span>act()<span style="font-family: 宋体; ">，通过对表</span>GameState<span style="font-family: 宋体; ">实时传入数据，然后调用</span>lua<span style="font-family: 宋体; ">的</span>accelTree<span style="font-family: 宋体; ">根据</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">GameState<span style="font-family: 宋体; ">的数据进行逻辑处理</span>,<span style="font-family: 宋体; ">处理结果通过调用</span>Agent.accel<span style="font-family: 宋体; ">以参数形式传到</span>C++<span style="font-family: 宋体; ">函数</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">agent_accel<span style="font-family: 宋体; ">中，</span>&nbsp;<span style="font-family: 宋体; ">另外两个也类似</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">决策树的使用：</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;</span><span style="font-family: 宋体; ">初始化的时候，通过这样划定了每个行为的特征范围以及对应特征值</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;AIShooter.addInterval ("opp_distance", 0, 0.0, 0.05);</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;</span><span style="font-family: 宋体; ">这些数据存放在</span>mFeatureMap<span style="font-family: 宋体; ">中</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;</span><span style="font-family: 宋体; ">在</span>updata<span style="font-family: 宋体; ">中实时传回当前状态数据，</span>mFeatureMap<span style="font-family: 宋体; ">根据范围来取他们对应的特征值</span>,<span style="font-family: 宋体; ">这些特征设置到</span>GameState<span style="font-family: 宋体; ">中</span>,<span style="font-family: 宋体; ">然后再根据脚本中的决策树进行逻辑判断。结果再返回</span>c++<span style="font-family: 宋体; ">相关的接口中</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">// Game state<span>&nbsp; &nbsp;</span></p><p style="font-family: 'lucida Grande', Verdana; ">std::queue&lt;DataSet::raw_row_t&gt; mStateQueue;</p><p style="font-family: 'lucida Grande', Verdana; ">typedef std::map&lt;std::string, float&gt; raw_row_t</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">// Training state</p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp; &nbsp;</span></p><p style="font-family: 'lucida Grande', Verdana; ">FeatureMap mFeatureMap; //<span style="font-family: 宋体; ">数据管理类</span></p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">类中类设计</span></p><p style="font-family: 'lucida Grande', Verdana; ">Feature&nbsp;<span style="font-family: 宋体; ">基类</span></p><p style="font-family: 'lucida Grande', Verdana; ">NominalFeature</p><p style="font-family: 'lucida Grande', Verdana; ">ContinuousFeature&nbsp;<span style="font-family: 宋体; ">派生类</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">std::map &lt;std::string, Feature*&gt; mFeatures;</p><p style="font-family: 'lucida Grande', Verdana; ">FeatureMap<span style="font-family: 宋体; ">提供对外的接口再具体调具体的派生类</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">ContinuousFeature</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">有一个数据类型</span></p><p style="font-family: 'lucida Grande', Verdana; ">typedef std::map&lt;int, std::pair&lt;float, float&gt; &gt; interval_t</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">所以的操作都是对其进行</span></p><p style="font-family: 'lucida Grande', Verdana; ">1.<span style="font-family: 宋体; ">添加数据</span></p><p style="font-family: 'lucida Grande', Verdana; ">2.<span style="font-family: 宋体; ">返回查找的值</span></p><p style="font-family: 'lucida Grande', Verdana; ">3.<span style="font-family: 宋体; ">保存文件和读取文件</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">DataSet mTrainingData;&nbsp;<span style="font-family: 宋体; ">仿函数的实战教程</span></p><p style="font-family: 'lucida Grande', Verdana; ">for_each&nbsp;<span style="font-family: 宋体; ">和</span>&nbsp;transform</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">仿函数的关键</span></p><p style="font-family: 'lucida Grande', Verdana; ">template&lt; typename _t &gt;</p><p style="font-family: 'lucida Grande', Verdana; ">class cfun</p><p style="font-family: 'lucida Grande', Verdana; ">{</p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;operator () ( _t value )</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;{</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;do some thing&nbsp;</span><span style="font-family: 宋体; ">给遍历用的</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;}</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;operator int() const&nbsp;</span><span style="font-family: 宋体; ">这个是返回整形，其他类型根据需要定</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;{</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span><span style="font-family: 宋体; ">给返回值用</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;}</span></p><p style="font-family: 'lucida Grande', Verdana; ">}</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">数学</span>:<span>&nbsp;&nbsp;log2(N)&nbsp;</span><span style="font-family: 宋体; ">在代码中这样获取</span>&nbsp;log10(N)/log10(2)</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">决策树的生成</span></p><p style="font-family: 'lucida Grande', Verdana; ">ID3<span style="font-family: 宋体; ">算法：从一系列统计数据中得到关键属性，并进行分类。重要概念是熵和信息增益</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p align="left" style="margin-bottom: 9pt; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">从</span><span style="line-height: 25px; font-family: Georgia; color: #333333; "><a href="http://en.wikipedia.org/wiki/Information_theory" target="_blank" style="outline-style: none; outline-width: initial; outline-color: initial; text-decoration: underline; cursor: pointer; color: #4e5d80; "><span style="font-family: 宋体; color: #3d81ee; text-decoration: none; ">信息论</span></a></span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">知识中我们直到，期望信息越小，</span><span style="line-height: 25px; font-family: Georgia; color: #333333; "><a href="http://en.wikipedia.org/wiki/Information_gain" target="_blank" style="outline-style: none; outline-width: initial; outline-color: initial; text-decoration: underline; cursor: pointer; color: #4e5d80; "><span style="font-family: 宋体; color: #3d81ee; text-decoration: none; ">信息增益</span></a></span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">越大，从而纯度越高。所以</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">ID3</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">算法的核心思想就是以信息增益度量属性选择，选择分裂后信息增益最大的属性进行分裂。下面先定义几个要用到的概念。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">设</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">为用类别对训练元组进行的划分，则</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">的</span><span style="line-height: 25px; font-family: Georgia; color: #333333; "><a href="http://en.wikipedia.org/wiki/Entropy" target="_blank" style="outline-style: none; outline-width: initial; outline-color: initial; text-decoration: underline; cursor: pointer; color: #4e5d80; "><span style="font-family: 宋体; color: #3d81ee; text-decoration: none; ">熵</span></a></span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">（</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">entropy</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">）表示为：</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp; &nbsp;&nbsp;</span><span class="Apple-style-span" style="color: #333333; font-family: Georgia, 'Times New Roman', Times, san-serif; "><img title="info(D)=-\sum ^m_{i=1}p_ilog_2(p_i)" src="http://latex.codecogs.com/gif.latex?info(D)=-\sum%20^m_{i=1}p_ilog_2(p_i)" alt="" style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; " /></span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">其中</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">pi</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">表示第</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">i</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">个类别在整个训练元组中出现的概率，可以用属于此类别元素的数量除以训练元组元素总数量作为估计。熵的实际意义表示是</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">中元组的类标号所需要的平均信息量。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">现在我们假设将训练元组</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">按属性</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">A</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">进行划分，则</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">A</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">对</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">划分的期望信息为：</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp; &nbsp;&nbsp;</span><span class="Apple-style-span" style="color: #333333; font-family: Georgia, 'Times New Roman', Times, san-serif; "><img title="info_A(D)=\sum ^v_{j=1}\frac{|D_j|}{|D|}info(D_j)" src="http://latex.codecogs.com/gif.latex?info_A(D)=\sum%20^v_{j=1}\frac{|D_j|}{|D|}info(D_j)" alt="" style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; " /></span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">而信息增益即为两者的差值：</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp; &nbsp;&nbsp;</span><span class="Apple-style-span" style="color: #333333; font-family: Georgia, 'Times New Roman', Times, san-serif; "><img title="gain(A)=info(D)-info_A(D)" src="http://latex.codecogs.com/gif.latex?gain(A)=info(D)-info_A(D)" alt="" style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; " /></span></p><p align="left" style="margin-top: 9pt; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp; ID3</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">算法就是在每次需要分裂时，计算每个属性的增益率，然后选择增益率最大的属性进行分裂</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">。下面我们继续用</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">SNS</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">社区中不真实账号检测的例子说明如何使用</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">ID3</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">算法构造决策树。为了简单起见，我们假设训练集合包含</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">10</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">个元素：</span></p><p align="left" style="margin-top: 9pt; text-align: left; line-height: 25px; word-break: break-all; "><font class="Apple-style-span" color="#333333" face="Georgia"></font></p><div><font class="Apple-style-span" color="#333333" face="Georgia"><img src="http://images.cnblogs.com/cnblogs_com/leoo2sk/WindowsLiveWriter/34d255f282ae_B984/2_3.png"  alt="" /></font></div><p>&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p align="left" style="margin-bottom: 9pt; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">其中</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">s</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">、</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">m</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">和</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">l</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">分别表示小、中和大。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">设</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">L</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">、</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">F</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">、</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">H</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">和</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">R</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">表示日志密度、好友密度、是否使用真实头像和账号是否真实，下面计算各属性的信息增益。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">这里是计算结果的熵，共有</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">10</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">个结果</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">, 3</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">个是</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">no, 7</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">个</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">yes,&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">如果结果不只是（</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">yes</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">和</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">no</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">两个结果，那么就按多个来算，那个倒</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">M</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">符合的意义就在这里）</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; "><font class="Apple-style-span" color="#333333" face="Georgia"></font></p><div><font class="Apple-style-span" color="#333333" face="Georgia"><span style="font-family: Georgia, 'Times New Roman', Times, san-serif; font-size: 14px; "><img title="info(D)=-0.7log_20.7-0.3log_20.3=0.7*0.51+0.3*1.74=0.879" src="http://latex.codecogs.com/gif.latex?info(D)=-0.7log_20.7-0.3log_20.3=0.7*0.51+0.3*1.74=0.879" alt="" style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; " /></span></font></div><p>&nbsp;</p><p style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp; &nbsp;&nbsp;</span><span class="Apple-style-span" style="color: #ff6600; font-family: Georgia, 'Times New Roman', Times, san-serif; line-height: 21px; "><img title="info_L(D)=0.3*(-\frac{0}{3}log_2\frac{0}{3}-\frac{3}{3}log_2\frac{3}{3})+0.4*(-\frac{1}{4}log_2\frac{1}{4}-\frac{3}{4}log_2\frac{3}{4})+0.3*(-\frac{1}{3}log_2\frac{1}{3}-\frac{2}{3}log_2\frac{2}{3})=0+0.326+0.277=0.603" src="http://latex.codecogs.com/gif.latex?info_L(D)=0.3*(-\frac{0}{3}log_2\frac{0}{3}-\frac{3}{3}log_2\frac{3}{3})+0.4*(-\frac{1}{4}log_2\frac{1}{4}-\frac{3}{4}log_2\frac{3}{4})+0.3*(-\frac{1}{3}log_2\frac{1}{3}-\frac{2}{3}log_2\frac{2}{3})=0+0.326+0.277=0.603" alt="" style="border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; " /></span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #ff6600; "></span></p><div><span style="font-family: Georgia, 'Times New Roman', Times, san-serif; font-size: 14px; "><p style="line-height: 1.8; margin-top: 12px; margin-right: auto; margin-bottom: 12px; margin-left: auto; text-indent: 0px; ">&nbsp;&nbsp; &nbsp; &nbsp;</p></span></div>所以根据公式详细点应该为</span><p>&nbsp;</p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;-</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">（</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">&nbsp;7 / 10&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">）</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">log2( 7 / 10 ) &#8211; ( 3 / 10 ) log2( 3 / 10 )</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">计算日志的熵</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">:</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; "><font class="Apple-style-span" color="#333333" face="Georgia"><div><span style="color: #232323; font-family: verdana, Arial, Helvetica; font-size: 14px; line-height: 28px; "><img title="gain(L)=0.879-0.603=0.276" src="http://latex.codecogs.com/gif.latex?gain(L)=0.879-0.603=0.276" alt="" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; font-family: verdana, Arial, Helvetica; font-size: 12px; color: #232323; border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; border-style: initial; border-color: initial; border-top-style: none; border-right-style: none; border-bottom-style: none; border-left-style: none; border-width: initial; border-color: initial; " /></span></div></font></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">详解</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">:</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">第一项是按日志</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">L</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">划分，即。</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">&nbsp;(3 / 10) * ( - ( 0 / 3 ) log2( 0 / 3 ) &#8211; ( 3 / 3 )log2( 3 / 3 ) ),&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">这个</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">&nbsp;3 / 10&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">是指</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">l</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">值有</span><span style="line-height: 25px; font-family: Georgia; color: #ff6600; ">3</span><span style="line-height: 25px; font-family: 宋体; color: #ff6600; ">个</span>&nbsp;</p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">因此日志密度的信息增益是</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">0.276</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">用同样方法得到</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">H</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">和</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">F</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">的信息增益分别为</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">0.033</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">和</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">0.553</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">因为</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">F</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">具有最大的信息增益，所以第一次分裂选择</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">F</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">为分裂属性，分裂后的结果如下图表示：</span></p><div><span style="color: #232323; font-family: verdana, Arial, Helvetica; font-size: 14px; line-height: 28px; "><img src="http://images.cnblogs.com/cnblogs_com/leoo2sk/WindowsLiveWriter/34d255f282ae_B984/3_3.png" border="0" alt="" width="603" height="517" style="margin-top: 0px; margin-right: auto; margin-bottom: 0px; margin-left: auto; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; font-family: verdana, Arial, Helvetica; font-size: 12px; color: #232323; border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; border-style: initial; border-color: initial; border-width: initial; border-color: initial; display: block; float: none; border-style: initial; border-color: initial; " /></span></div><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">在上图的基础上，再递归使用这个方法计算子节点的分裂属性，最终就可以得到整个决策树。</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">上面为了简便，将特征属性离散化了，其实日志密度和好友密度都是连续的属性。对于特征属性为连续值，可以如此使用</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">ID3</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">算法：</span></p><p align="left" style="margin-top: 9pt; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">先将</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">中元素按照特征属性排序，则每两个相邻元素的中间点可以看做潜在分裂点，从第一个潜在分裂点开始，分裂</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">D</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">并计算两个集合的期望信息，具有最小期望信息的点称为这个属性的最佳分裂点，其信息期望作为此属性的信息期望。</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">底数公式</span></p><pre style="margin-bottom: 7.5pt; line-height: 18pt; font-family: 'lucida Grande', Verdana; "><span style="color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">&#9312;</span><span style="font-family: Arial; color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">loga(MN)=loga(M)+loga(N)</span></pre><pre style="margin-bottom: 7.5pt; line-height: 18pt; font-family: 'lucida Grande', Verdana; "><span style="color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">&#9313;</span><span style="font-family: Arial; color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">loga(M/N)=loga(M)-loga(N)</span></pre><pre style="margin-bottom: 7.5pt; line-height: 18pt; font-family: 'lucida Grande', Verdana; ">&nbsp;</pre><pre style="margin-bottom: 7.5pt; line-height: 18pt; font-family: 'lucida Grande', Verdana; "><span style="color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">代码中</span><span style="font-family: Arial; color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">info()</span><span style="color: black; background-image: initial; background-attachment: initial; background-origin: initial; background-clip: initial; background-color: white; background-position: initial initial; background-repeat: initial initial; ">实现那段函数实质上是经过一推导没有直接用</span></pre><p><font class="Apple-style-span" color="#333333" face="Georgia"><br /></font></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; color: #333333; ">假设现在套用标准的公式，有</span><span style="font-family: Georgia; color: #333333; ">&nbsp;C 1 + C2 = T</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = - ( C1 / T )* log2 ( C 1 / T )&nbsp;&nbsp;-&nbsp;&nbsp;( C2 / T )* log2 ( C 2 / T )</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">根据底数公式</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">2</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">可以得出</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = -( C1 / T ) * (log2C1 &#8211;log2T) - ( C2 / T ) * (log2C2 &#8211;log2T)</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">把被除数</span><span style="line-height: 25px; font-family: Georgia; color: #333333; ">T</span><span style="line-height: 25px; font-family: 宋体; color: #333333; ">移出来</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = ( - C1&nbsp;&nbsp;* (log2C1 &#8211;log2T)&nbsp;&nbsp;-&nbsp;&nbsp;C2 * (log2C2 &#8211;log2T) ) /T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = ( - C1 log2C1&nbsp;&nbsp;+ C1log2T&nbsp;&nbsp;-&nbsp;&nbsp;C2 log2C2&nbsp;&nbsp;+ C2log2T&nbsp;&nbsp;) /T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = ( - C1 log2C1&nbsp;&nbsp;-&nbsp;&nbsp;C2 log2C2&nbsp;&nbsp;+&nbsp;&nbsp;C1log2T&nbsp;&nbsp;+&nbsp;&nbsp;C2log2T&nbsp;&nbsp;) /T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = ( - C1 log2C1&nbsp;&nbsp;-&nbsp;&nbsp;C2 log2C2&nbsp;&nbsp;+&nbsp;&nbsp;( C1&nbsp;&nbsp;+&nbsp;&nbsp;C2 ) log2T ) /T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">C1 + C2 = T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: Georgia; color: #333333; ">Info = ( - C1 log2C1&nbsp;&nbsp;-&nbsp;&nbsp;C2 log2C2&nbsp;&nbsp;+&nbsp;&nbsp;T log2T ) /T</span></p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">所以代码里最终是用这个公式来计算的，明显减少了除法运算量</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p align="left" style="margin-top: 9pt; margin-right: 0cm; margin-bottom: 9pt; margin-left: 0cm; text-align: left; line-height: 25px; word-break: break-all; font-family: 'lucida Grande', Verdana; "><span style="line-height: 25px; font-family: 宋体; color: #333333; ">代码中决策树生成的代码分析：</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;static TreeNode* learnNode (const std::string&amp; targetName,</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;const std::string&amp; columnName, const col_t&amp; column,</span></p><p style="font-family: 'lucida Grande', Verdana; "><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;const col_set_t&amp; workingSet, unsigned int threshold);</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; ">targetName:&nbsp;<span style="font-family: 宋体; ">决策树的目标属性</span></p><p style="font-family: 'lucida Grande', Verdana; ">columnName:&nbsp;<span style="font-family: 宋体; ">传入的属性名</span></p><p style="font-family: 'lucida Grande', Verdana; ">column:&nbsp;<span style="font-family: 宋体; ">属性数据</span></p><p style="font-family: 'lucida Grande', Verdana; ">workingSet:&nbsp;<span style="font-family: 宋体; ">分析数据集</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">这个函数完成了决策树的生成。根据熵的分裂理论。要对这个集进行分支划分。以其中最大值作上限，</span>0<span style="font-family: 宋体; ">作下限。在范围（</span>0<span style="font-family: 宋体; ">，最大值）之间每一个值都列为一个分支。这些分支作为本节点的子节点。子节点也要进行这样的处理。一直到&#8220;纯&#8221;的节点为止。</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">在程序实现上使用递归来实现处理</span>.</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">处理步骤</span></p><p style="margin-left: 18pt; text-indent: -18pt; font-family: 'lucida Grande', Verdana; "><span>1.<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span></span><span style="font-family: 宋体; ">设置递归返回点。根据这组属性的&#8220;纯度&#8221;来决定是否为递归终点，在这里返回的是叶子节点的值，就是决策结果。</span></p><p style="margin-left: 18pt; text-indent: -18pt; font-family: 'lucida Grande', Verdana; "><span>2.<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span></span><span style="font-family: 宋体; ">对所有分支进行处理，提取分支所属的数据</span>,&nbsp;<span style="font-family: 宋体; ">获取其中最大增益信息组，并开始递归</span></p><p style="margin-left: 18pt; text-indent: -18pt; font-family: 'lucida Grande', Verdana; "><span>3.<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span></span><span style="font-family: 宋体; ">填充范围值内缺乏分支的节点，取近似的值的分支作为填充。</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">把决策树写成</span>lua<span style="font-family: 宋体; ">配置</span></p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">每个属性的范围划分，这部分数据是从</span>features.dat<span style="font-family: 宋体; ">从拷出来，添加到</span>lua<span style="font-family: 宋体; ">中的。</span></p><p style="font-family: 'lucida Grande', Verdana; ">&nbsp;</p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">这里也应用了递归处理。</span></p><p style="font-family: 'lucida Grande', Verdana; "><span style="font-family: 宋体; ">遍历每个子节点的分支到叶子返回结果。</span></p></div><img src ="http://www.cppblog.com/bookheart/aggbug/161869.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/bookheart/" target="_blank">菜鸟想学飞</a> 2012-05-20 13:29 <a href="http://www.cppblog.com/bookheart/archive/2012/05/20/161869.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item></channel></rss>