﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-不会飞的鸟</title><link>http://www.cppblog.com/noflybird/</link><description>2010年12月10日 ... 不鸟他们！！！ 我要用自己开发的分布式文件系统、分布式调度系统、分布式检索系统， 做自己的搜索引擎！！！大鱼有大志！！！ ---杨书童</description><language>zh-cn</language><lastBuildDate>Wed, 22 Apr 2026 02:26:32 GMT</lastBuildDate><pubDate>Wed, 22 Apr 2026 02:26:32 GMT</pubDate><ttl>60</ttl><item><title>[转]Win7下使用VS2010编译的Win32控制台应用程序在XP下运行报错：找不到msvcp100d.dll</title><link>http://www.cppblog.com/noflybird/archive/2015/02/26/209876.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 26 Feb 2015 10:53:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2015/02/26/209876.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/209876.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2015/02/26/209876.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/209876.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/209876.html</trackback:ping><description><![CDATA[<div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #008000; ">//</span><span style="color: #008000; ">targetver.h</span><span style="color: #008000; "><br /></span><br />#pragma&nbsp;once<br /><br /><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;包括&nbsp;SDKDDKVer.h&nbsp;将定义可用的最高版本的&nbsp;Windows&nbsp;平台。<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;如果要为以前的&nbsp;Windows&nbsp;平台生成应用程序，请包括&nbsp;WinSDKVer.h，并将<br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;WIN32_WINNT&nbsp;宏设置为要支持的平台，然后再包括&nbsp;SDKDDKVer.h。</span><span style="color: #008000; "><br /></span><br />#include&nbsp;&lt;WinSDKVer.h&gt;<br /><br /><span style="color: #0000FF; ">#define</span>&nbsp;_WIN32_WINNT&nbsp;_WIN32_WINNT_WINXP<br /><br />#include&nbsp;&lt;SDKDDKVer.h&gt;</div><br /><p style="color: #333333; font-family: Arial; font-size: 13.6842107772827px; line-height: 26px; background-color: #ffffff;">1、如上：在targetver.h中添加代码</p><p style="color: #333333; font-family: Arial; font-size: 13.6842107772827px; line-height: 26px; background-color: #ffffff;">2、如下：修改运行库</p><img src="http://img.blog.csdn.net/20130614140023046"  alt="" /><br /><br /><img src ="http://www.cppblog.com/noflybird/aggbug/209876.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2015-02-26 18:53 <a href="http://www.cppblog.com/noflybird/archive/2015/02/26/209876.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]Linux下g++编译C++连接oracle（OCCI）出现的问题及解决方式</title><link>http://www.cppblog.com/noflybird/archive/2014/12/18/209221.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 18 Dec 2014 05:04:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/12/18/209221.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/209221.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/12/18/209221.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/209221.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/209221.html</trackback:ping><description><![CDATA[<span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">由于项目原因，开始学习C++，刚接触半个多月，今天参考网上例子，写了个简单的C++连接ORACLE的DEMO，可是使用g++编译时不顺利，不是报这个错就是那个，最后参考网上的解决方式和个人理解，终于调试好了，现把编译中出现的问题和解决方法总结出来。&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 源代码&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" />&nbsp;<div style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;">C++代码</div></div><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">#include&nbsp;&lt;iostream&gt;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">#include&nbsp;&lt;string&gt;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">#include&nbsp;"occi.h"&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;"><span style="color: blue;">using</span>&nbsp;<span style="color: blue;">namespace</span>&nbsp;oracle::occi;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;"><span style="color: blue;">using</span>&nbsp;<span style="color: blue;">namespace</span>&nbsp;std;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;"><span style="color: #2e8b57; font-weight: bold;">int</span>&nbsp;main()&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">{&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;usr=<span style="color: red;">"sys"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;pwd=<span style="color: red;">"orcl"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;SID=<span style="color: red;">"ORCL"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;date;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Environment&nbsp;*env=Environment::createEnvironment(Environment::OBJECT);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection&nbsp;*conn=&nbsp;env-&gt;createConnection(usr,pwd,SID);//all&nbsp;strings&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">if</span>(conn)&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"success&nbsp;createConnection!"</span>&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">else</span>&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"failure&nbsp;createConnection!"</span>&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Statement&nbsp;*stmt&nbsp;=&nbsp;conn-&gt;createStatement();&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;sSQL&nbsp;=&nbsp;<span style="color: red;">"select&nbsp;to_char(sysdate,'yyyy-mm-dd&nbsp;hh24:mi:ss')&nbsp;from&nbsp;dual"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;stmt-&gt;setSQL(sSQL);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ResultSet&nbsp;*rs&nbsp;=&nbsp;stmt-&gt;executeQuery();&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">if</span>(rs-&gt;next())&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;date&nbsp;=&nbsp;rs-&gt;getString(1);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"now&nbsp;time&nbsp;:"</span>&lt;&lt;date&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;env-&gt;terminateConnection(conn);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Environment::terminateEnvironment(env);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">return</span>&nbsp;0;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">}&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;</li></ol></div><div bg_c++"="" style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;"><strong>[c++]</strong>&nbsp;<a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="view plain">view plain</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="copy">copy</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="print">print</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="?">?</a><div style="position: absolute; left: 518px; top: 1425px; width: 29px; height: 14px; z-index: 99;"></div></div></div><ol start="1"><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">#include&nbsp;&lt;iostream&gt;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">#include&nbsp;&lt;string&gt;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">#include&nbsp;"occi.h"&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;"><span style="color: blue;">using</span>&nbsp;<span style="color: blue;">namespace</span>&nbsp;oracle::occi;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;"><span style="color: blue;">using</span>&nbsp;<span style="color: blue;">namespace</span>&nbsp;std;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;"><span style="color: #2e8b57; font-weight: bold;">int</span>&nbsp;main()&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">{&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;usr=<span style="color: red;">"sys"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;pwd=<span style="color: red;">"orcl"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;SID=<span style="color: red;">"ORCL"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;date;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Environment&nbsp;*env=Environment::createEnvironment(Environment::OBJECT);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Connection&nbsp;*conn=&nbsp;env-&gt;createConnection(usr,pwd,SID);//all&nbsp;strings&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">if</span>(conn)&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"success&nbsp;createConnection!"</span>&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">else</span>&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"failure&nbsp;createConnection!"</span>&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Statement&nbsp;*stmt&nbsp;=&nbsp;conn-&gt;createStatement();&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;string&nbsp;sSQL&nbsp;=&nbsp;<span style="color: red;">"select&nbsp;to_char(sysdate,'yyyy-mm-dd&nbsp;hh24:mi:ss')&nbsp;from&nbsp;dual"</span>;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;stmt-&gt;setSQL(sSQL);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ResultSet&nbsp;*rs&nbsp;=&nbsp;stmt-&gt;executeQuery();&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">if</span>(rs-&gt;next())&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;date&nbsp;=&nbsp;rs-&gt;getString(1);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;cout&lt;&lt;<span style="color: red;">"now&nbsp;time&nbsp;:"</span>&lt;&lt;date&lt;&lt;endl;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;env-&gt;terminateConnection(conn);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Environment::terminateEnvironment(env);&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: blue;">return</span>&nbsp;0;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">}&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;</li></ol></div><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 本人linux上安装oracle路径：/opt/app/oracle/product/10.2.0/db_1&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 编译命令：g++ -o conn -L/opt/app/oracle/product/10.2.0/db_1/lib -L/opt/oracle/product/10.2.0/db_1/rdbms/lib conn_db.cpp -g&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">问题一：编译时报如下错误：&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;&nbsp;</span><div style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;">Shell代码</div></div><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[oracle@localhost&nbsp;demo]$&nbsp;g++&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-L/opt/app/oracle/product/10.2.0/db_1/lib&nbsp;-L/opt/oracle/product/10.2.0/db_1/rdbms/lib&nbsp;-lclntsh&nbsp;-locci&nbsp;/usr/lib/libstdc++.so.5&nbsp;conn_db.cpp&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">g++:&nbsp;g++:&nbsp;No&nbsp;such&nbsp;file&nbsp;or&nbsp;directory&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:3:18:&nbsp;error:&nbsp;occi.h:&nbsp;No&nbsp;such&nbsp;file&nbsp;or&nbsp;directory&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:4:&nbsp;error:&nbsp;<span style="color: red;">'oracle'</span>&nbsp;has&nbsp;not&nbsp;been&nbsp;declared&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:4:&nbsp;error:&nbsp;<span style="color: red;">'occi'</span>&nbsp;is&nbsp;not&nbsp;a&nbsp;namespace-name&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:4:&nbsp;error:&nbsp;expected&nbsp;namespace-name&nbsp;before&nbsp;<span style="color: red;">';'</span>&nbsp;token&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:&nbsp;In&nbsp;function&nbsp;<span style="color: red;">'int&nbsp;main()'</span>:&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:14:&nbsp;error:&nbsp;<span style="color: red;">'Environment'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:14:&nbsp;error:&nbsp;<span style="color: red;">'env'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:14:&nbsp;error:&nbsp;<span style="color: red;">'Environment'</span>&nbsp;is&nbsp;not&nbsp;a&nbsp;class&nbsp;or&nbsp;namespace&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:14:&nbsp;error:&nbsp;<span style="color: red;">'Environment'</span>&nbsp;is&nbsp;not&nbsp;a&nbsp;class&nbsp;or&nbsp;namespace&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:15:&nbsp;error:&nbsp;<span style="color: red;">'Connection'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:15:&nbsp;error:&nbsp;<span style="color: red;">'conn'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:21:&nbsp;error:&nbsp;<span style="color: red;">'Statement'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:21:&nbsp;error:&nbsp;<span style="color: red;">'stmt'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:26:&nbsp;error:&nbsp;<span style="color: red;">'ResultSet'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:26:&nbsp;error:&nbsp;<span style="color: red;">'rs'</span>&nbsp;was&nbsp;not&nbsp;declared&nbsp;in&nbsp;this&nbsp;scope&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">conn_db.cpp:35:&nbsp;error:&nbsp;<span style="color: red;">'Environment'</span>&nbsp;is&nbsp;not&nbsp;a&nbsp;class&nbsp;or&nbsp;namespace&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</li></ol></div><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">解决：</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">编译时没有引入OCCI头文件，如果没有，先下载对应的 ORACLE client安装，比如我的是oracle10g,下载了oracle-instantclient-basic- 10.2.0.4-1.i386.zip，解压到一个目录下(/home/oracle/oracle/include)，然后在编译文件的时候引进这个解压目录&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp; 编译命令增加OCCI目录：g++ -o conn&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">-I/home/oracle/oracle/include&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">-L/opt/app/oracle/product/10.2.0/db_1/lib -L/opt/oracle/product/10.2.0/db_1/rdbms/lib conn_db.cpp -g&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">问题2：找不到对应函数&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" />&nbsp;<div style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;">Shell代码</div></div><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;[oracle@localhost&nbsp;demo]$&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-I/home/oracle/oracle/include&nbsp;-L/opt/app/oracle/product/10.2.0/db_1/lib&nbsp;-L/opt/oracle/product/10.2.0/db_1/rdbms/lib&nbsp;conn_db.cpp&nbsp;-Wall&nbsp;-O&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/tmp/cclFs9xq.o:&nbsp;In&nbsp;function&nbsp;`main':&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/home/oracle/oracle/demo/conn_db.cpp:14:&nbsp;undefined&nbsp;reference&nbsp;to&nbsp;`oracle::occi::Environment::createEnvironment(oracle::occi::Environment::Mode,&nbsp;void*,&nbsp;void*&nbsp;(*)(void*,&nbsp;unsigned&nbsp;int),&nbsp;void*&nbsp;(*)(void*,&nbsp;void*,&nbsp;unsigned&nbsp;int),&nbsp;void&nbsp;(*)(void*,&nbsp;void*))'&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/home/oracle/oracle/demo/conn_db.cpp:35:&nbsp;undefined&nbsp;reference&nbsp;to&nbsp;`oracle::occi::Environment::terminateEnvironment(oracle::occi::Environment*)'&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">collect2:&nbsp;ld&nbsp;returned&nbsp;1&nbsp;exit&nbsp;status&nbsp;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;</li></ol></div><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">解决：</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">增加libocci.so和libclntsh.so指定编译&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 修改后的编译命令： g++ -o conn -I/home/oracle/oracle/include -L/opt/app/oracle/product/10.2.0/db_1/lib -L/opt/oracle/product/10.2.0/db_1/rdbms/lib conn_db.cpp&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">-lclntsh -locci</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;-Wall -O -g&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 另外可能在引入-lclntsh -locci编译时可能会报找不到以下错误：&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;&nbsp;</span><div style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;">Shell代码</div></div><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[oracle@localhost&nbsp;demo]$&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-I/home/oracle/oracle/include&nbsp;-L/opt/app/oracle/product/10.2.0/db_1/lib&nbsp;-L/opt/oracle/product/10.2.0/db_1/rdbms/lib&nbsp;conn_db.cpp&nbsp;-lclntsh&nbsp;-locci&nbsp;/usr/lib/libstdc++.so.5&nbsp;-Wall&nbsp;-O&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/usr/bin/ld:&nbsp;cannot&nbsp;find&nbsp;-lclntsh&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">collect2:&nbsp;ld&nbsp;returned&nbsp;1&nbsp;exit&nbsp;status&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">[oracle@localhost&nbsp;demo]$&nbsp;&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</li></ol></div><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">解决</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">：这是因为没有找到libclntsh.so和libocci.so链接库,你在可以把oracle client安装目录下把这两个文件拷贝到$ORACLE_HOME/lib目录下，或加到/usr/lib目录下就可以了&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">问题三：occi在linux编译运行时报libstdc++.so.6冲突的问题&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;&nbsp;</span><div style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;">Java代码</div></div><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">[oracle<span style="color: #646464;">@localhost</span>&nbsp;demo]$&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-I/home/oracle/oracle/include&nbsp;-L/opt/app/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/lib&nbsp;-L/opt/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/rdbms/lib&nbsp;conn_db.cpp&nbsp;-lclntsh&nbsp;-locci&nbsp;-Wall&nbsp;-O&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/usr/bin/ld:&nbsp;warning:&nbsp;libstdc++.so.<span style="color: #c00000;">5</span>,&nbsp;needed&nbsp;by&nbsp;/opt/app/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/lib/libocci.so,&nbsp;may&nbsp;conflict&nbsp;with&nbsp;libstdc++.so.<span style="color: #c00000;">6</span>&nbsp;&nbsp;</li></ol></div><div bg_java"="" style="width: 936.53125px; line-height: 26px;"><div><div style="border-left-color: #999999;"><strong>[java]</strong>&nbsp;<a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="view plain">view plain</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="copy">copy</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="print">print</a><a href="http://blog.csdn.net/gyanp/article/details/6107044#" title="?">?</a><div style="position: absolute; left: 520px; top: 3992px; width: 29px; height: 14px; z-index: 99;"></div></div></div><ol start="1"><li style="border-left-color: #999999; background-color: #f5fae2; line-height: 18px;">[oracle<span style="color: #646464;">@localhost</span>&nbsp;demo]$&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-I/home/oracle/oracle/include&nbsp;-L/opt/app/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/lib&nbsp;-L/opt/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/rdbms/lib&nbsp;conn_db.cpp&nbsp;-lclntsh&nbsp;-locci&nbsp;-Wall&nbsp;-O&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">/usr/bin/ld:&nbsp;warning:&nbsp;libstdc++.so.<span style="color: #c00000;">5</span>,&nbsp;needed&nbsp;by&nbsp;/opt/app/oracle/product/<span style="color: #c00000;">10.2</span>.<span style="color: #c00000;">0</span>/db_1/lib/libocci.so,&nbsp;may&nbsp;conflict&nbsp;with&nbsp;libstdc++.so.<span style="color: #c00000;">6</span>&nbsp;&nbsp;</li></ol></div><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">解决</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">：OCCI库在linux编译的时候，由于linux版本太高，会提示以上情况，实际上，在大多数linux系统上，还保留有libstdc++5的库，自己手工在编译的时候加上去就好了&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 修改后的编译命令：g++ -o conn -I/home/oracle/oracle/include -L/opt/app/oracle/product/10.2.0/db_1/lib -L/opt/oracle/product/10.2.0/db_1/rdbms/lib -lclntsh -locci&nbsp;</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff; color: red;">/usr/lib/libstdc++.so.5</span><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;conn_db.cpp -g&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp; 编译通过后执行结果输出：&nbsp;</span><br style="font-family: Arial; line-height: 26px; background-color: #ffffff;" /><span style="font-family: Arial; line-height: 26px; background-color: #ffffff;">&nbsp;&nbsp;</span><div style="font-family: Arial; line-height: 26px; background-color: #ffffff;"><div>Shell代码</div></div><div style="width: 936.53125px; line-height: 26px;"><ol><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">[oracle@localhost&nbsp;demo]$&nbsp;g++&nbsp;-o&nbsp;conn&nbsp;-I/home/oracle/oracle/include&nbsp;-L/opt/app/oracle/product/10.2.0/db_1/lib&nbsp;-L/opt/oracle/product/10.2.0/db_1/rdbms/lib&nbsp;conn_db.cpp&nbsp;-lclntsh&nbsp;-locci&nbsp;/usr/lib/libstdc++.so.5&nbsp;-Wall&nbsp;-O&nbsp;-g&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">[oracle@localhost&nbsp;demo]$&nbsp;./conn&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">success&nbsp;createConnection!&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">now&nbsp;time&nbsp;:2010-11-14&nbsp;22:49:24&nbsp;&nbsp;</li><li style="border-left-color: #999999; background-color: #f5fae2; color: #555555; line-height: 18px;">[oracle@localhost&nbsp;demo]$&nbsp;&nbsp;</li></ol></div><p style="margin: 0px; padding: 0px; font-family: Arial; line-height: 26px; background-color: #ffffff;"><br /></p><img src ="http://www.cppblog.com/noflybird/aggbug/209221.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-12-18 13:04 <a href="http://www.cppblog.com/noflybird/archive/2014/12/18/209221.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>各种字符串Hash函数比较</title><link>http://www.cppblog.com/noflybird/archive/2014/11/10/208836.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Mon, 10 Nov 2014 11:53:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/11/10/208836.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/208836.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/11/10/208836.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/208836.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/208836.html</trackback:ping><description><![CDATA[<h1><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">常用的字符串Hash函数还有ELFHash，APHash等等，都是十分简单有效的方法。这些函数使用位运算使得每一个字符都对最后的函数值产生影响。另外还有以MD5和SHA1为代表的杂凑函数，这些函数几乎不可能找到碰撞。</p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">常用字符串哈希函数有BKDRHash，APHash，DJBHash，JSHash，RSHash，SDBMHash，PJWHash，ELFHash等等。对于以上几种哈希函数，我对其进行了一个小小的评测。<br /></p><table border="1" style="border-spacing: 0px; width: 593px; margin-bottom: 20px; border-style: solid; border-color: #dddddd; color: #555555; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; line-height: 22.5px; text-align: justify; background-color: #ffffff;"><colgroup><col width="72" span="10"></colgroup><tbody><tr height="19" style="background-color: #fafafa; background-position: initial initial; background-repeat: initial initial;"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">Hash函数</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据1</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据2</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据3</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据4</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据1得分</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据2得分</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据3得分</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">数据4得分</td><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">平均分</td></tr><tr height="19"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">BKDRHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">2</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4774</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">481</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">96.55</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">90.95</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">82.05</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">92.64</td></tr><tr height="19" style="background-color: #fafafa; background-position: initial initial; background-repeat: initial initial;"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">APHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">2</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">3</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4754</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">493</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">96.55</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">88.46</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">51.28</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">86.28</td></tr><tr height="19"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">DJBHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">2</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">2</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4975</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">474</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">96.55</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">92.31</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">83.43</td></tr><tr height="19" style="background-color: #fafafa; background-position: initial initial; background-repeat: initial initial;"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">JSHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">1</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4761</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">506</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">84.62</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">96.83</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">17.95</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">81.94</td></tr><tr height="19"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">RSHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">1</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4861</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">505</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">100</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">51.58</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">20.51</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">75.96</td></tr><tr height="19" style="background-color: #fafafa; background-position: initial initial; background-repeat: initial initial;"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">SDBMHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">3</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">2</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4849</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">504</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">93.1</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">92.31</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">57.01</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">23.08</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">72.41</td></tr><tr height="19"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">PJWHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">30</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">26</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4878</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">513</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">43.89</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">21.95</td></tr><tr height="19" style="background-color: #fafafa; background-position: initial initial; background-repeat: initial initial;"><td style="border-style: solid; border-color: #eeeeee; padding: 4px;">ELFHash</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">30</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">26</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">4878</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">513</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">43.89</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">0</td><td align="right" style="border-style: solid; border-color: #eeeeee; padding: 4px;">21.95</td></tr></tbody></table><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">其中数据1为100000个字母和数字组成的随机串哈希冲突个数。数据2为100000个有意义的英文句子哈希冲突个数。数据3为数据1的哈希值与1000003(大素数)求模后存储到线性表中冲突的个数。数据4为数据1的哈希值与10000019(更大素数)求模后存储到线性表中冲突的个数。</p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">经过比较，得出以上平均得分。平均数为平方平均数。可以发现，BKDRHash无论是在实际效果还是编码实现中，效果都是最突出的。APHash也是较为优秀的算法。DJBHash,JSHash,RSHash与SDBMHash各有千秋。PJWHash与ELFHash效果最差，但得分相似，其算法本质是相似的。</p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">在信息修竞赛中，要本着易于编码调试的原则，个人认为BKDRHash是最适合记忆和使用的。</p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">BYVoid原创，欢迎建议、交流、批评和指正。</p><span style="color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;">附：各种哈希函数的C语言程序代码</span><p style="margin-bottom: 1em; color: #444444; font-family: 'Open Sans', 'Helvetica Neue', Arial, Verdana, 'Hiragino Sans GB', STHeiti, SimSun, sans-serif; font-size: 15px; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"></p><pre style="font-family: Courier, 'Liberation Mono', Consolas, monospace; font-size: 15px; white-space: pre-wrap; word-wrap: break-word; color: #444444; font-weight: normal; line-height: 22.5px; text-align: justify; background-color: #ffffff;"><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;SDBMHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;equivalent&nbsp;to:&nbsp;hash&nbsp;=&nbsp;65599*hash&nbsp;+&nbsp;(*str++);</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">6</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">16</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">-</span><span style="color: #000000; ">&nbsp;hash;<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;RS&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;RSHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;b&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">378551</span><span style="color: #000000; ">;<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;a&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">63689</span><span style="color: #000000; ">;<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">&nbsp;a&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;a&nbsp;</span><span style="color: #000000; ">*=</span><span style="color: #000000; ">&nbsp;b;<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;JS&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;JSHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">1315423911</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">^=</span><span style="color: #000000; ">&nbsp;((hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">5</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&gt;&gt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">2</span><span style="color: #000000; ">));<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;P.&nbsp;J.&nbsp;Weinberger&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;PJWHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;BitsInUnignedInt&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">)(</span><span style="color: #0000FF; ">sizeof</span><span style="color: #000000; ">(unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">8</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;ThreeQuarters&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">)((BitsInUnignedInt&nbsp;&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">3</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">/</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">4</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;OneEighth&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">)(BitsInUnignedInt&nbsp;</span><span style="color: #000000; ">/</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">8</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;HighBits&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">)(</span><span style="color: #000000; ">0xFFFFFFFF</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;(BitsInUnignedInt&nbsp;</span><span style="color: #000000; ">-</span><span style="color: #000000; ">&nbsp;OneEighth);<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;test&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;OneEighth)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">if</span><span style="color: #000000; ">&nbsp;((test&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;HighBits)&nbsp;</span><span style="color: #000000; ">!=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;((hash&nbsp;</span><span style="color: #000000; ">^</span><span style="color: #000000; ">&nbsp;(test&nbsp;</span><span style="color: #000000; ">&gt;&gt;</span><span style="color: #000000; ">&nbsp;ThreeQuarters))&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">~</span><span style="color: #000000; ">HighBits));<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;ELF&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;ELFHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;x&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">4</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">if</span><span style="color: #000000; ">&nbsp;((x&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0xF0000000L</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">!=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">^=</span><span style="color: #000000; ">&nbsp;(x&nbsp;</span><span style="color: #000000; ">&gt;&gt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">24</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">&amp;=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">~</span><span style="color: #000000; ">x;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;BKDR&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;BKDRHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;seed&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">131</span><span style="color: #000000; ">;&nbsp;</span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;31&nbsp;131&nbsp;1313&nbsp;13131&nbsp;131313&nbsp;etc..</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">&nbsp;seed&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;DJB&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;DJBHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">5381</span><span style="color: #000000; ">;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">while</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">+=</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">5</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">+</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">);<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}<br /><br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;AP&nbsp;Hash&nbsp;Function</span><span style="color: #008000; "><br /></span><span style="color: #000000; ">unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;APHash(</span><span style="color: #0000FF; ">char</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str)<br />{<br />&nbsp;&nbsp;&nbsp;&nbsp;unsigned&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;hash&nbsp;</span><span style="color: #000000; ">=</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;<br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">int</span><span style="color: #000000; ">&nbsp;i;<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">for</span><span style="color: #000000; ">&nbsp;(i</span><span style="color: #000000; ">=</span><span style="color: #000000; ">0</span><span style="color: #000000; ">;&nbsp;</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str;&nbsp;i</span><span style="color: #000000; ">++</span><span style="color: #000000; ">)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">if</span><span style="color: #000000; ">&nbsp;((i&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">1</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">==</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0</span><span style="color: #000000; ">)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">^=</span><span style="color: #000000; ">&nbsp;((hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">7</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">^</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">^</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&gt;&gt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">3</span><span style="color: #000000; ">));<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">else</span><span style="color: #000000; "><br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;hash&nbsp;</span><span style="color: #000000; ">^=</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">~</span><span style="color: #000000; ">((hash&nbsp;</span><span style="color: #000000; ">&lt;&lt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">11</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">^</span><span style="color: #000000; ">&nbsp;(</span><span style="color: #000000; ">*</span><span style="color: #000000; ">str</span><span style="color: #000000; ">++</span><span style="color: #000000; ">)&nbsp;</span><span style="color: #000000; ">^</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&gt;&gt;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">5</span><span style="color: #000000; ">)));<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #0000FF; ">return</span><span style="color: #000000; ">&nbsp;(hash&nbsp;</span><span style="color: #000000; ">&amp;</span><span style="color: #000000; ">&nbsp;</span><span style="color: #000000; ">0x7FFFFFFF</span><span style="color: #000000; ">);<br />}</span></div></pre></h1><img src ="http://www.cppblog.com/noflybird/aggbug/208836.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-11-10 19:53 <a href="http://www.cppblog.com/noflybird/archive/2014/11/10/208836.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]C++的Json解析库：jsoncpp和boost </title><link>http://www.cppblog.com/noflybird/archive/2014/05/26/207100.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Sun, 25 May 2014 16:36:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/05/26/207100.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/207100.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/05/26/207100.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/207100.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/207100.html</trackback:ping><description><![CDATA[<p><span style="font-size: 13px;"><strong>JSON</strong>(JavaScript Object Notation)跟xml一样也是一种数据交换格式，了解json请参考其官网<a href="http://json.org/">http://json.org/</a>，本文不再对json做介绍，将重点介绍c++的json解析库的使用方法。json官网上列出了各种语言对应的json解析库，作者仅介绍自己使用过的两种C++的json解析库:jsoncpp(v0.5.0)和<span id="2KSFindDIV">Boost</span>(v1.34.0)。</span></p><h3><a name="t0"></a> 一. 使用jsoncpp解析json</h3><p><span style="font-size: 13px;">Jsoncpp是个跨平台的开源库，首先从</span><a href="http://jsoncpp.sourceforge.net/"><span style="font-size: 13px;">http://jsoncpp.sourceforge.net/</span></a><span style="font-size: 13px;">上下载jsoncpp库源码，我下载的是v0.5.0，压缩包大约107K，解压，在jsoncpp-src-0.5.0/makefiles/vs71目录里找到jsoncpp.sln，用VS2003及以上版本编译，默认生成静态链接库。 在工程中引用，只需要include/json及.lib文件即可。</span></p><p><span style="font-size: 13px;"> 使用JsonCpp前先来熟悉几个主要的类： </span></p><p><span style="font-size: 13px;">Json::Value     可以表示里所有的类型，比如int,string,object,array等，具体应用将会在后边示例中介绍。</span></p><p><span style="font-size: 13px;">Json::Reader   将json文件流或字符串解析到Json::Value, 主要函数有Parse。</span></p><p><span style="font-size: 13px;">Json::Writer    与Json::Reader相反，将Json::Value转化成字符串流，注意它的两个子类：Json::FastWriter和Json::StyleWriter，分别输出不带格式的json和带格式的json。</span></p><p><span style="font-size: 13px;"> 1. 从字符串解析json</span></p><div bg_cpp"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #0000FF; ">int</span>&nbsp;ParseJsonFromString()<br />{<br />&nbsp;&nbsp;<span style="color: #0000FF; ">const</span>&nbsp;<span style="color: #0000FF; ">char</span>*&nbsp;str&nbsp;=&nbsp;"{\"uploadid\":&nbsp;\"UP000000\",\"code\":&nbsp;100,\"msg\":&nbsp;\"\",\"files\":&nbsp;\"\"}";<br /><br />&nbsp;&nbsp;Json::Reader&nbsp;reader;<br />&nbsp;&nbsp;Json::Value&nbsp;root;<br />&nbsp;&nbsp;<span style="color: #0000FF; ">if</span>&nbsp;(reader.parse(str,&nbsp;root))&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;reader将Json字符串解析到root，root将包含Json里所有子元素</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;upload_id&nbsp;=&nbsp;root["uploadid"].asString();&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;访问节点，upload_id&nbsp;=&nbsp;"UP000000"</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;code&nbsp;=&nbsp;root["code"].asInt();&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;访问节点，code&nbsp;=&nbsp;100</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;0;<br />}</div></div><pre name="code"><br /></pre><p><span style="font-size: 13px;">2. 从文件解析json<br /></span></p><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->{<br />&nbsp;&nbsp;&nbsp;&nbsp;"uploadid":&nbsp;"UP000000",<br />&nbsp;&nbsp;&nbsp;&nbsp;"code":&nbsp;"0",<br />&nbsp;&nbsp;&nbsp;&nbsp;"msg":&nbsp;"",<br />&nbsp;&nbsp;&nbsp;&nbsp;"files":<br />&nbsp;&nbsp;&nbsp;&nbsp;[<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"code":&nbsp;"0",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"msg":&nbsp;"",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"filename":&nbsp;"1D_16-35_1.jpg",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"filesize":&nbsp;"196690",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"width":&nbsp;"1024",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"height":&nbsp;"682",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"images":<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;[<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"url":&nbsp;"fmn061/20111118",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"type":&nbsp;"large",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"width":&nbsp;"720",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"height":&nbsp;"479"<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;},<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"url":&nbsp;"fmn061/20111118",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"type":&nbsp;"main",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"width":&nbsp;"200",<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"height":&nbsp;"133"<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;]<br />}</div><div>解析代码：<br /><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #0000FF; ">int</span>&nbsp;ParseJsonFromFile(<span style="color: #0000FF; ">const</span>&nbsp;<span style="color: #0000FF; ">char</span>*&nbsp;filename)<br />{<br />&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;解析json用Json::Reader</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;Json::Reader&nbsp;reader;<br />&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;Json::Value是一种很重要的类型，可以代表任意类型。如int,&nbsp;string,&nbsp;object,&nbsp;array<img src="http://www.cppblog.com/Images/dot.gif"  alt="" /></span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;Json::Value&nbsp;root;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br /><br />&nbsp;&nbsp;std::ifstream&nbsp;<span style="color: #0000FF; ">is</span>;<br />&nbsp;&nbsp;<span style="color: #0000FF; ">is</span>.open&nbsp;(filename,&nbsp;std::ios::binary&nbsp;);&nbsp;&nbsp;<br />&nbsp;&nbsp;<span style="color: #0000FF; ">if</span>&nbsp;(reader.parse(<span style="color: #0000FF; ">is</span>,&nbsp;root))<br />&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;code;<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">if</span>&nbsp;(!root["files"].isNull())&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;访问节点，Access&nbsp;an&nbsp;object&nbsp;value&nbsp;by&nbsp;name,&nbsp;create&nbsp;a&nbsp;null&nbsp;member&nbsp;if&nbsp;it&nbsp;does&nbsp;not&nbsp;exist.</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;code&nbsp;=&nbsp;root["uploadid"].asString();<br />&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;访问节点，Return&nbsp;the&nbsp;member&nbsp;named&nbsp;key&nbsp;if&nbsp;it&nbsp;exist,&nbsp;defaultValue&nbsp;otherwise.</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;code&nbsp;=&nbsp;root.<span style="color: #0000FF; ">get</span>("uploadid",&nbsp;"null").asString();<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;得到"files"的数组个数</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;file_size&nbsp;=&nbsp;root["files"].size();<br /><br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;遍历数组</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">for</span>(<span style="color: #0000FF; ">int</span>&nbsp;i&nbsp;=&nbsp;0;&nbsp;i&nbsp;&lt;&nbsp;file_size;&nbsp;++i)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Json::Value&nbsp;val_image&nbsp;=&nbsp;root["files"][i]["images"];<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;image_size&nbsp;=&nbsp;val_image.size();<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">for</span>(<span style="color: #0000FF; ">int</span>&nbsp;j&nbsp;=&nbsp;0;&nbsp;j&nbsp;&lt;&nbsp;image_size;&nbsp;++j)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;type&nbsp;=&nbsp;val_image[j]["type"].asString();<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;url&nbsp;=&nbsp;val_image[j]["url"].asString();<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">is</span>.close();<br />&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;0;<br />}</div></div><p><span style="font-size: 13px;">3. 在json结构中插入json</span></p><div bg_cpp"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->&nbsp; &nbsp; Json::Value&nbsp;arrayObj;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;构建对象</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;Json::Value&nbsp;new_item,&nbsp;new_item1;<br />&nbsp;&nbsp;&nbsp;&nbsp;new_item["date"]&nbsp;=&nbsp;"2011-12-28";<br />&nbsp;&nbsp;&nbsp;&nbsp;new_item1["time"]&nbsp;=&nbsp;"22:30:36";<br />&nbsp;&nbsp;&nbsp;&nbsp;arrayObj.append(new_item);&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;插入数组成员</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;arrayObj.append(new_item1);&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;插入数组成员</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;file_size&nbsp;=&nbsp;root["files"].size();<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">for</span>(<span style="color: #0000FF; ">int</span>&nbsp;i&nbsp;=&nbsp;0;&nbsp;i&nbsp;&lt;&nbsp;file_size;&nbsp;++i)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;root["files"][i]["exifs"]&nbsp;=&nbsp;arrayObj;&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;插入原json中</span></div></div><p><span style="font-size: 13px;"> 4. 输出json</span></p><div bg_cpp"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;转换为字符串（带格式）</span><span style="color: #008000; "><br /></span>std::<span style="color: #0000FF; ">string</span>&nbsp;<span style="color: #0000FF; ">out</span>&nbsp;=&nbsp;root.toStyledString();<br /><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;输出无格式json字符串</span><span style="color: #008000; "><br /></span>Json::FastWriter&nbsp;writer;<br />std::<span style="color: #0000FF; ">string</span>&nbsp;out2&nbsp;=&nbsp;writer.write(root);</div></div><h3><a name="t1"></a>二. 使用<span id="3KSFindDIV">Boost</span> property_tree解析json</h3><p>property_tree可以解析xml，json，ini，info等格式的数据，用property_tree解析这几种格式使用方法很相似。</p><p>解析json很简单，命名空间为<span id="4KSFindDIV">boost</span>::property_tree，reson_json函数将文件流、字符串解析到ptree，write_json将ptree输出为字符串或文件流。其余的都是对ptree的操作。</p><p>解析json需要加头文件：</p><p>#include &lt;<span id="5KSFindDIV">boost</span>/property_tree/ptree.hpp&gt;<br />#include &lt;<span id="6KSFindDIV">boost</span>/property_tree/json_parser.hpp&gt;</p><p>1. 解析json</p><p>解析一段下面的数据：</p><div bg_html"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />-->{<br />&nbsp;&nbsp;"code":&nbsp;0,<br />&nbsp;&nbsp;"images":<br />&nbsp;&nbsp;[<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"url":&nbsp;"fmn057/20111221/1130/head_kJoO_05d9000251de125c.jpg"<br />&nbsp;&nbsp;&nbsp;&nbsp;},<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"url":&nbsp;"fmn057/20111221/1130/original_kJoO_05d9000251de125c.jpg"<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;]<br />}</div></div><div bg_cpp"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #0000FF; ">int</span>&nbsp;ParseJson()<br />{<br />&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;str&nbsp;=&nbsp;"{\"code\":0,\"images\":[{\"url\":\"fmn057/20111221/1130/head_kJoO_05d9000251de125c.jpg\"},{\"url\":\"fmn057/20111221/1130/original_kJoO_05d9000251de125c.jpg\"}]}";<br />&nbsp;&nbsp;<span style="color: #0000FF; ">using</span>&nbsp;<span style="color: #0000FF; ">namespace</span>&nbsp;boost::property_tree;<br /><br />&nbsp;&nbsp;std::stringstream&nbsp;ss(str);<br />&nbsp;&nbsp;ptree&nbsp;pt;<br />&nbsp;&nbsp;<span style="color: #0000FF; ">try</span>{&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;read_json(ss,&nbsp;pt);<br />&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">catch</span>(ptree_error&nbsp;&amp;&nbsp;e)&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;1;&nbsp;<br />&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;<span style="color: #0000FF; ">try</span>{<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">int</span>&nbsp;code&nbsp;=&nbsp;pt.<span style="color: #0000FF; ">get</span>&lt;<span style="color: #0000FF; ">int</span>&gt;("code");&nbsp;&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;得到"code"的value</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;ptree&nbsp;image_array&nbsp;=&nbsp;pt.get_child("images");&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;get_child得到数组对象<br />&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;</span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;遍历数组</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;&nbsp;&nbsp;BOOST_FOREACH(boost::property_tree::ptree::value_type&nbsp;&amp;v,&nbsp;image_array)<br />&nbsp;&nbsp;&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::stringstream&nbsp;s;<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;write_json(s,&nbsp;v.second);<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;image_item&nbsp;=&nbsp;s.str();<br />&nbsp;&nbsp;&nbsp;&nbsp;}<br />&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">catch</span>&nbsp;(ptree_error&nbsp;&amp;&nbsp;e)<br />&nbsp;&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;2;<br />&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;0;<br />}</div></div><p>2. 构造json</p><div bg_cpp"=""><div style="background-color:#eeeeee;font-size:13px;border:1px solid #CCCCCC;padding-right: 5px;padding-bottom: 4px;padding-left: 4px;padding-top: 4px;width: 98%;word-break:break-all"><!--<br /><br />Code highlighting produced by Actipro CodeHighlighter (freeware)<br />http://www.CodeHighlighter.com/<br /><br />--><span style="color: #0000FF; ">int</span>&nbsp;InsertJson()<br />{<br />&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;str&nbsp;=&nbsp;"{\"code\":0,\"images\":[{\"url\":\"fmn057/20111221/1130/head_kJoO_05d9000251de125c.jpg\"},{\"url\":\"fmn057/20111221/1130/original_kJoO_05d9000251de125c.jpg\"}]}";<br />&nbsp;&nbsp;<span style="color: #0000FF; ">using</span>&nbsp;<span style="color: #0000FF; ">namespace</span>&nbsp;boost::property_tree;<br /><br />&nbsp;&nbsp;std::stringstream&nbsp;ss(str);<br />&nbsp;&nbsp;ptree&nbsp;pt;<br />&nbsp;&nbsp;<span style="color: #0000FF; ">try</span>{&nbsp;&nbsp;&nbsp;&nbsp;<br />&nbsp;&nbsp;&nbsp;&nbsp;read_json(ss,&nbsp;pt);<br />&nbsp;&nbsp;}<br />&nbsp;&nbsp;<span style="color: #0000FF; ">catch</span>(ptree_error&nbsp;&amp;&nbsp;e)&nbsp;{<br />&nbsp;&nbsp;&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;1;&nbsp;<br />&nbsp;&nbsp;}<br /><br />&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;修改/增加一个key-value，key不存在则增加</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;pt.put("upid",&nbsp;"00001");<br /><br />&nbsp;&nbsp;<span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;插入一个数组</span><span style="color: #008000; "><br /></span>&nbsp;&nbsp;ptree&nbsp;exif_array;<br />&nbsp;&nbsp;ptree&nbsp;array1,&nbsp;array2,&nbsp;array3;<br />&nbsp;&nbsp;array1.put("Make",&nbsp;"NIKON");<br />&nbsp;&nbsp;array2.put("DateTime",&nbsp;"2011:05:31&nbsp;06:47:09");<br />&nbsp;&nbsp;array3.put("Software",&nbsp;"Ver.1.01");<br />&nbsp;&nbsp;exif_array.push_back(std::make_pair("",&nbsp;array1));<br />&nbsp;&nbsp;exif_array.push_back(std::make_pair("",&nbsp;array2));<br />&nbsp;&nbsp;exif_array.push_back(std::make_pair("",&nbsp;array3));<br /><br /><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;&nbsp;&nbsp;exif_array.push_back(std::make_pair("Make",&nbsp;"NIKON"));<br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;&nbsp;&nbsp;exif_array.push_back(std::make_pair("DateTime",&nbsp;"2011:05:31&nbsp;06:47:09"));<br /></span><span style="color: #008000; ">//</span><span style="color: #008000; ">&nbsp;&nbsp;&nbsp;exif_array.push_back(std::make_pair("Software",&nbsp;"Ver.1.01"));</span><span style="color: #008000; "><br /></span><br />&nbsp;&nbsp;pt.put_child("exifs",&nbsp;exif_array);<br />&nbsp;&nbsp;std::stringstream&nbsp;s2;<br />&nbsp;&nbsp;write_json(s2,&nbsp;pt);<br />&nbsp;&nbsp;std::<span style="color: #0000FF; ">string</span>&nbsp;outstr&nbsp;=&nbsp;s2.str();<br /><br />&nbsp;&nbsp;<span style="color: #0000FF; ">return</span>&nbsp;0;<br />}</div></div><p> </p><h3><a name="t2"></a>三. 两种解析库的使用经验</h3><p>1. 用<span id="15KSFindDIV">boost</span>::property_tree解析字符串遇到"\/"时解析失败，而jsoncpp可以解析成功，要知道'/'前面加一个'\'是JSON标准格式。</p><p>2. <span id="16KSFindDIV">boost</span>::property_tree的read_json和write_json在多线程中使用会引起崩溃。</p><p>针对1，可以在使用<span id="17KSFindDIV">boost</span>::property_tree解析前写个函数去掉"\/"中的'\'，针对2，在多线程中同步一下可以解决。</p><p>我的使用心得：使用<span id="18KSFindDIV">boost</span>::property_tree不仅可以解析json，还可以解析xml，info等格式的数据。对于解析json，使用<span id="19KSFindDIV">boost</span>::property_tree解析还可以忍受，但解析xml，由于遇到问题太多只能换其它库了。</p><img src ="http://www.cppblog.com/noflybird/aggbug/207100.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-05-26 00:36 <a href="http://www.cppblog.com/noflybird/archive/2014/05/26/207100.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]字符串匹配的Boyer-Moore算法</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206080.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:47:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206080.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206080.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206080.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206080.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206080.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">上一篇文章，我介绍了<a href="http://www.ruanyifeng.com/blog/2013/05/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">KMP算法</a>。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">但是，它并不是效率最高的算法，实际采用并不多。各种文本编辑器的"查找"功能（Ctrl+F），大多采用<a href="http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Boyer-Moore算法</a>。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050301.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">Boyer-Moore算法不仅效率高，而且构思巧妙，容易理解。1977年，德克萨斯大学的Robert S. Boyer教授和J Strother Moore教授发明了这种算法。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">下面，我根据Moore教授自己的<a href="http://www.cs.utexas.edu/~moore/best-ideas/string-searching/fstrpos-example.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">例子</a>来解释这种算法。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">1.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050302.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">假定字符串为"HERE IS A SIMPLE EXAMPLE"，搜索词为"EXAMPLE"。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">2.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050303.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">首先，"字符串"与"搜索词"头部对齐，从尾部开始比较。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这是一个很聪明的想法，因为如果尾部字符不匹配，那么只要一次比较，就可以知道前7个字符（整体上）肯定不是要找的结果。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">我们看到，"S"与"E"不匹配。这时，<span style="font-weight: 800;">"S"就被称为"坏字符"（bad character），即不匹配的字符。</span>我们还发现，"S"不包含在搜索词"EXAMPLE"之中，这意味着可以把搜索词直接移到"S"的后一位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">3.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050304.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">依然从尾部开始比较，发现"P"与"E"不匹配，所以"P"是"坏字符"。但是，"P"包含在搜索词"EXAMPLE"之中。所以，将搜索词后移两位，两个"P"对齐。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">4.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050305.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">我们由此总结出<span style="font-weight: 800;">"坏字符规则"</span>：</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　后移位数 = 坏字符的位置 - 搜索词中的上一次出现位置</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">如果"坏字符"不包含在搜索词之中，则上一次出现位置为 -1。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">以"P"为例，它作为"坏字符"，出现在搜索词的第6位（从0开始编号），在搜索词中的上一次出现位置为4，所以后移 6 - 4 = 2位。再以前面第二步的"S"为例，它出现在第6位，上一次出现位置是 -1（即未出现），则整个搜索词后移 6 - (-1) = 7位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">5.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050306.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">依然从尾部开始比较，"E"与"E"匹配。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">6.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050307.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">比较前面一位，"LE"与"LE"匹配。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">7.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050308.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">比较前面一位，"PLE"与"PLE"匹配。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">8.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050309.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">比较前面一位，"MPLE"与"MPLE"匹配。<span style="font-weight: 800;">我们把这种情况称为"好后缀"（good suffix），即所有尾部匹配的字符串。</span>注意，"MPLE"、"PLE"、"LE"、"E"都是好后缀。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">9.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050310.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">比较前一位，发现"I"与"A"不匹配。所以，"I"是"坏字符"。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">10.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050311.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">根据"坏字符规则"，此时搜索词应该后移 2 - （-1）= 3 位。问题是，此时有没有更好的移法？</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">11.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050309.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">我们知道，此时存在"好后缀"。所以，可以采用<span style="font-weight: 800;">"好后缀规则"</span>：</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　后移位数 = 好后缀的位置 - 搜索词中的上一次出现位置</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">举例来说，如果字符串"ABCDAB"的后一个"AB"是"好后缀"。那么它的位置是5（从0开始计算，取最后的"B"的值），在"搜索词中的上一次出现位置"是1（第一个"B"的位置），所以后移 5 - 1 = 4位，前一个"AB"移到后一个"AB"的位置。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">再举一个例子，如果字符串"ABCDEF"的"EF"是好后缀，则"EF"的位置是5 ，上一次出现的位置是 -1（即未出现），所以后移 5 - (-1) = 6位，即整个字符串移到"F"的后一位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这个规则有三个注意点：</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　（1）"好后缀"的位置以最后一个字符为准。假定"ABCDEF"的"EF"是好后缀，则它的位置以"F"为准，即5（从0开始计算）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　（2）如果"好后缀"在搜索词中只出现一次，则它的上一次出现位置为 -1。比如，"EF"在"ABCDEF"之中只出现一次，则它的上一次出现位置为-1（即未出现）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　（3）如果"好后缀"有多个，则除了最长的那个"好后缀"，其他"好后缀"的上一次出现位置必须在头部。比如，假定"BABCDAB"的"好后缀"是"DAB"、"AB"、"B"，请问这时"好后缀"的上一次出现位置是什么？回答是，此时采用的好后缀是"B"，它的上一次出现位置是头部，即第0位。这个规则也可以这样表达：如果最长的那个"好后缀"只出现一次，则可以把搜索词改写成如下形式进行位置计算"(DA)BABCDAB"，即虚拟加入最前面的"DA"。</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">回到上文的这个例子。此时，所有的"好后缀"（MPLE、PLE、LE、E）之中，只有"E"在"EXAMPLE"还出现在头部，所以后移 6 - 0 = 6位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">12.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050312.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">可以看到，"坏字符规则"只能移3位，"好后缀规则"可以移6位。所以，<span style="font-weight: 800;">Boyer-Moore算法的基本思想是，每次后移这两个规则之中的较大值。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">更巧妙的是，这两个规则的移动位数，只与搜索词有关，与原字符串无关。因此，可以预先计算生成《坏字符规则表》和《好后缀规则表》。使用时，只要查表比较一下就可以了。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">13.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050313.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">继续从尾部开始比较，"P"与"E"不匹配，因此"P"是"坏字符"。根据"坏字符规则"，后移 6 - 4 = 2位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">14.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050314.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">从尾部开始逐位比较，发现全部匹配，于是搜索结束。如果还要继续查找（即找出全部匹配），则根据"好后缀规则"，后移 6 - 0 = 6位，即头部的"E"移到尾部的"E"的位置。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206080.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:47 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206080.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]字符串匹配的KMP算法</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206079.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:46:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206079.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206079.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206079.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206079.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206079.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><a href="http://en.wikipedia.org/wiki/String_searching_algorithm" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">字符串匹配</a>是计算机的基本任务之一。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">举例来说，有一个字符串"BBC ABCDAB ABCDABCDABDE"，我想知道，里面是否包含另一个字符串"ABCDABD"？</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050101.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">许多算法可以完成这个任务，<a href="http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Knuth-Morris-Pratt算法</a>（简称KMP）是最常用的之一。它以三个发明者命名，起头的那个K就是著名科学家Donald Knuth。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050102.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这种算法不太容易理解，网上有很多<a href="http://www.google.com/search?q=Knuth-Morris-Pratt+algorithm" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">解释</a>，但读起来都很费劲。直到读到<a href="http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Jake Boxer</a>的文章，我才真正理解这种算法。下面，我用自己的语言，试图写一篇比较好懂的KMP算法解释。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">1.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050103.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">首先，字符串"BBC ABCDAB ABCDABCDABDE"的第一个字符与搜索词"ABCDABD"的第一个字符，进行比较。因为B与A不匹配，所以搜索词后移一位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">2.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050104.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">因为B与A不匹配，搜索词再往后移。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">3.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050105.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">就这样，直到字符串有一个字符，与搜索词的第一个字符相同为止。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">4.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050106.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">接着比较字符串和搜索词的下一个字符，还是相同。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">5.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050107.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">直到字符串有一个字符，与搜索词对应的字符不相同为止。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">6.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050108.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这时，最自然的反应是，将搜索词整个后移一位，再从头逐个比较。这样做虽然可行，但是效率很差，因为你要把"搜索位置"移到已经比较过的位置，重比一遍。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">7.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050107.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">一个基本事实是，当空格与D不匹配时，你其实知道前面六个字符是"ABCDAB"。KMP算法的想法是，设法利用这个已知信息，不要把"搜索位置"移回已经比较过的位置，继续把它向后移，这样就提高了效率。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">8.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050109.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">怎么做到这一点呢？可以针对搜索词，算出一张《部分匹配表》（Partial Match Table）。这张表是如何产生的，后面再介绍，这里只要会用就可以了。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">9.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050107.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">已知空格与D不匹配时，前面六个字符"ABCDAB"是匹配的。查表可知，最后一个匹配字符B对应的"部分匹配值"为2，因此按照下面的公式算出向后移动的位数：</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　移动位数 = 已匹配的字符数 - 对应的部分匹配值</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">因为 6 - 2 等于4，所以将搜索词向后移动4位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">10.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050110.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">因为空格与Ｃ不匹配，搜索词还要继续往后移。这时，已匹配的字符数为2（"AB"），对应的"部分匹配值"为0。所以，移动位数 = 2 - 0，结果为 2，于是将搜索词向后移2位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">11.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050111.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">因为空格与A不匹配，继续后移一位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">12.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050112.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">逐位比较，直到发现C与D不匹配。于是，移动位数 = 6 - 2，继续将搜索词向后移动4位。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">13.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050113.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">逐位比较，直到搜索词的最后一位，发现完全匹配，于是搜索完成。如果还要继续搜索（即找出全部匹配），移动位数 = 7 - 0，再将搜索词向后移动7位，这里就不再重复了。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">14.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050114.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">下面介绍《部分匹配表》是如何产生的。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">首先，要了解两个概念："前缀"和"后缀"。 "前缀"指除了最后一个字符以外，一个字符串的全部头部组合；"后缀"指除了第一个字符以外，一个字符串的全部尾部组合。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">15.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050109.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">"部分匹配值"就是"前缀"和"后缀"的最长的共有元素的长度。以"ABCDABD"为例，</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"A"的前缀和后缀都为空集，共有元素的长度为0；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"AB"的前缀为[A]，后缀为[B]，共有元素的长度为0；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"ABC"的前缀为[A, AB]，后缀为[BC, C]，共有元素的长度0；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"ABCD"的前缀为[A, AB, ABC]，后缀为[BCD, CD, D]，共有元素的长度为0；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"ABCDA"的前缀为[A, AB, ABC, ABCD]，后缀为[BCDA, CDA, DA, A]，共有元素为"A"，长度为1；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"ABCDAB"的前缀为[A, AB, ABC, ABCD, ABCDA]，后缀为[BCDAB, CDAB, DAB, AB, B]，共有元素为"AB"，长度为2；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　－　"ABCDABD"的前缀为[A, AB, ABC, ABCD, ABCDA, ABCDAB]，后缀为[BCDABD, CDABD, DABD, ABD, BD, D]，共有元素的长度为0。</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">16.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201305/bg2013050112.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">"部分匹配"的实质是，有时候，字符串头部和尾部会有重复。比如，"ABCDAB"之中有两个"AB"，那么它的"部分匹配值"就是2（"AB"的长度）。搜索词移动的时候，第一个"AB"向后移动4位（字符串长度-部分匹配值），就可以来到第二个"AB"的位置。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206079.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:46 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206079.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]进程与线程的一个简单解释</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206078.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:44:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206078.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206078.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206078.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206078.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206078.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><a href="https://zh.wikipedia.org/zh-cn/%E8%BF%9B%E7%A8%8B" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">进程</a>（process）和<a href="https://zh.wikipedia.org/zh-cn/%E7%BA%BF%E7%A8%8B" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">线程</a>（thread）是操作系统的基本概念，但是它们比较抽象，不容易掌握。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">最近，我读到一篇<a href="http://www.qnx.com/developers/docs/6.4.1/neutrino/getting_started/s1_procs.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">材料</a>，发现有一个很好的类比，可以把它们解释地清晰易懂。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">1.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042401.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">计算机的核心是CPU，它承担了所有的计算任务。它就像一座工厂，时刻在运行。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">2.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042402.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">假定工厂的电力有限，一次只能供给一个车间使用。也就是说，一个车间开工的时候，其他车间都必须停工。背后的含义就是，单个CPU一次只能运行一个任务。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">3.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042403.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">进程就好比工厂的车间，它代表CPU所能处理的单个任务。任一时刻，CPU总是运行一个进程，其他进程处于非运行状态。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">4.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042404.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">一个车间里，可以有很多工人。他们协同完成一个任务。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">5.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042405.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">线程就好比车间里的工人。一个进程可以包括多个线程。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">6.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042406.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">车间的空间是工人们共享的，比如许多房间是每个工人都可以进出的。这象征一个进程的内存空间是共享的，每个线程都可以使用这些共享内存。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">7.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042407.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">可是，每间房间的大小不同，有些房间最多只能容纳一个人，比如厕所。里面有人的时候，其他人就不能进去了。这代表一个线程使用某些共享内存时，其他线程必须等它结束，才能使用这一块内存。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">8.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042408.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">一个防止他人进入的简单方法，就是门口加一把锁。先到的人锁上门，后到的人看到上锁，就在门口排队，等锁打开再进去。这就叫<a href="http://zh.wikipedia.org/wiki/%E4%BA%92%E6%96%A5%E9%94%81" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">"互斥锁"</a>（Mutual exclusion，缩写 Mutex），防止多个线程同时读写某一块内存区域。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">9.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042409.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">还有些房间，可以同时容纳n个人，比如厨房。也就是说，如果人数大于n，多出来的人只能在外面等着。这好比某些内存区域，只能供给固定数目的线程使用。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">10.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042410.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这时的解决方法，就是在门口挂n把钥匙。进去的人就取一把钥匙，出来时再把钥匙挂回原处。后到的人发现钥匙架空了，就知道必须在门口排队等着了。这种做法叫做<a href="http://en.wikipedia.org/wiki/Semaphore_(programming)" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">"信号量"</a>（Semaphore），用来保证多个线程不会互相冲突。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">不难看出，mutex是semaphore的一种特殊情况（n=1时）。也就是说，完全可以用后者替代前者。但是，因为mutex较为简单，且效率高，所以在必须保证资源独占的情况下，还是采用这种设计。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">11.</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201304/bg2013042411.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">操作系统的设计，因此可以归结为三点：</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">（1）以多进程形式，允许多个任务同时运行；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">（2）以多线程形式，允许单个任务分成不同的部分运行；</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">（3）提供协调机制，一方面防止进程之间和线程之间产生冲突，另一方面允许进程之间和线程之间共享资源。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206078.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:44 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206078.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]相似图片搜索的原理（二）</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206077.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:42:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206077.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206077.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206077.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206077.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206077.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">昨天，我在<a href="http://www.isnowfy.com/similar-image-search/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">isnowfy</a>的网站看到，还有其他两种方法也很简单，这里做一些笔记。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013033102.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">一、颜色分布法</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">每张图片都可以生成<a href="http://en.wikipedia.org/wiki/Color_histogram" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">颜色分布的直方图</a>（color histogram）。如果两张图片的直方图很接近，就可以认为它们很相似。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013033103.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">任何一种颜色都是由红绿蓝三原色（RGB）构成的，所以上图共有4张直方图（三原色直方图 + 最后合成的直方图）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">如果每种原色都可以取256个值，那么整个颜色空间共有1600万种颜色（256的三次方）。针对这1600万种颜色比较直方图，计算量实在太大了，因此需要采用简化方法。可以将0～255分成四个区：0～63为第0区，64～127为第1区，128～191为第2区，192～255为第3区。这意味着红绿蓝分别有4个区，总共可以构成64种组合（4的3次方）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">任何一种颜色必然属于这64种组合中的一种，这样就可以统计每一种组合包含的像素数量。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013033105.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">上图是某张图片的颜色分布表，将表中最后一栏提取出来，组成一个64维向量(7414, 230, 0, 0, 8, ..., 109, 0, 0, 3415, 53929)。这个向量就是这张图片的特征值或者叫"指纹"。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">于是，寻找相似图片就变成了找出与其最相似的向量。这可以用<a href="http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">皮尔逊相关系数</a>或者<a href="http://www.ruanyifeng.com/blog/2013/03/cosine_similarity.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">余弦相似度</a>算出。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">二、内容特征法</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">除了颜色构成，还可以从比较图片内容的相似性入手。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">首先，将原图转成一张较小的灰度图片，假定为50x50像素。然后，确定一个阈值，将灰度图片转成黑白图片。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013033106.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" />&nbsp;<img src="http://image.beekka.com/blog/201303/bg2013033108.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" />&nbsp;<img src="http://image.beekka.com/blog/201303/bg2013033107.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">如果两张图片很相似，它们的黑白轮廓应该是相近的。于是，问题就变成了，第一步如何确定一个合理的阈值，正确呈现照片中的轮廓？</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">显然，前景色与背景色反差越大，轮廓就越明显。这意味着，如果我们找到一个值，可以使得前景色和背景色各自的"类内差异最小"（minimizing the intra-class variance），或者"类间差异最大"（maximizing the inter-class variance），那么这个值就是理想的阈值。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">1979年，日本学者大津展之证明了，"类内差异最小"与"类间差异最大"是同一件事，即对应同一个阈值。他提出一种简单的算法，可以求出这个阈值，这被称为<a href="http://en.wikipedia.org/wiki/Otsu's_method" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">"大津法"</a>（Otsu's method）。下面就是他的计算方法。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">假定一张图片共有n个像素，其中灰度值小于阈值的像素为 n1 个，大于等于阈值的像素为 n2 个（ n1 + n2 = n ）。w1 和 w2 表示这两种像素各自的比重。</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　w1 = n1 / n</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　w2 = n2 / n</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">再假定，所有灰度值小于阈值的像素的平均值和方差分别为 &#956;1 和 &#963;1，所有灰度值大于等于阈值的像素的平均值和方差分别为 &#956;2 和 &#963;2。于是，可以得到</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　类内差异 = w1(&#963;1的平方) + w2(&#963;2的平方)</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　类间差异 = w1w2(&#956;1-&#956;2)^2</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">可以证明，这两个式子是等价的：得到"类内差异"的最小值，等同于得到"类间差异"的最大值。不过，从计算难度看，后者的计算要容易一些。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">下一步用"穷举法"，将阈值从灰度的最低值到最高值，依次取一遍，分别代入上面的算式。使得"类内差异最小"或"类间差异最大"的那个值，就是最终的阈值。具体的实例和Java算法，请看<a href="http://www.labbookpages.co.uk/software/imgProc/otsuThreshold.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">这里</a>。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013033109.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">有了50x50像素的黑白缩略图，就等于有了一个50x50的0-1矩阵。矩阵的每个值对应原图的一个像素，0表示黑色，1表示白色。这个矩阵就是一张图片的特征矩阵。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">两个特征矩阵的不同之处越少，就代表两张图片越相似。这可以用"异或运算"实现（即两个值之中只有一个为1，则运算结果为1，否则运算结果为0）。对不同图片的特征矩阵进行"异或运算"，结果中的1越少，就是越相似的图片。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206077.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:42 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206077.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]相似图片搜索的原理（一）</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206076.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:42:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206076.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206076.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206076.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206076.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206076.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">上个月，Google把<a href="http://www.google.com/insidesearch/searchbyimage.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">"相似图片搜索"</a>正式放上了首页。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">你可以用一张图片，搜索互联网上所有与它相似的图片。点击<a href="http://images.google.com.hk/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">搜索框</a>中照相机的图标。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><a href="http://images.google.com.hk/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;"><img src="http://image.beekka.com/blog/201107/bg2011072101.png" style="margin: 0px; padding: 0px; list-style-type: none; text-decoration: none; border: 0.7em solid #e0dfcc; color: #111111; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></a></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">一个对话框会出现。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072102.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">你输入网片的网址，或者直接上传图片，Google就会找出与其相似的图片。下面这张图片是美国女演员Alyson Hannigan。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072103.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">上传后，Google返回如下结果：</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072104.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">类似的"相似图片搜索引擎"还有不少，<a href="http://www.tineye.com/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">TinEye</a>甚至可以找出照片的拍摄背景。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072105.jpg" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">==========================================================</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这种技术的原理是什么？计算机怎么知道两张图片相似呢？</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">根据<a href="http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Neal Krawetz</a>博士的解释，原理非常简单易懂。我们可以用一个快速算法，就达到基本的效果。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这里的关键技术叫做"感知哈希算法"（Perceptual hash algorithm），它的作用是对每张图片生成一个"指纹"（fingerprint）字符串，然后比较不同图片的指纹。结果越接近，就说明图片越相似。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">下面是一个最简单的实现：</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">第一步，缩小尺寸。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">将图片缩小到8x8的尺寸，总共64个像素。这一步的作用是去除图片的细节，只保留结构、明暗等基本信息，摒弃不同尺寸、比例带来的图片差异。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072107.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" />&nbsp;<img src="http://image.beekka.com/blog/201107/bg2011072107.png" width="64" height="64" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">第二步，简化色彩。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">将缩小后的图片，转为64级灰度。也就是说，所有像素点总共只有64种颜色。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">第三步，计算平均值。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">计算所有64个像素的灰度平均值。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">第四步，比较像素的灰度。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">将每个像素的灰度，与平均值进行比较。大于或等于平均值，记为1；小于平均值，记为0。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><span style="font-weight: 800;">第五步，计算哈希值。</span></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">将上一步的比较结果，组合在一起，就构成了一个64位的整数，这就是这张图片的指纹。组合的次序并不重要，只要保证所有图片都采用同样次序就行了。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201107/bg2011072109.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" />&nbsp;=&nbsp;<img src="http://image.beekka.com/blog/201107/bg2011072109.png" width="64" height="64" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" />&nbsp;= 8f373714acfcf4d0</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">得到指纹以后，就可以对比不同的图片，看看64位中有多少位是不一样的。在理论上，这等同于计算<a href="http://zh.wikipedia.org/wiki/%E6%B1%89%E6%98%8E%E8%B7%9D%E7%A6%BB" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">"汉明距离"</a>（Hamming distance）。如果不相同的数据位不超过5，就说明两张图片很相似；如果大于10，就说明这是两张不同的图片。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">具体的代码实现，可以参见<a href="http://www.reddit.com/r/programming/comments/hql8b/looks_like_it_for_the_last_few_months_i_have_had/c1xkcdd" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Wote</a>用python语言写的<a href="http://www.ruanyifeng.com/blog/2011/07/imgHash.txt" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">imgHash.py</a>。代码很短，只有53行。使用的时候，第一个参数是基准图片，第二个参数是用来比较的其他图片所在的目录，返回结果是两张图片之间不相同的数据位数量（汉明距离）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这种算法的优点是简单快速，不受图片大小缩放的影响，缺点是图片的内容不能变更。如果在图片上加几个文字，它就认不出来了。所以，它的最佳用途是根据缩略图，找出原图。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">实际应用中，往往采用更强大的<a href="http://www.phash.org/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">pHash</a>算法和<a href="http://en.wikipedia.org/wiki/Scale-invariant_feature_transform" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">SIFT</a>算法，它们能够识别图片的变形。只要变形程度不超过25%，它们就能匹配原图。这些算法虽然更复杂，但是原理与上面的简便算法是一样的，就是先将图片转化成Hash字符串，然后再进行比较。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206076.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:42 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206076.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item><item><title>[转]TF-IDF与余弦相似性的应用（三）：自动摘要</title><link>http://www.cppblog.com/noflybird/archive/2014/03/06/206075.html</link><dc:creator>不会飞的鸟</dc:creator><author>不会飞的鸟</author><pubDate>Thu, 06 Mar 2014 13:37:00 GMT</pubDate><guid>http://www.cppblog.com/noflybird/archive/2014/03/06/206075.html</guid><wfw:comment>http://www.cppblog.com/noflybird/comments/206075.html</wfw:comment><comments>http://www.cppblog.com/noflybird/archive/2014/03/06/206075.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/noflybird/comments/commentRss/206075.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/noflybird/services/trackbacks/206075.html</trackback:ping><description><![CDATA[<p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">有时候，很简单的数学方法，就可以完成很复杂的任务。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这个系列的前两部分就是很好的例子。仅仅依靠统计词频，就能找出<a href="http://www.ruanyifeng.com/blog/2013/03/tf-idf.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">关键词</a>和<a href="http://www.ruanyifeng.com/blog/2013/03/cosine_similarity.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">相似文章</a>。虽然它们算不上效果最好的方法，但肯定是最简便易行的方法。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">今天，依然继续这个主题。讨论如何通过词频，对文章进行<a href="http://en.wikipedia.org/wiki/Automatic_summarization" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">自动摘要</a>（Automatic summarization）。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013032501.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">如果能从3000字的文章，提炼出150字的摘要，就可以为读者节省大量阅读时间。由人完成的摘要叫"人工摘要"，由机器完成的就叫"自动摘要"。许多网站都需要它，比如论文网站、新闻网站、搜索引擎等等。2007年，美国学者的论文<a href="http://www.cs.cmu.edu/~nasmith/LS2/das-martins.07.pdf" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">《A Survey on Automatic Text Summarization》</a>（Dipanjan Das, Andre F.T. Martins, 2007）总结了目前的自动摘要算法。其中，很重要的一种就是词频统计。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">这种方法最早出自1958年的IBM公司科学家<a href="http://en.wikipedia.org/wiki/Hans_Peter_Luhn" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">H.P. Luhn</a>的论文<a href="http://www.di.ubi.pt/~jpaulo/competence/general/(1958)Luhn.pdf" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">《The Automatic Creation of Literature Abstracts》</a>。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">Luhn博士认为，文章的信息都包含在句子中，有些句子包含的信息多，有些句子包含的信息少。"自动摘要"就是要找出那些包含信息最多的句子。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">句子的信息量用"关键词"来衡量。如果包含的关键词越多，就说明这个句子越重要。Luhn提出用"簇"（cluster）表示关键词的聚集。所谓"簇"就是包含多个关键词的句子片段。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013032502.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">上图就是Luhn原始论文的插图，被框起来的部分就是一个"簇"。只要关键词之间的距离小于"门槛值"，它们就被认为处于同一个簇之中。Luhn建议的门槛值是4或5。也就是说，如果两个关键词之间有5个以上的其他词，就可以把这两个关键词分在两个簇。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">下一步，对于每个簇，都计算它的重要性分值。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;"><img src="http://image.beekka.com/blog/201303/bg2013032503.png" style="margin: 0px; padding: 0px; list-style-type: none; border: 0.7em solid #e0dfcc; border-top-left-radius: 0.7em; border-top-right-radius: 0.7em; border-bottom-right-radius: 0.7em; border-bottom-left-radius: 0.7em;"  alt="" /></p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">以前图为例，其中的簇一共有7个词，其中4个是关键词。因此，它的重要性分值等于 ( 4 x 4 ) / 7 = 2.3。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">然后，找出包含分值最高的簇的句子（比如5句），把它们合在一起，就构成了这篇文章的自动摘要。具体实现可以参见<a href="http://www.amazon.com/Mining-Social-Web-Analyzing-Facebook/dp/1449388345" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">《Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites》</a>（O'Reilly, 2011）一书的第8章，python代码见<a href="https://github.com/ptwobrussell/Mining-the-Social-Web/blob/master/python_code/blogs_and_nlp__summarize.py" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">github</a>。</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">Luhn的这种算法后来被简化，不再区分"簇"，只考虑句子包含的关键词。下面就是一个例子（采用伪码表示），只考虑关键词首先出现的句子。</p><blockquote style="margin: 2em; padding: 1em; list-style-type: none; border-width: 1em; border-color: #e0dfcc; color: #111111; background-color: #f5f2f0; font-family: Consolas, Monaco, 'Andale Mono', monospace; border-top-left-radius: 20px; border-top-right-radius: 20px; border-bottom-right-radius: 20px; border-bottom-left-radius: 20px; text-shadow: white 0px 1px; font-size: 12px; line-height: 21.59375px; word-spacing: 2px;"><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　Summarizer(originalText, maxSummarySize):</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 计算原始文本的词频，生成一个数组，比如[(10,'the'), (3,'language'), (8,'code')...]<br />　　　　wordFrequences = getWordCounts(originalText)</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 过滤掉停用词，数组变成[(3, 'language'), (8, 'code')...]<br />　　　　contentWordFrequences = filtStopWords(wordFrequences)</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 按照词频进行排序，数组变成['code', 'language'...]<br />　　　　contentWordsSortbyFreq = sortByFreqThenDropFreq(contentWordFrequences)</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 将文章分成句子<br />　　　　sentences = getSentences(originalText)</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 选择关键词首先出现的句子<br />　　　　setSummarySentences = {}<br />　　　　foreach word in contentWordsSortbyFreq:<br />　　　　　　firstMatchingSentence = search(sentences, word)<br />　　　　　　setSummarySentences.add(firstMatchingSentence)<br />　　　　　　if setSummarySentences.size() = maxSummarySize:<br />　　　　　　　　break</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　// 将选中的句子按照出现顺序，组成摘要<br />　　　　summary = ""<br />　　　　foreach sentence in sentences:<br />　　　　　　if sentence in setSummarySentences:<br />　　　　　　　　summary = summary + " " + sentence</p><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; font-size: 1.6em; line-height: 28px;">　　　　return summary</p></blockquote><p style="margin: 1em 0px 0px 0.8em; padding: 0px; list-style-type: none; border: none; color: #111111; font-size: 1.6em; line-height: 28px; font-family: Georgia, serif; word-spacing: 2px; background-color: #f5f5d5;">类似的算法已经被写成了工具，比如基于Java的<a href="http://classifier4j.sourceforge.net/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">Classifier4J</a>库的<a href="http://classifier4j.sourceforge.net/subprojects/core/apidocs/net/sf/classifier4J/summariser/SimpleSummariser.html" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">SimpleSummariser</a>模块、基于C语言的<a href="http://libots.sourceforge.net/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">OTS</a>库、以及基于classifier4J的<a href="http://nclassifier.sourceforge.net/" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">C#实现</a>和<a href="https://groups.google.com/forum/?fromgroups#!topic/nltk-dev/qV9e5TsCBHg" target="_blank" style="margin: 0px; padding: 0px; list-style-type: none; border: none; color: #112233;">python实现</a>。</p><img src ="http://www.cppblog.com/noflybird/aggbug/206075.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/noflybird/" target="_blank">不会飞的鸟</a> 2014-03-06 21:37 <a href="http://www.cppblog.com/noflybird/archive/2014/03/06/206075.html#Feedback" target="_blank" style="text-decoration:none;">发表评论</a></div>]]></description></item></channel></rss>