<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-学着站在巨人的肩膀上</title><link>http://www.cppblog.com/jrckkyy/</link><description>Financial mathematics, InformationSearch, Compiler, OS</description><language>zh-cn</language><lastBuildDate>Thu, 09 Apr 2026 05:21:37 GMT</lastBuildDate><pubDate>Thu, 09 Apr 2026 05:21:37 GMT</pubDate><ttl>60</ttl><item><title>Setting up a Python, Django, MySQL and memcache development environment on Windows</title><link>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109756.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Mon, 15 Mar 2010 11:25:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109756.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/109756.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109756.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/109756.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/109756.html</trackback:ping><description><![CDATA[<p>A note for the record: the <a href="http://writeblog.csdn.net/jrckkyy/blog/item/59dbe1438746441a9213c6c0.html"><u><font color=#0000ff>[Distributed Cross-Platform Monitoring System</font></u></a>] inevitably needs its environment set up on Windows as well. Linux has one-step installers such as apt, zypper and yum, but on Windows the setup can sometimes be a problem.</p>
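<p>1. Install the Windows build of Python 2.5 into D:\Python25 and add D:\Python25 to the PATH environment variable.</p>
<p>2. Download Django and run python setup.py install in the Django directory. The installer locates Python through PATH and places the Django libraries under D:\Python25\Lib.</p>
<p>3. Install setuptools-0.6c11.win32-py2.5.rar; easy_install.exe then appears in D:\Python25\Scripts.</p>
<p>4. Install the MySQL and memcache client libraries: in D:\Python25\Scripts run easy_install.exe mysqldb or easy_install.exe memcache. If it reports that you need to look on <a href="http://pypi.python.org/simple/"><u><font color=#0000ff>http://pypi.python.org/simple/</font></u></a> for the actual package to download, open that site, find the matching URL and run easy_install.exe followed by that URL.</p>
<p>5. If the setuptools-0.6c11.win32-py2.5 installer is unavailable or fails to install, copy the MySQLdb folder from a previous D:\Python25\Lib\site-packages into the current D:\Python25\Lib\site-packages. As long as the versions match it runs normally, with no installer involved at all.</p>
<p>6. If installing from the URL fails and there is no earlier installation to copy from, download the package itself. For example, the memcache client library can be downloaded from <a href="ftp://ftp.tummy.com/pub/python-memcached/old-releases/python-memcached-1.45.tar.gz"><u><font color=#0000ff>ftp://ftp.tummy.com/pub/python-memcached/old-releases/python-memcached-1.45.tar.gz</font></u></a>.</p>
<p>Then unpack it, change into the directory and run python setup.py install.</p>
<p>7. Create a new Django project, or in an existing project directory run python manage.py syncdb to synchronize the database table structure. (This only checks which tables exist in the database: a table whose name is missing is created, but a table that already exists is left untouched even if its structure has changed.) The database named in settings.py must be created in MySQL beforehand.</p>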
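<p>Once the steps above are done, a quick import check can confirm that the libraries landed in site-packages. The following is only a minimal sketch: the helper name check_modules is hypothetical, and the module names django, MySQLdb and memcache are assumptions about what the packages above install.</p>

```python
def check_modules(names):
    """Return a dict mapping each module name to True if it imports."""
    status = {}
    for name in names:
        try:
            __import__(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


# Assumed module names: Django, the MySQLdb driver, and python-memcached
# (which is expected to install a module called "memcache").
report = check_modules(["django", "MySQLdb", "memcache"])
for name in sorted(report):
    print("%-10s %s" % (name, "ok" if report[name] else "MISSING"))
```

<p>Any module reported as MISSING points back to step 4, 5 or 6.</p>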
<img src ="http://www.cppblog.com/jrckkyy/aggbug/109756.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2010-03-15 19:25 <a href="http://www.cppblog.com/jrckkyy/archive/2010/03/15/109756.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>[Distributed Cross-Platform Monitoring System] Sending email with a single command on Linux and Windows: a Python script</title><link>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109755.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Mon, 15 Mar 2010 11:24:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109755.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/109755.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109755.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/109755.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/109755.html</trackback:ping><description><![CDATA[<p>A while ago I spent some time learning Python, and I recently finished a project that monitors basic server information: a distributed system that collects the metrics everyone wants to watch, records special log entries and raises alerts. Each server is sampled once per minute, about 1,440 records a day. It currently monitors just over 20 servers, roughly 31,680 log records a day (22 servers at 1,440 each), and the single central monitoring server still has plenty of headroom. It would be good to have more servers to test with; I estimate it could support monitoring more than 100 servers.</p>
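<p>A new requirement has come up: when an alert is detected, notify the relevant people in real time. The company's SMS gateway was bought for Shanghai Telecom only, and the users do not have Shanghai Telecom numbers (embarrassing), so the notification has to go out by email instead.</p>
<p>The script sends text content encoded in GB18030 and attachments in any encoding, and with small changes it can send to multiple recipients.</p>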
<p>&nbsp;
<div class=dp-highlighter>
<ol class=dp-py>
    <li class=alt><span><span class=comment>#!/usr/lib/python2.5/bin/python </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span><span class=comment>#coding=utf-8 </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>import</span><span>&nbsp;os &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>import</span><span>&nbsp;sys &nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>from</span><span>&nbsp;smtplib&nbsp;</span><span class=keyword>import</span><span>&nbsp;SMTP &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>from</span><span>&nbsp;email.MIMEMultipart&nbsp;</span><span class=keyword>import</span><span>&nbsp;MIMEMultipart &nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>from</span><span>&nbsp;email.mime.application&nbsp;</span><span class=keyword>import</span><span>&nbsp;MIMEApplication &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>from</span><span>&nbsp;email.MIMEText&nbsp;</span><span class=keyword>import</span><span>&nbsp;MIMEText &nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>from</span><span>&nbsp;email.MIMEBase&nbsp;</span><span class=keyword>import</span><span>&nbsp;MIMEBase &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>from</span><span>&nbsp;email&nbsp;</span><span class=keyword>import</span><span>&nbsp;Utils,Encoders &nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>import</span><span>&nbsp;mimetypes &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>import</span><span>&nbsp;time &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>STMP_SERVER&nbsp;=&nbsp;</span><span class=string>"mail.&#215;&#215;&#215;.com"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>STMP_PORT&nbsp;=&nbsp;</span><span class=string>"25"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>USERNAME&nbsp;=&nbsp;</span><span class=string>"&#215;&#215;&#215;@&#215;&#215;&#215;.com"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>USERPASSWORD&nbsp;=&nbsp;</span><span class=string>"&#215;&#215;&#215;"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>FROM&nbsp;=&nbsp;</span><span class=string>"MonitorCenterWarning@&#215;&#215;&#215;.com"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>TO&nbsp;=&nbsp;</span><span class=string>"&#215;&#215;&#215;@gmail.com"</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;</span></li>
    <li class=alt><span></span><span class=keyword>def</span><span>&nbsp;sendFildByMail(config): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'Preparing...'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;message&nbsp;=&nbsp;MIMEMultipart(&nbsp;) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'from'</span><span>]&nbsp;=&nbsp;config[</span><span class=string>'from'</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'to'</span><span>]&nbsp;=&nbsp;config[</span><span class=string>'to'</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'Reply-To'</span><span>]&nbsp;=&nbsp;config[</span><span class=string>'from'</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'Subject'</span><span>]&nbsp;=&nbsp;config[</span><span class=string>'subject'</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'Date'</span><span>]&nbsp;=&nbsp;Utils.formatdate(localtime=True) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'X-Priority'</span><span>]&nbsp;=&nbsp;&nbsp;</span><span class=string>'3'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'X-MSMail-Priority'</span><span>]&nbsp;=&nbsp;&nbsp;</span><span class=string>'Normal'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'X-Mailer'</span><span>]&nbsp;=&nbsp;&nbsp;</span><span class=string>'Microsoft&nbsp;Outlook&nbsp;Express&nbsp;6.00.2900.2180'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;message[</span><span class=string>'X-MimeOLE'</span><span>]&nbsp;=&nbsp;&nbsp;</span><span class=string>'Produced&nbsp;By&nbsp;Microsoft&nbsp;MimeOLE&nbsp;V6.00.2900.2180'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;</span><span class=string>'file'</span><span>&nbsp;</span><span class=keyword>in</span><span>&nbsp;config: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment># Add the attachment </span><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f=open(config[</span><span class=string>'file'</span><span>],&nbsp;</span><span class=string>'rb'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file&nbsp;=&nbsp;MIMEApplication(f.read()) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.add_header(</span><span class=string>'Content-Disposition'</span><span>,&nbsp;</span><span class=string>'attachment'</span><span>,&nbsp;filename=&nbsp;os.path.basename(config[</span><span class=string>'file'</span><span>])) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;message.attach(file) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;</span><span class=string>'content'</span><span>&nbsp;</span><span class=keyword>in</span><span>&nbsp;config: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment># Add the text body; the second MIMEText argument is the MIME subtype </span><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f=open(config[</span><span class=string>'content'</span><span>],&nbsp;</span><span class=string>'rb'</span><span>) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;content&nbsp;=&nbsp;f.read() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;f.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;body&nbsp;=&nbsp;MIMEText(content,&nbsp;</span><span class=string>'plain'</span><span>,&nbsp;</span><span class=string>'gb2312'</span><span>) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;message.attach(body) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'OKay'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'Logging...'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;smtp&nbsp;=&nbsp;SMTP(config[</span><span class=string>'server'</span><span>],&nbsp;config[</span><span class=string>'port'</span><span>]) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment># Comment out the next line if the SMTP server does not require a login to send mail </span><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;smtp.login(config[</span><span class=string>'username'</span><span>],&nbsp;config[</span><span class=string>'password'</span><span>]) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'OK'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'Sending...'</span><span>, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;smtp.sendmail&nbsp;(config[</span><span class=string>'from'</span><span>],&nbsp;[config[</span><span class=string>'from'</span><span>],&nbsp;config[</span><span class=string>'to'</span><span>]],&nbsp;message.as_string()) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'OK'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;smtp.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;time.sleep(</span><span class=number>1</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;</span></li>
    <li class=alt><span></span><span class=keyword>if</span><span>&nbsp;__name__&nbsp;==&nbsp;</span><span class=string>"__main__"</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;len(sys.argv)&nbsp;&lt;&nbsp;</span><span class=number>2</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'Usage:&nbsp;python&nbsp;%s&nbsp;contentfilename'</span><span>&nbsp;%&nbsp;os.path.basename(sys.argv[</span><span class=number>0</span><span>]) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;</span><span class=string>'OR&nbsp;Usage:&nbsp;python&nbsp;%s&nbsp;contentfilename&nbsp;attachfilename'</span><span>&nbsp;%&nbsp;os.path.basename(sys.argv[</span><span class=number>0</span><span>]) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;wait=raw_input(</span><span class=string>"quit."</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sys.exit(-</span><span class=number>1</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>elif</span><span>&nbsp;len(sys.argv)&nbsp;==&nbsp;</span><span class=number>2</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sendFildByMail({ &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'from'</span><span>:&nbsp;FROM, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'to'</span><span>:&nbsp;TO, &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'subject'</span><span>:&nbsp;</span><span class=string>'[MonitorCenter]Send&nbsp;Msg&nbsp;%s'</span><span>&nbsp;%&nbsp;sys.argv[</span><span class=number>1</span><span>], &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'content'</span><span>:&nbsp;sys.argv[</span><span class=number>1</span><span>], &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'server'</span><span>:&nbsp;STMP_SERVER, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'port'</span><span>:&nbsp;STMP_PORT, &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'username'</span><span>:&nbsp;USERNAME, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'password'</span><span>:&nbsp;USERPASSWORD}) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>elif</span><span>&nbsp;len(sys.argv)&nbsp;==&nbsp;</span><span class=number>3</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sendFildByMail({ &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'from'</span><span>:&nbsp;FROM, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'to'</span><span>:&nbsp;TO, &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'subject'</span><span>:&nbsp;</span><span class=string>'[MonitorCenter]Send&nbsp;Msg&nbsp;and&nbsp;File&nbsp;%s&nbsp;%s'</span><span>&nbsp;%&nbsp;(sys.argv[</span><span class=number>1</span><span>],&nbsp;sys.argv[</span><span class=number>2</span><span>]), &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'content'</span><span>:&nbsp;sys.argv[</span><span class=number>1</span><span>], &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'file'</span><span>:&nbsp;sys.argv[</span><span class=number>2</span><span>], &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'server'</span><span>:&nbsp;STMP_SERVER, &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'port'</span><span>:&nbsp;STMP_PORT, &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'username'</span><span>:&nbsp;USERNAME, &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=string>'password'</span><span>:&nbsp;USERPASSWORD}) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;wait=raw_input(</span><span class=string>"end."</span><span>)&nbsp;&nbsp;</span></span></li>
</ol>
</div>
<p>&#160;</p>
<p>On Windows XP:</p>
<p><img title=Example height=166 alt=Example src="http://hi.csdn.net/attachment/201003/12/6723_126837066066C3.png" width=657></p>
<p>On Linux (Ubuntu, SUSE):</p>
<p><img title=1 height=128 alt=1 src="http://hi.csdn.net/attachment/201003/12/6723_1268371549tzVm.png" width=634></p>
<p>The received result:</p>
<p><img title=2 height=66 alt=2 src="http://hi.csdn.net/attachment/201003/12/6723_1268371596HISQ.png" width=1237></p>
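<p>The multi-recipient adaptation mentioned at the top of this post could look roughly like the sketch below. It is an illustration rather than the script used above: it targets a current Python 3 interpreter instead of Python 2.5, the function name build_alert and the example.com addresses are hypothetical, and it only builds the message without contacting any server.</p>

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formatdate


def build_alert(sender, recipients, subject, text, charset="gb18030"):
    """Build a multipart alert addressed to every recipient in the list."""
    message = MIMEMultipart()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = subject
    message["Date"] = formatdate(localtime=True)
    # Second argument is the MIME subtype ("plain"), third is the charset.
    message.attach(MIMEText(text, "plain", charset))
    return message


# Placeholder addresses and alert text, for illustration only.
recipients = ["ops1@example.com", "ops2@example.com"]
msg = build_alert(
    "MonitorCenterWarning@example.com",
    recipients,
    "[MonitorCenter] disk usage alert",
    "磁盘使用率超过 90%",
)
# To send: smtplib.SMTP(host, port).sendmail(sender, recipients, msg.as_string())
```

<p>Passing the same recipient list to sendmail is what actually delivers the message to each address; the To header only controls what the mail client displays.</p>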
<img src ="http://www.cppblog.com/jrckkyy/aggbug/109755.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2010-03-15 19:24 <a href="http://www.cppblog.com/jrckkyy/archive/2010/03/15/109755.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>[Distributed Cross-Platform Monitoring System] Monitoring network traffic and throughput on Linux: a Python script</title><link>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109754.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Mon, 15 Mar 2010 11:22:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109754.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/109754.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2010/03/15/109754.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/109754.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/109754.html</trackback:ping><description><![CDATA[<p>The Level 1 and Level 2 market-data servers for the Shanghai and Shenzhen stock exchanges carry heavy traffic for about five hours a day, from 9:00 to 11:30 and from 13:00 to 15:30, so the network throughput of each monitored server is an important metric. The average rate over a period can be computed by adding up each network card's upstream and downstream traffic for that period and dividing by the length of the period. I currently sample once per minute. During real trading sessions the measured rates have been fairly accurate, staying at around 5M/S. Occasionally, outside service hours, the internal network card of some server shows 5M/S, and sure enough it turns out that someone is transferring a large amount of data.</p>
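<p>The rate calculation described above can be sketched as follows. This is a simplified illustration, not the author's script: the function name average_rate is hypothetical, and it assumes the two samples are cumulative byte counters of the kind ifconfig reports.</p>

```python
def average_rate(old_bytes, new_bytes, old_time, new_time):
    """Return the average bytes per second between two counter samples."""
    interval = new_time - old_time
    if not interval > 0:
        raise ValueError("samples must be taken at increasing times")
    if old_bytes > new_bytes:
        # The counter was reset (reboot or wrap-around): no rate this round.
        return None
    return (new_bytes - old_bytes) / float(interval)


# Two samples taken 60 seconds apart on one interface (made-up numbers).
rx_rate = average_rate(1000000, 301000000, 0, 60)
tx_rate = average_rate(500000, 72500000, 0, 60)
print("RX %.0f B/s, TX %.0f B/s" % (rx_rate, tx_rate))
```

<p>With a one-minute sampling interval, short bursts are averaged out, so the value is a mean rate for the minute rather than a peak.</p>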
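<p>Each standalone monitoring script returns a list of tuples, and the results are finally merged into one complete XML data island. To make debugging easier, every intermediate result of a script is also written out to a temporary text file.</p>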
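<p>As a rough illustration of that merging step, the sketch below turns per-script lists of tuples into one XML string. The element and attribute names (monitor, collector, item, name) and the sample keys are assumptions made for this example; the post does not show the real schema. It is written for Python 3.</p>

```python
import xml.etree.ElementTree as ET


def results_to_xml(collectors):
    """Merge {collector name: [(key, value), ...]} into one XML string."""
    root = ET.Element("monitor")
    for collector_name in sorted(collectors):
        node = ET.SubElement(root, "collector", name=collector_name)
        for key, value in collectors[collector_name]:
            item = ET.SubElement(node, "item", name=key)
            item.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical output of one collector: a list of (key, value) tuples.
sample = {
    "ifconfig_collect": [("eth0_rx_rate", 5000000.0), ("eth0_tx_rate", 1200000.0)],
}
xml_text = results_to_xml(sample)
print(xml_text)
```

<p>Keeping each script's return value as plain tuples means a collector can be tested on its own, with the XML produced in a single place.</p>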
<p>To run the script below, make sure the ethtool utility is installed on your Linux system. It was tested on ubuntu2.6.27-7-server, ubuntu22.6.27.19-5-default and suse 2.6.27.19-5-default.</p>
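<p>The ethtool requirement can be checked before the collector starts. The sketch below is one possible way to do this in Python 3; the helper name missing_tools is hypothetical, and including ifconfig in the list is an assumption based on the file names used by the script.</p>

```python
import shutil


def missing_tools(names):
    """Return the tools from names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# "ifconfig" is an assumption; the post only names ethtool as required.
missing = missing_tools(["ethtool", "ifconfig"])
if missing:
    print("install before running the collector: " + ", ".join(missing))
```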
<p>Code:</p>
<p>&nbsp;
<div class=dp-highlighter>
<ol class=dp-py>
    <li class=alt><span><span class=comment>#!/usr/bin/python </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span><span class=comment>#coding=utf-8 </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>import</span><span>&nbsp;re &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>import</span><span>&nbsp;os &nbsp;&nbsp;</span></span></li>
    <li class=alt><span></span><span class=keyword>import</span><span>&nbsp;time &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;</span></li>
    <li class=alt><span></span><span class=keyword>import</span><span>&nbsp;utils &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>def</span><span>&nbsp;sortedDictValues3(adict): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;keys&nbsp;=&nbsp;adict.keys() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;keys.sort() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>return</span><span>&nbsp;map(adict.get,&nbsp;keys) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;</span></li>
    <li class=alt><span></span><span class=keyword>def</span><span>&nbsp;run(): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;utils.isLinux()&nbsp;==&nbsp;</span><span class=special>False</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>return</span><span>&nbsp;[(</span><span class=string>'ifconfig_collect&nbsp;os&nbsp;type&nbsp;error'</span><span>,</span><span class=string>'this&nbsp;is&nbsp;windows'</span><span>)] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment>#not&nbsp;first&nbsp;run </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;os.path.isfile(</span><span class=string>'./oldifconfig'</span><span>): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold&nbsp;=&nbsp;open(</span><span class=string>'./oldifconfig'</span><span>,&nbsp;</span><span class=string>'r'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment># Read the traffic snapshot and timestamp saved by the previous run </span><span>&nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(oldtime,&nbsp;fileoldcontent)&nbsp;=&nbsp;fileold.read().split(</span><span class=string>'#'</span><span>) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempstr&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;strline&nbsp;</span><span class=keyword>in</span><span>&nbsp;fileoldcontent.split(</span><span class=string>'\n'</span><span>): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^lo'</span><span>) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>break</span><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^eth'</span><span>) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;strline.split()[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempstr&nbsp;=&nbsp;tempstr&nbsp;+&nbsp;strline&nbsp;+&nbsp;</span><span class=string>'\n'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;tempstr &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RXold&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TXold&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;key,value&nbsp;</span><span class=keyword>in</span><span>&nbsp;netcard.items(): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempsplit&nbsp;=&nbsp;value.split(</span><span class=string>'\n'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;item&nbsp;</span><span class=keyword>in</span><span>&nbsp;tempsplit: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;item&nbsp;=&nbsp;item&nbsp;+&nbsp;</span><span class=string>'&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;item &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;match&nbsp;</span><span class=keyword>in</span><span>&nbsp;re.finditer(</span><span class=string>"(bytes:)(.*?)(&nbsp;\()"</span><span>,&nbsp;item): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>1</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RXold[key]&nbsp;=&nbsp;match.group(</span><span class=number>2</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;tempcount&nbsp;+&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>elif</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>2</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TXold[key]&nbsp;=&nbsp;match.group(</span><span class=number>2</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;</span><span class=string>'net&nbsp;io&nbsp;percent(bytes/s):&nbsp;0&nbsp;&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment>#&nbsp;Record&nbsp;the&nbsp;current&nbsp;NIC&nbsp;information&nbsp;to&nbsp;a&nbsp;temporary&nbsp;file </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.system(</span><span class=string>'ifconfig&nbsp;&gt;&nbsp;ifconfigtemp'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file&nbsp;=&nbsp;open(</span><span class=string>'./ifconfigtemp'</span><span>,</span><span class=string>'r'</span><span>); &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold&nbsp;=&nbsp;open(</span><span class=string>'./oldifconfig'</span><span>,&nbsp;</span><span class=string>'w'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;temptimestr&nbsp;=&nbsp;str(int(time.time())); &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(temptimestr) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(</span><span class=string>'#'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(file.read()) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.close() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;returnkeys&nbsp;=&nbsp;[] &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;returnvalues&nbsp;=&nbsp;[] &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcountcard&nbsp;=&nbsp;</span><span class=number>0</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;strline&nbsp;</span><span class=keyword>in</span><span>&nbsp;file.readlines(): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^lo'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>break</span><span>; &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^eth'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;strline.split()[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;strline &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;strline&nbsp;</span><span class=keyword>in</span><span>&nbsp;file.readlines(): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^lo'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>break</span><span>; &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"^eth"</span><span>,&nbsp;strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;strline.split() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;templist[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;templist[</span><span class=number>4</span><span>]&nbsp;+&nbsp;newnetcard[key]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"^&nbsp;*inet&nbsp;"</span><span>,&nbsp;strline): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;strline.split() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;templist[</span><span class=number>1</span><span>][</span><span class=number>5</span><span>:]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;+&nbsp;newnetcard[key]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;key,value&nbsp;</span><span class=keyword>in</span><span>&nbsp;newnetcard.items(): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment>#&nbsp;Record&nbsp;each&nbsp;NIC's&nbsp;link&nbsp;status&nbsp;to&nbsp;a&nbsp;temporary&nbsp;file </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.system(</span><span class=string>'ethtool&nbsp;%s&nbsp;&gt;&nbsp;ethtooltemp'</span><span>%(key)) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file&nbsp;=&nbsp;open(</span><span class=string>'./ethtooltemp'</span><span>,</span><span class=string>'r'</span><span>); &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempethtooltemplist&nbsp;=&nbsp;file.read().split(</span><span class=string>'\n\t'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.close() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"yes"</span><span>,&nbsp;tempethtooltemplist[-</span><span class=number>1</span><span>]): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;newnetcard[key].split() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;templist[</span><span class=number>0</span><span>]&nbsp;+&nbsp;</span><span class=string>'&nbsp;running!&nbsp;'</span><span>&nbsp;+&nbsp;templist[</span><span class=number>1</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;newnetcard[key].split() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;len(templist)&nbsp;&gt;&nbsp;</span><span class=number>1</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;templist[</span><span class=number>0</span><span>]&nbsp;+&nbsp;</span><span class=string>'&nbsp;stop!&nbsp;'</span><span>&nbsp;+&nbsp;templist[</span><span class=number>1</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;&nbsp;</span><span class=string>'stop!&nbsp;'</span><span>&nbsp;+&nbsp;templist[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.close() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RX&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TX&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;key,value&nbsp;</span><span class=keyword>in</span><span>&nbsp;netcard.items(): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempsplit&nbsp;=&nbsp;value.split(</span><span class=string>'\n'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;item&nbsp;</span><span class=keyword>in</span><span>&nbsp;tempsplit: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;item&nbsp;=&nbsp;item&nbsp;+&nbsp;</span><span class=string>'&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;item &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;match&nbsp;</span><span class=keyword>in</span><span>&nbsp;re.finditer(</span><span class=string>"(bytes:)(.*?)(&nbsp;\()"</span><span>,&nbsp;item): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>1</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RX[key]&nbsp;=&nbsp;str(int(match.group(</span><span class=number>2</span><span>))&nbsp;-&nbsp;int(RXold[key])) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;tempcount&nbsp;+&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>elif</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>2</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TX[key]&nbsp;=&nbsp;str(int(match.group(</span><span class=number>2</span><span>))&nbsp;-&nbsp;int(TXold[key])) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;divtime&nbsp;=&nbsp;float(int(time.time())&nbsp;-&nbsp;int(oldtime)) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;divtime&nbsp;==&nbsp;</span><span class=number>0</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rate&nbsp;=&nbsp;(float(TX[key])&nbsp;+&nbsp;float(RX[key])) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;rate&nbsp;=&nbsp;(float(TX[key])&nbsp;+&nbsp;float(RX[key]))/(divtime) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;rate&nbsp;==&nbsp;</span><span class=number>0</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;</span><span class=string>'0'</span><span>&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;+&nbsp;newnetcard[key] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;</span><span class=string>'%.2f'</span><span>%rate&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;+&nbsp;newnetcard[key] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>return</span><span>&nbsp;zip([</span><span class=string>'order'</span><span>],&nbsp;[</span><span class=string>'48'</span><span>])&nbsp;+&nbsp;newnetcard.items(); &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.system(</span><span class=string>'ifconfig&nbsp;&gt;&nbsp;ifconfigtemp'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file&nbsp;=&nbsp;open(</span><span class=string>'./ifconfigtemp'</span><span>,</span><span class=string>'r'</span><span>); &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold&nbsp;=&nbsp;open(</span><span class=string>'./oldifconfig'</span><span>,&nbsp;</span><span class=string>'w'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;temptimestr&nbsp;=&nbsp;str(int(time.time())); &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(temptimestr) &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(</span><span class=string>'#'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.write(file.read()) &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;fileold.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;strline&nbsp;</span><span class=keyword>in</span><span>&nbsp;file.readlines(): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^lo'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>break</span><span>; &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^eth'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;strline.split()[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;strline &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RX&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TX&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard&nbsp;=&nbsp;{} &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.seek(</span><span class=number>0</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;strline&nbsp;</span><span class=keyword>in</span><span>&nbsp;file.readlines(): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;reobj&nbsp;=&nbsp;re.compile(</span><span class=string>'^lo'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;reobj.search(strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>break</span><span>; &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"^eth"</span><span>,&nbsp;strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;strline.split() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;key&nbsp;=&nbsp;templist[</span><span class=number>0</span><span>] &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;templist[</span><span class=number>4</span><span>]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"^&nbsp;*inet&nbsp;"</span><span>,&nbsp;strline): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;templist&nbsp;=&nbsp;strline.split() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;newnetcard[key]&nbsp;+&nbsp;templist[</span><span class=number>1</span><span>][</span><span class=number>5</span><span>:]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;key,value&nbsp;</span><span class=keyword>in</span><span>&nbsp;newnetcard.items(): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;os.system(</span><span class=string>'ethtool&nbsp;%s&nbsp;&gt;&nbsp;ethtooltemp'</span><span>%(key)) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file&nbsp;=&nbsp;open(</span><span class=string>'./ethtooltemp'</span><span>,</span><span class=string>'r'</span><span>); &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempethtooltemplist&nbsp;=&nbsp;file.read().split(</span><span class=string>'\n'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.close() &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;re.search(</span><span class=string>"yes"</span><span>,&nbsp;tempethtooltemplist[-</span><span class=number>1</span><span>]): &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;newnetcard[key]&nbsp;+&nbsp;</span><span class=string>'running!'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>else</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;newnetcard[key]&nbsp;+&nbsp;</span><span class=string>'stop!'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;file.close() &nbsp;&nbsp;</span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;key,value&nbsp;</span><span class=keyword>in</span><span>&nbsp;netcard.items(): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempsplit&nbsp;=&nbsp;value.split(</span><span class=string>'\n'</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;</span><span class=string>''</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;item&nbsp;</span><span class=keyword>in</span><span>&nbsp;tempsplit: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;item&nbsp;=&nbsp;item&nbsp;+&nbsp;</span><span class=string>'&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=comment>#print&nbsp;item </span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;item &nbsp;&nbsp;</span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>for</span><span>&nbsp;match&nbsp;</span><span class=keyword>in</span><span>&nbsp;re.finditer(</span><span class=string>"(bytes:)(.*?)(&nbsp;\()"</span><span>,&nbsp;item): &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>if</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>1</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;RX[key]&nbsp;=&nbsp;match.group(</span><span class=number>2</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;tempcount&nbsp;=&nbsp;tempcount&nbsp;+&nbsp;</span><span class=number>1</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>elif</span><span>&nbsp;tempcount&nbsp;==&nbsp;</span><span class=number>2</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;TX[key]&nbsp;=&nbsp;match.group(</span><span class=number>2</span><span>) &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;netcard[key]&nbsp;=&nbsp;netcard[key]&nbsp;+&nbsp;</span><span class=string>'net&nbsp;io&nbsp;percent(bytes/s):&nbsp;0&nbsp;&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=""><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;newnetcard[key]&nbsp;=&nbsp;newnetcard[key]&nbsp;+&nbsp;</span><span class=string>'&nbsp;'</span><span>&nbsp;+&nbsp;</span><span class=string>'0&nbsp;&lt;br&gt;'</span><span>&nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>return</span><span>&nbsp;zip([</span><span class=string>'order'</span><span>],&nbsp;[</span><span class=string>'48'</span><span>])&nbsp;+&nbsp;newnetcard.items(); &nbsp;&nbsp;</span></span></li>
    <li class=""><span></span><span class=keyword>if</span><span>&nbsp;__name__&nbsp;==&nbsp;</span><span class=string>'__main__'</span><span>: &nbsp;&nbsp;</span></span></li>
    <li class=alt><span>&nbsp;&nbsp;&nbsp;&nbsp;</span><span class=keyword>print</span><span>&nbsp;run()&nbsp;&nbsp;</span></span></li>
</ol>
</div>
<textarea class=python style="DISPLAY: none" name=code rows=15 cols=50>#!/usr/bin/python
#coding=utf-8
import re
import os
import time
import utils
def sortedDictValues3(adict):
keys = adict.keys()
keys.sort()
return map(adict.get, keys)
def run():
if utils.isLinux() == False:
return [('ifconfig_collect os type error','this is windows')]
#not first run
if os.path.isfile('./oldifconfig'):
fileold = open('./oldifconfig', 'r')
fileold.seek(0)
#read the temporary traffic data file recorded on the previous run, together with its timestamp
(oldtime, fileoldcontent) = fileold.read().split('#')
fileold.close()
netcard = {}
tempstr = ''
key = ''
for strline in fileoldcontent.split('\n'):
reobj = re.compile('^lo*.')
if reobj.search(strline):
break;
reobj = re.compile('^eth*.')
if reobj.search(strline):
key = strline.split()[0]
tempstr = tempstr + strline + '\n'
netcard[key] = tempstr
RXold = {}
TXold = {}
for key,value in netcard.items():
tempsplit = value.split('\n')
netcard[key] = ''
for item in tempsplit:
item = item + '&lt;br&gt;'
netcard[key] = netcard[key] + item
tempcount = 1
for match in re.finditer("(bytes:)(.*?)( \()", item):
if tempcount == 1:
RXold[key] = match.group(2)
tempcount = tempcount + 1
elif tempcount == 2:
TXold[key] = match.group(2)
netcard[key] = netcard[key] + 'net io percent(bytes/s): 0 &lt;br&gt;'
#write the current network interface information to a temporary file
os.system('ifconfig &gt; ifconfigtemp')
file = open('./ifconfigtemp','r');
fileold = open('./oldifconfig', 'w')
temptimestr = str(int(time.time()));
fileold.write(temptimestr)
fileold.write('#')
file.seek(0)
fileold.write(file.read())
fileold.close()
returnkeys = []
returnvalues = []
netcard = {}
tempcountcard = 0
file.seek(0)
key = ''
for strline in file.readlines():
reobj = re.compile('^lo*.')
if reobj.search(strline):
break;
reobj = re.compile('^eth*.')
if reobj.search(strline):
key = strline.split()[0]
netcard[key] = ''
netcard[key] = netcard[key] + strline
newnetcard = {}
file.seek(0)
key = ''
for strline in file.readlines():
reobj = re.compile('^lo*.')
if reobj.search(strline):
break;
if re.search("^eth", strline):
templist = strline.split()
key = templist[0]
newnetcard[key] = ''
newnetcard[key] = templist[4] + newnetcard[key] + ' '
if re.search("^ *inet ", strline):
templist = strline.split()
newnetcard[key] = templist[1][5:] + ' ' + newnetcard[key] + ' '
for key,value in newnetcard.items():
#write each interface's link status (up or down) to a temporary file
os.system('ethtool %s &gt; ethtooltemp'%(key))
file = open('./ethtooltemp','r');
tempethtooltemplist = file.read().split('\n\t')
file.close()
if re.search("yes", tempethtooltemplist[-1]):
templist = newnetcard[key].split()
newnetcard[key] = templist[0] + ' runing! ' + templist[1]
else:
templist = newnetcard[key].split()
if len(templist) &gt; 1:
newnetcard[key] = templist[0] + ' stop! ' + templist[1]
else:
newnetcard[key] =  'stop! ' + templist[0]
file.close()
RX = {}
TX = {}
for key,value in netcard.items():
tempsplit = value.split('\n')
netcard[key] = ''
for item in tempsplit:
item = item + '&lt;br&gt;'
netcard[key] = netcard[key] + item
tempcount = 1
for match in re.finditer("(bytes:)(.*?)( \()", item):
if tempcount == 1:
RX[key] = str(int(match.group(2)) - int(RXold[key]))
tempcount = tempcount + 1
elif tempcount == 2:
TX[key] = str(int(match.group(2)) - int(TXold[key]))
divtime = float(int(time.time()) - int(oldtime))
if divtime == 0:
rate = (float(TX[key]) + float(RX[key]))
else:
rate = (float(TX[key]) + float(RX[key]))/(divtime)
if rate == 0:
newnetcard[key] = '0' + ' ' + newnetcard[key]
else:
newnetcard[key] = '%.2f'%rate + ' ' + newnetcard[key]
return zip(['order'], ['48']) + newnetcard.items();
else:
os.system('ifconfig &gt; ifconfigtemp')
file = open('./ifconfigtemp','r');
fileold = open('./oldifconfig', 'w')
temptimestr = str(int(time.time()));
fileold.write(temptimestr)
fileold.write('#')
file.seek(0)
fileold.write(file.read())
fileold.close()
netcard = {}
file.seek(0)
key = ''
for strline in file.readlines():
reobj = re.compile('^lo*.')
if reobj.search(strline):
break;
reobj = re.compile('^eth*.')
if reobj.search(strline):
key = strline.split()[0]
netcard[key] = ''
netcard[key] = netcard[key] + strline
RX = {}
TX = {}
key = ''
newnetcard = {}
file.seek(0)
for strline in file.readlines():
reobj = re.compile('^lo*.')
if reobj.search(strline):
break;
if re.search("^eth", strline):
templist = strline.split()
key = templist[0]
newnetcard[key] = templist[4] + ' '
if re.search("^ *inet ", strline):
templist = strline.split()
newnetcard[key] = newnetcard[key] + templist[1][5:] + ' '
for key,value in newnetcard.items():
os.system('ethtool %s &gt; ethtooltemp'%(key))
file = open('./ethtooltemp','r');
tempethtooltemplist = file.read().split('\n')
file.close()
if re.search("yes", tempethtooltemplist[-1]):
newnetcard[key] = newnetcard[key] + 'runing!'
else:
newnetcard[key] = newnetcard[key] + 'stop!'
file.close()
for key,value in netcard.items():
tempsplit = value.split('\n')
netcard[key] = ''
for item in tempsplit:
item = item + '&lt;br&gt;'
#print item
netcard[key] = netcard[key] + item
tempcount = 1
for match in re.finditer("(bytes:)(.*?)( \()", item):
if tempcount == 1:
RX[key] = match.group(2)
tempcount = tempcount + 1
elif tempcount == 2:
TX[key] = match.group(2)
netcard[key] = netcard[key] + 'net io percent(bytes/s): 0 &lt;br&gt;'
newnetcard[key] = newnetcard[key] + ' ' + '0 &lt;br&gt;'
return zip(['order'], ['48']) + newnetcard.items();
if __name__ == '__main__':
print run()</textarea>
<p>&#160;</p>
<p>Usage example:</p>
<p><img title=1 height=50 alt=1 src="http://hi.csdn.net/attachment/201003/13/6723_1268462711ylrk.png" width=1092>&nbsp;</p>
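<p>In each tuple of the returned list, the first field of the second element is the network throughput in bytes/s. In this example eth1 is running at 3.3 KB/s and eth0 at 2.9 KB/s. Today is a Saturday, so this level of traffic is normal.</p>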
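<p>As a rough illustration of how that rate is derived, the sketch below (written for Python 3, unlike the Python 2 script above) takes two samples of one interface's ifconfig output plus their timestamps and computes the combined RX+TX rate. The names <code>byte_counters</code> and <code>throughput</code> and the sample strings are illustrative assumptions, and the regular expression assumes the older net-tools output format with <code>bytes:N (</code> counters.</p>
<pre>
```python
import re

# Matches the "bytes:12345 (" counters printed by older net-tools ifconfig.
BYTES_RE = re.compile(r"bytes:(\d+) \(")


def byte_counters(ifconfig_block):
    """Return (rx_bytes, tx_bytes) parsed from one interface block.

    Raises ValueError if the block does not contain two byte counters.
    """
    rx, tx = (int(n) for n in BYTES_RE.findall(ifconfig_block)[:2])
    return rx, tx


def throughput(old_block, old_time, new_block, new_time):
    """Combined RX+TX rate in bytes per second between two samples."""
    old_rx, old_tx = byte_counters(old_block)
    new_rx, new_tx = byte_counters(new_block)
    delta = (new_rx - old_rx) + (new_tx - old_tx)
    elapsed = new_time - old_time
    # Like the script above, report the raw delta when no time has elapsed.
    return float(delta) if elapsed == 0 else delta / elapsed


if __name__ == "__main__":
    old = "RX bytes:1000 (1000.0 b)  TX bytes:2000 (1.9 KiB)"
    new = "RX bytes:4000 (3.9 KiB)  TX bytes:5000 (4.8 KiB)"
    print(throughput(old, 100, new, 110))
```
</pre>
<p>Keeping the parsing and the rate calculation in separate functions makes it possible to check the arithmetic without running ifconfig at all.</p>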
<img src ="http://www.cppblog.com/jrckkyy/aggbug/109754.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2010-03-15 19:22 <a href="http://www.cppblog.com/jrckkyy/archive/2010/03/15/109754.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines from the Top Down: Analysis and Full Annotation of the PKU Tianwang Search Engine TSE [6] Program Analysis of Inverted Index Construction (4)</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102949.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 15:03:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102949.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102949.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102949.html#Feedback</comments><slash:comments>3</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102949.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102949.html</trackback:ping><description><![CDATA[<p>The following is the annotated code that builds the inverted index from the forward index.</p>
<p>&nbsp;</p>
<p>int main(int argc, char* argv[])&nbsp;&nbsp;&nbsp; //./CrtInvertedIdx moon.fidx.sort &gt; sun.iidx <br>{ <br>&nbsp;&nbsp;&nbsp; ifstream ifsImgInfo(argv[1]); <br>&nbsp;&nbsp;&nbsp; if (!ifsImgInfo)&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cerr &lt;&lt; "Cannot open " &lt;&lt; argv[1] &lt;&lt; " for input\n"; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return -1; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; string strLine,strDocNum,tmp1=""; <br>&nbsp;&nbsp;&nbsp; int cnt = 0; <br>&nbsp;&nbsp;&nbsp; while (getline(ifsImgInfo, strLine))&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; string::size_type idx; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; string tmp; </p>
<p><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; idx = strLine.find("\t"); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tmp = strLine.substr(0,idx); </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (tmp.size()&lt;2 || tmp.size() &gt; 8) continue; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (tmp1.empty()) tmp1=tmp; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (tmp == tmp1)&nbsp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strDocNum = strDocNum + " " + strLine.substr(idx+1); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else&nbsp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if ( strDocNum.empty() ) <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strDocNum = strDocNum + " " + strLine.substr(idx+1); </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cout &lt;&lt; tmp1 &lt;&lt; "\t" &lt;&lt; strDocNum &lt;&lt; endl; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; tmp1 = tmp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strDocNum.clear(); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strDocNum = strDocNum + " " + strLine.substr(idx+1); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cnt++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; //if (cnt==100) break; <br>&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp; cout &lt;&lt; tmp1 &lt;&lt; "\t" &lt;&lt; strDocNum &lt;&lt; endl;&nbsp; //in the inverted index, each dictionary term is followed by a tab and then its space-separated document IDs </p>
<p>&nbsp;&nbsp;&nbsp; return 0; <br>} </p>
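<p>The grouping that CrtInvertedIdx performs can be sketched compactly in Python 3. This is a minimal sketch rather than a port: the function name <code>build_inverted_index</code> is an assumption, every input line is assumed to contain a tab, the 2-to-8-byte term length filter from the C++ code is left out, and the leading space before the first document ID is dropped.</p>
<pre>
```python
from itertools import groupby


def build_inverted_index(sorted_lines):
    """Yield one line per term: the term, a tab, then its document IDs.

    sorted_lines must hold "term TAB docid" lines already sorted by term,
    so that equal terms sit next to each other.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for term, group in groupby(pairs, key=lambda pair: pair[0]):
        doc_ids = " ".join(doc_id for _, doc_id in group)
        yield term + "\t" + doc_ids


if __name__ == "__main__":
    forward = ["apple\t1\n", "apple\t3\n", "pear\t2\n"]
    for entry in build_inverted_index(forward):
        print(entry)
```
</pre>
<p>Like the C++ loop, this only merges adjacent lines, which is why the forward index has to be sorted before it is passed in.</p>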
<p>&nbsp;</p>
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/10/16/3086417.aspx"></a>&nbsp;</p>
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102949.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 23:03 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102949.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines from the Top Down: Analysis and Full Annotation of the PKU Tianwang Search Engine TSE [6] Program Analysis of Inverted Index Construction (3)</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102948.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 15:02:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102948.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102948.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102948.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102948.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102948.html</trackback:ping><description><![CDATA[<p>This post covers building the forward index. Building the inverted index directly would likely be inefficient, so a forward index is generated first as the basis for the inverted index that follows.</p>
<p>&nbsp;</p>
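<p>The files and their roles are described in detail in the earlier post, Learning Search Engines from the Top Down: Analysis and Full Annotation of the PKU Tianwang Search Engine TSE [5] Inverted Index Construction and File Overview.</p>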
<p>&nbsp;</p>
<p>The CrtForwardIdx.cpp file:</p>
<p>&nbsp;</p>
<p>int main(int argc, char* argv[])&nbsp;&nbsp;&nbsp; //./CrtForwardIdx Tianwang.raw.***.seg &gt; moon.fidx <br>{ <br>&nbsp;&nbsp;&nbsp; ifstream ifsImgInfo(argv[1]); <br>&nbsp;&nbsp;&nbsp; if (!ifsImgInfo)&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cerr &lt;&lt; "Cannot open " &lt;&lt; argv[1] &lt;&lt; " for input\n"; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return -1; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; string strLine,strDocNum; <br>&nbsp;&nbsp;&nbsp; int cnt = 0; <br>&nbsp;&nbsp;&nbsp; while (getline(ifsImgInfo, strLine))&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; string::size_type idx; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cnt++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (cnt%2 == 1) //odd-numbered lines hold the document ID <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strDocNum = strLine.substr(0,strLine.size()); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; continue; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (strLine[0]=='\0' || strLine[0]=='#' || strLine[0]=='\n') <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; continue; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while ( (idx = strLine.find(SEPARATOR)) != string::npos ) //look for the next occurrence of the designated separator <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; string tmp1 = strLine.substr(0,idx); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cout &lt;&lt; tmp1 &lt;&lt; "\t" &lt;&lt; strDocNum &lt;&lt; endl; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strLine = strLine.substr(idx + SEPARATOR.size()); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; //if (cnt==100) break; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; return 0; <br>} </p>
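<p>The same forward-index pass can be sketched in Python 3 as follows. The function name <code>build_forward_index</code> is illustrative, and the default separator value is an assumption, since the definition of <code>SEPARATOR</code> is not shown in this post; pass whatever separator the segmenter actually writes.</p>
<pre>
```python
def build_forward_index(seg_lines, separator="/  "):
    """Yield "term TAB docid" lines from alternating doc-ID / token lines.

    The default separator is an assumed value, not taken from the post.
    """
    doc_id = ""
    for number, line in enumerate(seg_lines, start=1):
        line = line.rstrip("\n")
        if number % 2 == 1:
            # Odd-numbered lines carry the document ID.
            doc_id = line
            continue
        if not line or line.startswith("#"):
            continue
        # As in the C++ loop, only tokens followed by a separator are
        # emitted; whatever trails the last separator is dropped.
        for token in line.split(separator)[:-1]:
            yield token + "\t" + doc_id


if __name__ == "__main__":
    seg = ["0\n", "foo/  bar/  \n", "1\n", "baz/  tail\n"]
    for entry in build_forward_index(seg):
        print(entry)
```
</pre>
<p>Each output line pairs one term with one document ID, which is the shape the later sort and CrtInvertedIdx steps expect.</p>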
<p>&nbsp;</p>
<p>author:http://hi.baidu.com/jrckkyy</p>
<p>author:http://blog.csdn.net/jrckkyy</p>
<p>&nbsp;</p>
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/10/16/3086187.aspx"></a>&nbsp;</p>
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102948.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 23:02 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102948.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines from the Top Down: Analysis and Full Annotation of the PKU Tianwang Search Engine TSE [6] Program Analysis of Inverted Index Construction (2)</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102947.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 15:02:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102947.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102947.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102947.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102947.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102947.html</trackback:ping><description><![CDATA[<p>The DocIndex program covered earlier takes a Tianwang.raw.***** file as input and produces three files: Doc.idx, Url.idx, and DocId2Url.idx. This post analyzes the DocSegment program.</p>
<p>DocSegment reads three input files (Tianwang.raw.*****, Doc.idx, and Url.idx.sort_uniq) and writes one segmented output file, Tianwang.raw.***.seg.</p>
<p>int main(int argc, char* argv[]) <br>{ <br>&nbsp;&nbsp;&nbsp; string strLine, strFileName=argv[1]; <br>&nbsp;&nbsp;&nbsp; CUrl iUrl; <br>&nbsp;&nbsp;&nbsp; vector&lt;CUrl&gt; vecCUrl; <br>&nbsp;&nbsp;&nbsp; CDocument iDocument; <br>&nbsp;&nbsp;&nbsp; vector&lt;CDocument&gt; vecCDocument; <br>&nbsp;&nbsp;&nbsp; unsigned int docId = 0; </p>
<p>&nbsp;&nbsp;&nbsp; //ifstream ifs("Tianwang.raw.2559638448"); <br>&nbsp;&nbsp;&nbsp; ifstream ifs(strFileName.c_str());&nbsp; //DocSegment Tianwang.raw.**** <br>&nbsp;&nbsp;&nbsp; if (!ifs)&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cerr &lt;&lt; "Cannot open " &lt;&lt; strFileName &lt;&lt; " for input\n"; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return -1; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; ifstream ifsUrl("Url.idx.sort_uniq");&nbsp;&nbsp; //URL dictionary after sorting and de-duplication <br>&nbsp;&nbsp;&nbsp; if (!ifsUrl)&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cerr &lt;&lt; "Cannot open Url.idx.sort_uniq for input\n"; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return -1; <br>&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp; ifstream ifsDoc("Doc.idx"); //document dictionary file <br>&nbsp;&nbsp;&nbsp; if (!ifsDoc)&nbsp; <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; cerr &lt;&lt; "Cannot open Doc.idx for input\n"; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return -1; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; while (getline(ifsUrl,strLine)) //iterate over the URL dictionary and load it into an in-memory vector <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char chksum[33]; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int&nbsp; docid; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; memset(chksum, 0, 33); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sscanf( strLine.c_str(), "%32s%d", chksum, &amp;docid ); //%32s keeps the read within chksum[33] <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iUrl.m_sChecksum = chksum; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iUrl.m_nDocId = docid; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; vecCUrl.push_back(iUrl); <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; while (getline(ifsDoc,strLine))&nbsp;&nbsp;&nbsp;&nbsp; //iterate over the document dictionary file and load it into an in-memory vector <br>&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int docid,pos,length; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char chksum[33]; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; memset(chksum, 0, 33); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; sscanf( strLine.c_str(), "%d%d%d%32s", &amp;docid, &amp;pos, &amp;length,chksum ); //%32s keeps the read within chksum[33] <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.m_nDocId = docid; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.m_nPos = pos; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.m_nLength = length; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.m_sChecksum = chksum; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; vecCDocument.push_back(iDocument); <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;</p>
<p>&nbsp;&nbsp;&nbsp; strFileName += ".seg"; <br>&nbsp;&nbsp;&nbsp; ofstream fout(strFileName.c_str(), ios::in|ios::out|ios::trunc|ios::binary);&nbsp;&nbsp;&nbsp; //open the output file that receives the segmented data <br>&nbsp;&nbsp;&nbsp; for ( docId=0; docId&lt;MAX_DOC_ID; docId++ ) <br>&nbsp;&nbsp;&nbsp; { </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // find document according to docId <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int length = vecCDocument[docId+1].m_nPos - vecCDocument[docId].m_nPos -1; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char *pContent = new char[length+1]; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; memset(pContent, 0, length+1); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ifs.seekg(vecCDocument[docId].m_nPos); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ifs.read(pContent, length); </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char *s; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; s = pContent; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // skip Head <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int bytesRead = 0,newlines = 0; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (newlines != 2 &amp;&amp; bytesRead != HEADER_BUF_SIZE-1)&nbsp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (*s == '\n') <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; newlines++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; newlines = 0; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; s++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; bytesRead++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (bytesRead == HEADER_BUF_SIZE-1) continue; </p>
<p><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // skip header <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; bytesRead = 0,newlines = 0; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (newlines != 2 &amp;&amp; bytesRead != HEADER_BUF_SIZE-1)&nbsp; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; { <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (*s == '\n') <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; newlines++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; newlines = 0; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; s++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; bytesRead++; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (bytesRead == HEADER_BUF_SIZE-1) continue; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; //iDocument.m_sBody = s; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.RemoveTags(s);&nbsp;&nbsp;&nbsp; //strip the &lt;&gt; tags <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iDocument.m_sBodyNoTags = s; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; delete[] pContent; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; string strLine = iDocument.m_sBodyNoTags; </p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CStrFun::ReplaceStr(strLine, "&amp;nbsp;", " "); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CStrFun::EmptyStr(strLine); // set " \t\r\n" to " " </p>
<p><br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // segment the document (the actual word segmentation) <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CHzSeg iHzSeg; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; strLine = iHzSeg.SegmentSentenceMM(iDict,strLine); <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; fout &lt;&lt; docId &lt;&lt; endl &lt;&lt; strLine; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; fout &lt;&lt; endl; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>&nbsp;&nbsp;&nbsp; } </p>
<p>&nbsp;&nbsp;&nbsp; return(0); <br>}<br>This is only a quick pass over the code. Later posts will cover techniques such as parsing HTML and segmenting documents in detail.</p>
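<p>The name <code>SegmentSentenceMM</code> suggests maximum matching. As a hedged illustration of that general technique only (TSE's own implementation is not shown here and may differ), the Python 3 sketch below does greedy forward maximum matching against a set of known words. The function name <code>segment_mm</code>, the <code>max_word_len</code> parameter, and the toy dictionary are all assumptions.</p>
<pre>
```python
def segment_mm(text, dictionary, max_word_len=4):
    """Greedy forward maximum matching over a set of known words."""
    tokens = []
    pos = 0
    while pos != len(text):
        # Try the longest candidate first and shrink until a word matches;
        # a single character is always accepted as a fallback.
        for size in range(min(max_word_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                pos += size
                break
    return tokens


if __name__ == "__main__":
    words = {"北京", "北京大学", "大学生"}
    print("/".join(segment_mm("北京大学生", words)))
```
</pre>
<p>Because the longest match wins at each position, the result depends heavily on the dictionary and on <code>max_word_len</code>.</p>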
<p>&nbsp;</p>
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/10/16/3086048.aspx"></a>&nbsp;</p>
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102947.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 23:02 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102947.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines from the Top Down: Analysis and Full Annotation of the PKU Tianwang Search Engine TSE [6] Program Analysis of Inverted Index Construction (1)</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102945.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 15:00:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102945.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102945.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102945.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102945.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102945.html</trackback:ping><description><![CDATA[<p>author:http://hi.baidu.com/jrckkyy</p>
<p>author:http://blog.csdn.net/jrckkyy</p>
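<p>The previous post covered the files and intermediate files involved in building the inverted index.<br>The programs TSE runs to build its index can be reduced to the following steps:</p>
<p>1. Run #./DocIndex<br>Input: one file, tianwang.raw.520&nbsp;&nbsp;&nbsp; //the raw crawled file. It holds the full content of many web pages, so it is large. How best to store it is still an open question: one large file can exceed the 2 GB or 4 GB file size limit and is slow to index, while many small files make opening and closing file handles expensive. Ultimately the storage will have to be distributed, since the total volume will reach the terabyte range; TSE only targets small search engines.<br>Output: three files, Doc.idx, Url.idx, and DocId2Url.idx&nbsp;&nbsp;&nbsp; //Doc.idx, Url.idx, and DocId2Url.idx in the Data folder</p>
<p>2. Run #sort Url.idx|uniq &gt; Url.idx.sort_uniq&nbsp;&nbsp;&nbsp; //Url.idx.sort_uniq in the Data folder<br>Input: Url.idx //pairs of the MD5 hash of each full URL and its document ID<br>Output: Url.idx.sort_uniq //URLs de-duplicated and sorted by MD5 hash to speed up lookups</p>
<p>3. Run #./DocSegment Tianwang.raw.2559638448&nbsp; <br>Input: Tianwang.raw.2559638448&nbsp; //the crawled file; each page includes its HTTP header. Segmentation prepares the data for building the inverted index later<br>Output: Tianwang.raw.2559638448.seg //the segmented file, made up of alternating lines: one line with a document ID, then one line with that document's segmented words (only the text inside markup such as &lt;head&gt;&lt;/head&gt; and &lt;body&gt;&lt;/body&gt; within each document's &lt;html&gt;&lt;/html&gt; is segmented)</p>
<p>4. Run #./CrtForwardIdx Tianwang.raw.2559638448.seg &gt; moon.fidx //build a standalone forward index</p>
<p>5. Run<br>#set | grep "LANG"<br>#LANG=en; export LANG;<br>#sort moon.fidx &gt; moon.fidx.sort</p>
<p>6. Run #./CrtInvertedIdx moon.fidx.sort &gt; sun.iidx //build the inverted index</p>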
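<p>Steps 2 and 5 are plain sorting passes. As a small sketch of what step 2's <code>sort | uniq</code> produces, the Python 3 function below de-duplicates and sorts a list of lines. The name <code>sort_uniq</code> and the sample data are illustrative, and the ordering used here is plain code-point order, which may differ from what <code>sort</code> gives under a particular locale.</p>
<pre>
```python
def sort_uniq(lines):
    """Return the distinct lines in sorted order, like `sort | uniq`."""
    return sorted(set(lines))


if __name__ == "__main__":
    url_idx = [
        "9f2c\t2",
        "1a7b\t0",
        "9f2c\t2",
        "5d40\t1",
    ]
    for line in sort_uniq(url_idx):
        print(line)
```
</pre>
<p>Sorting matters for step 6 as well: CrtInvertedIdx only merges neighbouring lines that share a term, so an unsorted forward index would leave one term spread over several output lines.</p>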
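<p>We start the analysis with DocIndex.cpp, the first program in the indexing pipeline. (Convention used in the comments: Tianwang.raw.2559638448 is the large file produced by merging the crawled pages, referred to below as the "large file". It contains many HTML documents separated in a regular way, each of which is referred to below as a "document".)</p>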
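<p>Before reading the C++ code, it may help to see the record layout it walks through: a <code>version: 1.0</code> line, a <code>url: </code> line, further header lines up to a <code>length: </code> line, one blank line, and then that many bytes of content. The Python 3 sketch below indexes such records from a byte stream. It is an illustration of the layout as the DocIndex code reads it, not a port; the function name <code>index_raw</code>, the tuple layout, and the sample data are assumptions.</p>
<pre>
```python
import hashlib
import io


def index_raw(stream):
    """Yield (doc_id, offset, url_md5, content_md5) for each record."""
    doc_id = 0
    while True:
        offset = stream.tell()          # candidate start of the next record
        line = stream.readline()
        if not line:
            return
        if not line.startswith(b"version: 1.0"):
            continue
        url_line = stream.readline()
        if not url_line.startswith(b"url: "):
            continue
        url = url_line[5:].rstrip(b"\r\n")
        length = None
        # Read header lines until the one that gives the content length.
        for header in iter(stream.readline, b""):
            if header.startswith(b"length: "):
                length = int(header[8:])
                break
        if length is None:
            return
        stream.readline()               # skip the blank line after the headers
        content = stream.read(length)
        yield (doc_id, offset,
               hashlib.md5(url).hexdigest(),
               hashlib.md5(content).hexdigest())
        doc_id += 1


if __name__ == "__main__":
    raw = (b"version: 1.0\nurl: http://a/\nlength: 5\n\nhello\n"
           b"version: 1.0\nurl: http://b/\nlength: 3\n\nabc\n")
    for record in index_raw(io.BytesIO(raw)):
        print(record)
```
</pre>
<p>Recording the byte offset of each record is what later lets a program seek straight to a document instead of rescanning the large file.</p>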
<p><br>//DocIndex.h start-------------------------------------------------------------</p>
<p>&nbsp;</p>
<p><br>#ifndef _COMM_H_040708_<br>#define _COMM_H_040708_</p>
<p>#include</p>
<p>#include <br>#include <br>#include <br>#include <br>#include <br>#include <br>#include </p>
<p><br>using namespace std;</p>
<p>const unsigned HEADER_BUF_SIZE = 1024;<br>const unsigned RstPerPage = 20;&nbsp;//number of search results returned per page to the front end</p>
<p>//iceway<br>//const unsigned MAX_DOC_IDX_ID = 21312;&nbsp;&nbsp;//used in DocSegment.cpp<br>const unsigned MAX_DOC_IDX_ID = 22104;</p>
<p><br>//const string IMG_INFO_NAME("./Data/s1.1");<br>const string INF_INFO_NAME("./Data/sun.iidx");&nbsp;//inverted index file<br>//sample entries (term, then document IDs):<br>//朱德&nbsp; 14383 16151 16151 16151 1683 207 6302 7889 8218 8218 8637<br>//朱古力&nbsp; 1085 1222</p>
<p>//more than 90,000 entries; the token file includes special symbols, punctuation, and Chinese characters<br>const string DOC_IDX_NAME("./Data/Doc.idx");&nbsp;//document index file<br>const string RAWPAGE_FILE_NAME("./Data/Tianwang.swu.iceway.1.0");</p>
<p>//iceway<br>const string DOC_FILE_NAME = "Tianwang.swu.iceway.1.0";&nbsp;&nbsp;//used in DocIndex.cpp<br>const string Data_DOC_FILE_NAME = "./Data/Tianwang.swu.iceway.1.0";&nbsp;&nbsp;//used in Snapshot.cpp</p>
<p><br>//const string RM_THUMBNAIL_FILES("rm -f ~/public_html/ImgSE/timg/*");</p>
<p>//const string THUMBNAIL_DIR("/ImgSE/timg/");</p>
<p><br>#endif //_COMM_H_040708_<br>//DocIndex.h end--------------------------------------------------------------<br>//DocIndex.cpp start-----------------------------------------------------------</p>
<p>#include <br>#include <br>#include "Md5.h"<br>#include "Url.h"<br>#include "Document.h"</p>
<p>//iceway(mnsc)<br>#include "Comm.h"<br>#include </p>
<p>using namespace std;</p>
<p>int main(int argc, char* argv[])<br>{<br>&nbsp;&nbsp;&nbsp; //ifstream ifs("Tianwang.raw.2559638448");<br>&nbsp;//ifstream ifs("Tianwang.raw.3023555472");<br>&nbsp;//iceway(mnsc)<br>&nbsp;ifstream ifs(DOC_FILE_NAME.c_str());&nbsp;//open the original raw crawl file<br>&nbsp;if (!ifs) <br>&nbsp;{<br>&nbsp;&nbsp;&nbsp; &nbsp;cerr &lt;&lt; "Cannot open " &lt;&lt; DOC_FILE_NAME &lt;&lt; " for input\n";<br>&nbsp;&nbsp;&nbsp; &nbsp;return -1;<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp;ofstream ofsUrl("Url.idx", ios::in|ios::out|ios::trunc|ios::binary);&nbsp;//create and open Url.idx<br>&nbsp;if( !ofsUrl )<br>&nbsp;{<br>&nbsp;&nbsp;cout &lt;&lt; "error open file " &lt;&lt; endl;<br>&nbsp;} </p>
<p>&nbsp;ofstream ofsDoc("Doc.idx", ios::in|ios::out|ios::trunc|ios::binary);&nbsp;//create and open Doc.idx<br>&nbsp;if( !ofsDoc )<br>&nbsp;{<br>&nbsp;&nbsp;cout &lt;&lt; "error open file " &lt;&lt; endl;<br>&nbsp;} </p>
<p>&nbsp;ofstream ofsDocId2Url("DocId2Url.idx", ios::in|ios::out|ios::trunc|ios::binary);&nbsp;//create and open DocId2Url.idx<br>&nbsp;if( !ofsDocId2Url )<br>&nbsp;{<br>&nbsp;&nbsp;cout &lt;&lt; "error open file " &lt;&lt; endl;<br>&nbsp;} </p>
<p>&nbsp;int cnt=0;&nbsp;//document IDs start from 0<br>&nbsp;string strLine,strPage;<br>&nbsp;CUrl iUrl;<br>&nbsp;CDocument iDocument;<br>&nbsp;CMD5 iMD5;<br>&nbsp;<br>&nbsp;int nOffset = ifs.tellg();<br>&nbsp;while (getline(ifs, strLine)) <br>&nbsp;{<br>&nbsp;&nbsp;if (strLine[0]=='\0' || strLine[0]=='#' || strLine[0]=='\n')<br>&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;nOffset = ifs.tellg();<br>&nbsp;&nbsp;&nbsp;continue;<br>&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;if (!strncmp(strLine.c_str(), "version: 1.0", 12))&nbsp;//if the first line of the record is "version: 1.0", keep parsing<br>&nbsp;&nbsp;{&nbsp;<br>&nbsp;&nbsp;&nbsp;if(!getline(ifs, strLine)) break;<br>&nbsp;&nbsp;&nbsp;if (!strncmp(strLine.c_str(), "url: ", 5))&nbsp;//if the second line starts with "url: ", keep parsing<br>&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;iUrl.m_sUrl = strLine.substr(5);&nbsp;//take the URL that follows the five characters of "url: "<br>&nbsp;&nbsp;&nbsp;&nbsp;iMD5.GenerateMD5( (unsigned char*)iUrl.m_sUrl.c_str(), iUrl.m_sUrl.size() );&nbsp;//MD5-hash the URL<br>&nbsp;&nbsp;&nbsp;&nbsp;iUrl.m_sChecksum = iMD5.ToString();&nbsp;//convert the digest bytes into a string; implemented in Md5.h</p>
<p>&nbsp;&nbsp;&nbsp;} else <br>&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;continue;<br>&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;while (getline(ifs, strLine)) <br>&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;if (!strncmp(strLine.c_str(), "length: ", 8))&nbsp;//一直读下去直到判断澹澹(相对第五行)惺欠袷莑ength: 是则接下下去<br>&nbsp;&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sscanf(strLine.substr(8).c_str(), "%d", &amp;(iDocument.m_nLength));&nbsp;//将该块所代表网页的实际网页内容长度放入iDocument数据结构中<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;break;<br>&nbsp;&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;getline(ifs, strLine);&nbsp;//跳过相对第六行故意留的一个空行</p>
<p>&nbsp;&nbsp;&nbsp;iDocument.m_nDocId = cnt;&nbsp;// assign the document ID<br>&nbsp;&nbsp;&nbsp;iDocument.m_nPos = nOffset;&nbsp;// byte offset in the raw file where this record starts (equivalently, where the previous record ends)<br>&nbsp;&nbsp;&nbsp;char *pContent = new char[iDocument.m_nLength+1];&nbsp;// buffer sized to the document length</p>
<p>&nbsp;&nbsp;&nbsp;memset(pContent, 0, iDocument.m_nLength+1);&nbsp;// zero the buffer<br>&nbsp;&nbsp;&nbsp;ifs.read(pContent, iDocument.m_nLength);&nbsp;// read the document content (HTTP header included) using the length obtained above<br>&nbsp;&nbsp;&nbsp;iMD5.GenerateMD5( (unsigned char*)pContent, iDocument.m_nLength );<br>&nbsp;&nbsp;&nbsp;iDocument.m_sChecksum = iMD5.ToString();&nbsp;// convert the digest bytes to a string (implemented in Md5.h)<br>&nbsp;&nbsp;&nbsp;<br>&nbsp;&nbsp;&nbsp;delete[] pContent;<br>&nbsp;&nbsp;&nbsp;<br>&nbsp;&nbsp;&nbsp;ofsUrl &lt;&lt; iUrl.m_sChecksum ;&nbsp;// write the MD5 hash of the URL to Url.idx<br>&nbsp;&nbsp;&nbsp;ofsUrl &lt;&lt; "\t" &lt;&lt; iDocument.m_nDocId &lt;&lt; endl;&nbsp;// then, after a tab on the same line, the document ID</p>
<p>&nbsp;&nbsp;&nbsp;ofsDoc &lt;&lt; iDocument.m_nDocId ;&nbsp;// write the document ID to Doc.idx<br>&nbsp;&nbsp;&nbsp;ofsDoc &lt;&lt; "\t" &lt;&lt; iDocument.m_nPos ;&nbsp;// tab, then the record's start offset (also the end offset of the previous document)<br>&nbsp;&nbsp;&nbsp;//ofsDoc &lt;&lt; "\t" &lt;&lt; iDocument.m_nLength ;<br>&nbsp;&nbsp;&nbsp;ofsDoc &lt;&lt; "\t" &lt;&lt; iDocument.m_sChecksum &lt;&lt; endl;&nbsp;// tab, then the MD5 checksum of the document content</p>
<p>&nbsp;&nbsp;&nbsp;ofsDocId2Url &lt;&lt; iDocument.m_nDocId ;&nbsp;// write the document ID to DocId2Url.idx<br>&nbsp;&nbsp;&nbsp;ofsDocId2Url &lt;&lt; "\t" &lt;&lt; iUrl.m_sUrl &lt;&lt; endl;&nbsp;// tab, then the document's full URL</p>
<p>&nbsp;&nbsp;&nbsp;cnt++;&nbsp;// this document is done; advance to the next document ID<br>&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;nOffset = ifs.tellg();</p>
<p>&nbsp;}</p>
<p>&nbsp;// the final line holds only the next document ID and the end offset of the last document<br>&nbsp;ofsDoc &lt;&lt; cnt ; <br>&nbsp;ofsDoc &lt;&lt; "\t" &lt;&lt; nOffset &lt;&lt; endl;</p>
<p><br>&nbsp;return(0);<br>}</p>
<p>//DocIndex.cpp end-----------------------------------------------------------</p>
<p>author:http://hi.baidu.com/jrckkyy </p>
<p>author:http://blog.csdn.net/jrckkyy </p>
<p>&nbsp;</p>
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/07/30/2739710.aspx"></a>&nbsp;</p>
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102945.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 23:00 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102945.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines Top-Down: Analysis and Full Annotation of PKU's Tianwang Search Engine (TSE) [5] Building the Inverted Index and an Overview of the Files</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102943.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 14:55:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102943.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102943.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102943.html#Feedback</comments><slash:comments>1</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102943.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102943.html</trackback:ping><description><![CDATA[<p>Sorry to keep everyone waiting. I was busy with exams for a while, and they are finally over. Enough preamble; let's get started.</p>
<p>TSE stores all crawled pages in one large file and then builds its indexes over that file as a whole. The process consists of the following steps.</p>
<p>1.&nbsp; The document index (Doc.idx) keeps information about each document. It is a fixed-width ISAM (indexed sequential access method) index, ordered by docID. The information stored in each entry includes a pointer into the repository, a document length, and a document checksum.</p>
<p>&nbsp;</p>
<p>//Doc.idx&nbsp; columns: docID, byte offset of the record in the raw file, MD5 checksum of the content</p>
<p>0&nbsp;0&nbsp;bc9ce846d7987c4534f53d423380ba70</p>
<p>1&nbsp;76760&nbsp;4f47a3cad91f7d35f4bb6b2a638420e5</p>
<p>2&nbsp;141624&nbsp;d019433008538f65329ae8e39b86026c</p>
<p>3&nbsp;142350&nbsp;5705b8f58110f9ad61b1321c52605795</p>
<p>//Doc.idx&nbsp;end</p>
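<p>Because DocIndex.cpp writes each record's start offset rather than its length, and appends a final line that carries only the next docID and the end offset, the size of any record can be recovered by subtracting neighbouring offsets. A minimal sketch, assuming the tab-separated layout shown above; DocEntry, LoadDocIdx and RecordSize are hypothetical names, not part of TSE:</p>

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One line of Doc.idx. The final sentinel line has no checksum.
struct DocEntry {
    int docId;
    long offset;
    std::string checksum;
};

// Parses Doc.idx lines of the form "docID<TAB>offset[<TAB>checksum]".
std::vector<DocEntry> LoadDocIdx(std::istream& in) {
    std::vector<DocEntry> entries;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        DocEntry e{};
        if (!(fields >> e.docId >> e.offset)) continue;  // skip malformed lines
        fields >> e.checksum;                             // stays empty on the sentinel line
        entries.push_back(e);
    }
    return entries;
}

// Size in bytes of record i in the raw file (record header plus content).
// Throws std::out_of_range if i is the sentinel entry or beyond.
long RecordSize(const std::vector<DocEntry>& entries, std::size_t i) {
    return entries.at(i + 1).offset - entries.at(i).offset;
}
```

<p>With the sample above, record 0 spans 76760 bytes and record 1 spans 141624 - 76760 = 64864 bytes. Note that this is the size of the whole record in the raw file, which is larger than the length: field because it includes the record's own header lines.</p>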
<p>&nbsp;</p>
<p>&nbsp; The url index (url.idx) is used to convert URLs into docIDs.</p>
<p>&nbsp;</p>
<p>//url.idx</p>
<p>5c36868a9c5117eadbda747cbdb0725f&nbsp;0</p>
<p>3272e136dd90263ee306a835c6c70d77&nbsp;1</p>
<p>6b8601bb3bb9ab80f868d549b5c5a5f3&nbsp;2</p>
<p>3f9eba99fa788954b5ff7f35a5db6e1f&nbsp;3</p>
<p>//url.idx&nbsp;end</p>
<p>&nbsp;</p>
<p>It is a list of URL checksums with their corresponding docIDs and is sorted by checksum. In order to find the docID of a particular URL, the URL's checksum is computed and a binary search is performed on the checksums file to find its docID.</p>
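<p>The lookup described above can be sketched with std::lower_bound over the sorted (checksum, docID) pairs. This is an illustration under the assumption that the index has already been loaded into memory and sorted by checksum; UrlEntry and FindDocId are hypothetical names:</p>

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// (hex MD5 checksum of the URL, docID), sorted ascending by checksum.
using UrlEntry = std::pair<std::string, int>;

// Binary search for a checksum; returns the docID, or -1 if it is absent.
int FindDocId(const std::vector<UrlEntry>& sortedIndex, const std::string& checksum) {
    auto it = std::lower_bound(
        sortedIndex.begin(), sortedIndex.end(), checksum,
        [](const UrlEntry& entry, const std::string& key) { return entry.first < key; });
    if (it != sortedIndex.end() && it->first == checksum) return it->second;
    return -1;
}
```

<p>The search only works on sorted input, which is why the pipeline sorts Url.idx before it is used for lookups.</p>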
<p>&nbsp;</p>
<p>&nbsp;./DocIndex</p>
<p>&nbsp;&nbsp;got Doc.idx, Url.idx, DocId2Url.idx&nbsp;// these are the Doc.idx, Url.idx and DocId2Url.idx files in the Data folder</p>
<p>&nbsp;</p>
<p>//DocId2Url.idx</p>
<p>0&nbsp;<a href="http://*.*.edu.cn/index.aspx">http://*.*.edu.cn/index.aspx</a></p>
<p>1&nbsp;<a href="http://*.*.edu.cn/showcontent1.jsp?NewsID=118">http://*.*.edu.cn/showcontent1.jsp?NewsID=118</a></p>
<p>2&nbsp;<a href="http://*.*.edu.cn/0102.html">http://*.*.edu.cn/0102.html</a></p>
<p>3&nbsp;<a href="http://*.*.edu.cn/0103.html">http://*.*.edu.cn/0103.html</a></p>
<p>//DocId2Url.idx&nbsp;end</p>
<p>&nbsp;</p>
<p>2.&nbsp; sort Url.idx|uniq &gt; Url.idx.sort_uniq&nbsp;// this is Url.idx.sort_uniq in the Data folder</p>
<p>&nbsp;</p>
<p>//Url.idx.sort_uniq</p>
<p>// sorted by hash value</p>
<p>000bfdfd8b2dedd926b58ba00d40986b&nbsp;1111</p>
<p>000c7e34b653b5135a2361c6818e48dc&nbsp;1831</p>
<p>0019d12f438eec910a06a606f570fde8&nbsp;366</p>
<p>0033f7c005ec776f67f496cd8bc4ae0d&nbsp;2103</p>
<p>&nbsp;</p>
<p>3. Segment documents into terms (finding each document by its URL)</p>
<p>&nbsp;./DocSegment Tianwang.raw.2559638448&nbsp;&nbsp;// Tianwang.raw.2559638448 is the crawled file; every page in it includes its HTTP header</p>
<p>&nbsp;&nbsp;got Tianwang.raw.2559638448.seg&nbsp;&nbsp;</p>
<p>&nbsp;</p>
<p>//Tianwang.raw.2559638448&nbsp;the raw crawled page file; inside it, consecutive documents appear to be separated by the version line, &lt;/html&gt;, and a newline</p>
<p>version: 1.0</p>
<p>url: <a href="http://***.105.138.175/Default2.asp?lang=gb">http://***.105.138.175/Default2.asp?lang=gb</a></p>
<p>origin: <a href="http://***.105.138.175/">http://***.105.138.175/</a></p>
<p>date: Fri, 23 May 2008 20:01:36 GMT</p>
<p>ip: 162.105.138.175</p>
<p>length: 38413</p>
<p>&nbsp;</p>
<p>HTTP/1.1 200 OK</p>
<p>Server: Microsoft-IIS/5.0</p>
<p>Date: Fri, 23 May 2008 11:17:49 GMT</p>
<p>Connection: keep-alive</p>
<p>Connection: Keep-Alive</p>
<p>Content-Length: 38088</p>
<p>Content-Type: text/html; Charset=gb2312</p>
<p>Expires: Fri, 23 May 2008 11:17:49 GMT</p>
<p>Set-Cookie: ASPSESSIONIDSSTRDCAB=IMEOMBIAIPDFCKPAEDJFHOIH; path=/</p>
<p>Cache-control: private</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"</p>
<p>"<a href="http://www.w3.org/TR/html4/loose.dtd">http://www.w3.org/TR/html4/loose.dtd</a>"&gt;</p>
<p>&lt;html&gt;</p>
<p>&lt;head&gt;</p>
<p>&lt;title&gt;Apabi数字资源平台&lt;/title&gt;</p>
<p>&lt;meta http-equiv="Content-Type" content="text/html; charset=gb2312"&gt;</p>
<p>&lt;META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW"&gt;</p>
<p>&lt;META NAME="DESCRIPTION" CONTENT="数字图书馆 方正数字图书馆 电子图书 电子书 ebook e书 Apabi 数字资源平台"&gt;</p>
<p>&lt;link rel="stylesheet" type="text/css" href="css\common.css"&gt;</p>
<p>&nbsp;</p>
<p>&lt;style type="text/css"&gt;</p>
<p>&lt;!--</p>
<p>.style4 {color: #666666}</p>
<p>--&gt;</p>
<p>&lt;/style&gt;</p>
<p>&nbsp;</p>
<p>&lt;script LANGUAGE="vbscript"&gt;</p>
<p>...</p>
<p>&lt;/script&gt;</p>
<p>&nbsp;</p>
<p>&lt;Script Language="javascript"&gt;</p>
<p>...</p>
<p>&lt;/Script&gt;</p>
<p>&lt;/head&gt;</p>
<p>&lt;body leftmargin="0" topmargin="0"&gt;</p>
<p>&lt;/body&gt;</p>
<p>&lt;/html&gt;</p>
<p>//Tianwang.raw.2559638448&nbsp;end</p>
<p>&nbsp;</p>
<p>//Tianwang.raw.2559638448.seg&nbsp;each page is reduced to a single line, as below (note that there are no newlines inside a page's line)</p>
<p>1</p>
<p>...</p>
<p>...</p>
<p>...</p>
<p>2</p>
<p>...</p>
<p>...</p>
<p>...</p>
<p>//Tianwang.raw.2559638448.seg&nbsp;end</p>
<p>&nbsp;</p>
<p>// The steps below are not essential for Tiny search</p>
<p>4. Create forward index (docid--&gt;termid)&nbsp;&nbsp;// build the forward index</p>
<p>&nbsp;./CrtForwardIdx Tianwang.raw.2559638448.seg &gt; moon.fidx</p>
<p>&nbsp;</p>
<p>//Tianwang.raw.2559638448.seg: each page is reduced to a single line, as below<br>//segmented terms&nbsp;&nbsp; DocID<br>1<br>三星/&nbsp; s/&nbsp; 手机/&nbsp; 论坛/&nbsp; ,/&nbsp; 手机/&nbsp; 铃声/&nbsp; 下载/&nbsp; ,/&nbsp; 手机/&nbsp; 图片/&nbsp; 下载/&nbsp; ,/&nbsp; 手机/<br>2<br>...<br>...<br>...<br>//Tianwang.raw.2559638448.seg end</p>
<p>//moon.fidx<br>// every term segmented out of a document, paired with that document's ID: term&nbsp; DocID<br>都会&nbsp; 2391<br>使&nbsp; 2391<br>那些&nbsp; 2391<br>拥有&nbsp; 2391<br>它&nbsp; 2391<br>的&nbsp; 2391<br>人&nbsp; 2391<br>的&nbsp; 2391<br>视野&nbsp; 2391<br>变&nbsp; 2391<br>窄&nbsp; 2391<br>在&nbsp; 2180<br>研究生部&nbsp; 2180<br>主页&nbsp; 2180<br>培养&nbsp; 2180<br>管理&nbsp; 2180<br>栏目&nbsp; 2180<br>下载&nbsp; 2180<br>）&nbsp; 2180<br>、&nbsp; 2180<br>关于&nbsp; 2180<br>做好&nbsp; 2180<br>年&nbsp; 2180<br>国家&nbsp; 2180<br>公派&nbsp; 2180<br>研究生&nbsp; 2180<br>项目&nbsp; 2180<br>//moon.fidx end</p>
<p>5.# set | grep "LANG"<br>LANG=en; export LANG;<br>sort moon.fidx &gt; moon.fidx.sort</p>
<p>6. Create inverted index (termid--&gt;docid)&nbsp;&nbsp;// build the inverted index<br>&nbsp;./CrtInvertedIdx moon.fidx.sort &gt; sun.iidx</p>
<p>//sun.iidx&nbsp; // the file shrinks to roughly half the size<br>花工&nbsp; 236<br>花海&nbsp; 2103<br>花卉&nbsp; 1018 1061 1061 1061 1730 1730 1730 1730 1730 1852 949 949<br>花蕾&nbsp; 447 447<br>花木&nbsp; 1061<br>花呢&nbsp; 1430<br>花期&nbsp; 447 447 447 447 447 525<br>花钱&nbsp; 174 236<br>花色&nbsp; 1730 1730<br>花色品种&nbsp; 1660<br>花生&nbsp; 450 526<br>花式&nbsp; 1428 1430 1430 1430<br>花纹&nbsp; 1430 1430<br>花序&nbsp; 447 447 447 447 447 450<br>花絮&nbsp; 136 137<br>花芽&nbsp; 450 450<br>//sun.iidx&nbsp; end</p>
<p>TSESearch&nbsp;&nbsp; CGI program for query<br>Snapshot&nbsp;&nbsp;&nbsp; CGI program for page snapshot</p>
<p>author:http://hi.baidu.com/jrckkyy<br>author:http://blog.csdn.net/jrckkyy</p>
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/07/05/2614693.aspx"></a>&nbsp;</p>
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102943.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 22:55 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102943.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines Top-Down: Analysis and Full Annotation of PKU's Tianwang Search Engine (TSE) [4] Summary</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102942.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 14:54:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102942.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102942.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102942.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102942.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102942.html</trackback:ping><description><![CDATA[<p>After the previous three articles you should have an intuitive picture of how a search engine works. Much like an ordinary server-side script written in PHP or a similar language, it takes the keywords submitted from the front end, segments them with a dictionary, runs a relevance analysis against an inverted index built in advance, and formats the matching results for output. The technical difficulties lie in:</p>
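<p>1. Choosing the dictionary (language habits differ across eras and regions, so the right minimal units for the dictionary differ as well)</p>
<p>2. Building the inverted index (this involves both crawling and index construction, which later articles cover in detail; the bottlenecks for a search engine's efficiency, service quality, and freshness are here)</p>
<p>3. Relevance analysis (the segmentation used when indexing the crawled documents must match the segmentation applied to the user's keywords)</p>
<p>Later articles focus on crawling and index construction.</p>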
<img src ="http://www.cppblog.com/jrckkyy/aggbug/102942.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 22:54 <a href="http://www.cppblog.com/jrckkyy/archive/2009/12/10/102942.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Learning Search Engines Top-Down: Analysis and Full Annotation of PKU's Tianwang Search Engine (TSE) [3] Keyword Segmentation and the Relevance Analysis Code</title><link>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102941.html</link><dc:creator>学者站在巨人的肩膀上</dc:creator><author>学者站在巨人的肩膀上</author><pubDate>Thu, 10 Dec 2009 14:53:00 GMT</pubDate><guid>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102941.html</guid><wfw:comment>http://www.cppblog.com/jrckkyy/comments/102941.html</wfw:comment><comments>http://www.cppblog.com/jrckkyy/archive/2009/12/10/102941.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/jrckkyy/comments/commentRss/102941.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/jrckkyy/services/trackbacks/102941.html</trackback:ping><description><![CDATA[<p>From the earlier annotations we know that once the query keywords and the dictionary file are ready, the program moves on to segmenting the user's keywords.</p>
<p>// In TSESearch.cpp:</p>
<p>&nbsp;CHzSeg iHzSeg;&nbsp;&nbsp;//include ChSeg/HzSeg.h</p>
<p>&nbsp;//<br>&nbsp;iQuery.m_sSegQuery = iHzSeg.SegmentSentenceMM(iDict, iQuery.m_sQuery);&nbsp;// segment the query string received via GET into the form "我/&nbsp;&nbsp;爱/&nbsp;&nbsp;你们/&nbsp;的/&nbsp;&nbsp;格式"<br>&nbsp;<br>&nbsp;vector&lt;string&gt; vecTerm;<br>&nbsp;iQuery.ParseQuery(vecTerm);&nbsp;&nbsp;// push the "/"-separated keywords into a vector, in order<br>&nbsp;<br>&nbsp;set&lt;string&gt; setRelevantRst; <br>&nbsp;iQuery.GetRelevantRst(vecTerm, mapBuckets, setRelevantRst); <br>&nbsp;<br>&nbsp;gettimeofday(&amp;end_tv,&amp;tz);<br>&nbsp;// search end</p>
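<p>The ParseQuery step above turns the segmented string into a list of terms. A minimal sketch of that splitting follows. SplitTerms is a hypothetical helper, not the TSE implementation, and the default separator (a slash followed by two spaces) is an assumption based on how the segmented samples in this series are printed.</p>

```cpp
#include <string>
#include <vector>

// Splits a segmented query such as "a/  bc/  d/  " into {"a", "bc", "d"}.
// Text after the last separator is ignored, because every term produced by
// the segmenter is expected to be followed by the separator.
std::vector<std::string> SplitTerms(const std::string& segmented,
                                    const std::string& separator = "/  ") {
    std::vector<std::string> terms;
    std::string::size_type start = 0;
    std::string::size_type idx;
    while ((idx = segmented.find(separator, start)) != std::string::npos) {
        terms.push_back(segmented.substr(start, idx - start));
        start = idx + separator.size();
    }
    return terms;
}
```

<p>Tracking a start position avoids repeatedly copying the remainder of the string, while the resulting vector keeps the terms in query order, which is what the relevance step receives.</p>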
<p>Now look at this method of CHzSeg:</p>
<p>//ChSeg/HzSeg.h</p>
<p>/**<br>&nbsp;* 程序翻译说明<br>&nbsp;* 进一步净化数据，转换汉字<br>&nbsp;* @access&nbsp; public<br>&nbsp;* @param&nbsp;&nbsp; CDict, string 参数的汉字说明:字典，查询字符串<br>&nbsp;* @return&nbsp; string 0<br>&nbsp;*/<br>// process a sentence before segmentation<br>//在分词前处理句子<br>string CHzSeg::SegmentSentenceMM (CDict &amp;dict, string s1) const<br>{<br>&nbsp;string s2="";<br>&nbsp;unsigned int i,len;</p>
<p>&nbsp;while (!s1.empty()) <br>&nbsp;{<br>&nbsp;&nbsp;unsigned char ch=(unsigned char) s1[0];<br>&nbsp;&nbsp;if(ch&lt;128) <br>&nbsp;&nbsp;{ // deal with ASCII<br>&nbsp;&nbsp;&nbsp;i=1;<br>&nbsp;&nbsp;&nbsp;len = s1.size();<br>&nbsp;&nbsp;&nbsp;while (i=161)<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &amp;&amp; (!((unsigned char)s1[i]==161 &amp;&amp; ((unsigned char)s1[i+1]&gt;=162 &amp;&amp; (unsigned char)s1[i+1]&lt;=168)))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &amp;&amp; (!((unsigned char)s1[i]==161 &amp;&amp; ((unsigned char)s1[i+1]&gt;=171 &amp;&amp; (unsigned char)s1[i+1]&lt;=191)))<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &amp;&amp; (!((unsigned char)s1[i]==163 &amp;&amp; ((unsigned char)s1[i+1]==172 || (unsigned char)s1[i+1]==161) <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; || (unsigned char)s1[i+1]==168 || (unsigned char)s1[i+1]==169 || (unsigned char)s1[i+1]==186<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; || (unsigned char)s1[i+1]==187 || (unsigned char)s1[i+1]==191))) <br>&nbsp;&nbsp;&nbsp;&nbsp;{ <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;i=i+2; // 假定没有半个汉字<br>&nbsp;&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;if (i==0) i=i+2;</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;// skip the Chinese (full-width) space<br>&nbsp;&nbsp;&nbsp;&nbsp;if (!(ch==161 &amp;&amp; (unsigned char)s1[1]==161)) <br>&nbsp;&nbsp;&nbsp;&nbsp;{ <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (i &lt;= s1.size())&nbsp;// yhf<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// other non-Hanzi double-byte characters may be output as one run<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;s2 += s1.substr(0, i) + SEPARATOR; <br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else break; // yhf<br>&nbsp;&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;if (i &lt;= s1.size())&nbsp;// yhf<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;s1=s1.substr(i);<br>&nbsp;&nbsp;&nbsp;&nbsp;else break;&nbsp;&nbsp;//yhf</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;continue;<br>&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp; </p>
<p>&nbsp;&nbsp;&nbsp; // handle the run of Chinese characters below</p>
<p>&nbsp;&nbsp;i = 2;<br>&nbsp;&nbsp;len = s1.length();</p>
<p>&nbsp;&nbsp;while(i&lt;len &amp;&amp; (unsigned char)s1[i]&gt;=176) <br>//&nbsp;&nbsp;&nbsp; while(i&lt;len &amp;&amp; (unsigned char)s1[i]&gt;=128 &amp;&amp; (unsigned char)s1[i]!=161)<br>&nbsp;&nbsp;&nbsp;i+=2;</p>
<p>&nbsp;&nbsp;s2+=SegmentHzStrMM(dict, s1.substr(0,i));</p>
<p>&nbsp;&nbsp;if (i &lt;= len)&nbsp;// yhf<br>&nbsp;&nbsp;&nbsp;s1=s1.substr(i);<br>&nbsp;&nbsp;else break;&nbsp;// yhf<br>&nbsp;}</p>
<p>&nbsp;return s2;<br>}</p>
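<p>The byte tests in SegmentSentenceMM (<code>ch&lt;128</code>, <code>&gt;=161</code>, <code>&gt;=176</code>) can be read as a three-way classification of the lead byte of a GB2312-style string. The following is only a minimal sketch of that idea, not the TSE code: the names <code>ByteClass</code>, <code>classifyLeadByte</code> and <code>splitRuns</code> are made up for illustration, and it assumes, like the listing, that the input contains no half characters.</p>

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical names; the thresholds mirror the comparisons in the listing.
enum class ByteClass { Ascii, Symbol, Hanzi };

ByteClass classifyLeadByte(unsigned char ch) {
    if (ch < 128) return ByteClass::Ascii;   // single-byte ASCII
    if (ch < 176) return ByteClass::Symbol;  // double-byte punctuation and symbols
    return ByteClass::Hanzi;                 // double-byte Chinese character
}

// Splits a byte string into maximal runs whose lead bytes share one class.
std::vector<std::string> splitRuns(const std::string& s) {
    std::vector<std::string> runs;
    std::size_t i = 0;
    while (i < s.size()) {
        const ByteClass cls = classifyLeadByte(static_cast<unsigned char>(s[i]));
        std::size_t j = i;
        while (j < s.size() &&
               classifyLeadByte(static_cast<unsigned char>(s[j])) == cls) {
            // ASCII advances one byte, double-byte characters advance two
            j += (cls == ByteClass::Ascii) ? 1 : 2;
        }
        if (j > s.size()) j = s.size();  // guard against a truncated last character
        runs.push_back(s.substr(i, j - i));
        i = j;
    }
    return runs;
}
```

<p>Each Hanzi run returned by such a splitter is what the listing hands to SegmentHzStrMM; the ASCII and symbol runs are appended to the output directly.</p>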
<p>//Query.cpp</p>
<p>/**<br>&nbsp;* Annotation notes<br>&nbsp;* Pushes the keywords, which are separated by "/", into a vector one by one, in order<br>&nbsp;*<br>&nbsp;* @access&nbsp; public<br>&nbsp;* @param&nbsp;&nbsp; vector&lt;string&gt;&nbsp; the vector that receives the terms<br>&nbsp;* @return&nbsp; void<br>&nbsp;*/<br>void CQuery::ParseQuery(vector&lt;string&gt; &amp;vecTerm)<br>{<br>&nbsp;string::size_type idx; <br>&nbsp;while ( (idx = m_sSegQuery.find("/&nbsp; ")) != string::npos ) { <br>&nbsp;&nbsp;vecTerm.push_back(m_sSegQuery.substr(0,idx)); <br>&nbsp;&nbsp;m_sSegQuery = m_sSegQuery.substr(idx+3); <br>&nbsp;}<br>}</p>
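<p>ParseQuery shortens m_sSegQuery with substr after every term, which copies the remaining tail each time. As a rough sketch of the same split without those copies, one can keep a start offset instead. The function name <code>splitTerms</code> is hypothetical, and the separator is assumed to be a slash followed by two spaces, which is what the <code>idx+3</code> in the listing suggests.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: collects every term that is followed by `sep`.
std::vector<std::string> splitTerms(const std::string& segQuery,
                                    const std::string& sep = "/  ") {
    assert(!sep.empty());  // an empty separator would never advance
    std::vector<std::string> terms;
    std::string::size_type start = 0;
    std::string::size_type idx;
    while ((idx = segQuery.find(sep, start)) != std::string::npos) {
        terms.push_back(segQuery.substr(start, idx - start));
        start = idx + sep.size();  // continue right after the separator
    }
    // Like ParseQuery, text after the last separator is not added.
    return terms;
}
```

<p>With the segmenter output <code>"search/  engine/  "</code> this yields the two terms <code>search</code> and <code>engine</code>, and the input string itself is left unmodified.</p>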
<p>/**<br>&nbsp;* Annotation notes<br>&nbsp;* Relevance query: builds the result set setRelevantRst&nbsp;// this is the bottleneck<br>&nbsp;*<br>&nbsp;* @access&nbsp; public<br>&nbsp;* @param&nbsp;&nbsp; vector&lt;string&gt; map set&lt;string&gt;&nbsp; the segmented terms of the user's query, the inverted-index map, the set of relevant results<br>&nbsp;* @return&nbsp; bool<br>&nbsp;*/<br>bool CQuery::GetRelevantRst<br>(<br>&nbsp;vector&lt;string&gt; &amp;vecTerm, <br>&nbsp;map&lt;string,string&gt; &amp;mapBuckets, <br>&nbsp;set&lt;string&gt; &amp;setRelevantRst<br>) const<br>{<br>&nbsp;set&lt;string&gt; setSRst;</p>
<p>&nbsp;bool bFirst=true;<br>&nbsp;vector&lt;string&gt;::iterator itTerm = vecTerm.begin();</p>
<p>&nbsp;for ( ; itTerm != vecTerm.end(); ++itTerm )<br>&nbsp;{</p>
<p>&nbsp;&nbsp;setSRst.clear();<br>&nbsp;&nbsp;copy(setRelevantRst.begin(), setRelevantRst.end(), inserter(setSRst,setSRst.begin()));</p>
<p>&nbsp;&nbsp;map&lt;string,int&gt; mapRstDoc;<br>&nbsp;&nbsp;string docid;<br>&nbsp;&nbsp;int doccnt;</p>
<p>&nbsp;&nbsp;map&lt;string,string&gt;::iterator itBuckets = mapBuckets.find(*itTerm);<br>&nbsp;&nbsp;if (itBuckets != mapBuckets.end())<br>&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;string strBucket = (*itBuckets).second;<br>&nbsp;&nbsp;&nbsp;string::size_type idx;<br>&nbsp;&nbsp;&nbsp;idx = strBucket.find_first_not_of(" ");<br>&nbsp;&nbsp;&nbsp;strBucket = strBucket.substr(idx);</p>
<p>&nbsp;&nbsp;&nbsp;while ( (idx = strBucket.find(" ")) != string::npos ) <br>&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;docid = strBucket.substr(0,idx);<br>&nbsp;&nbsp;&nbsp;&nbsp;doccnt = 0;</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;if (docid.empty()) { strBucket = strBucket.substr(idx+1); continue; }&nbsp;// skip the extra space first, otherwise the loop never advances</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;map&lt;string,int&gt;::iterator it = mapRstDoc.find(docid);<br>&nbsp;&nbsp;&nbsp;&nbsp;if ( it != mapRstDoc.end() )<br>&nbsp;&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;doccnt = (*it).second + 1;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mapRstDoc.erase(it);<br>&nbsp;&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;&nbsp;mapRstDoc.insert( pair&lt;string,int&gt;(docid,doccnt) );</p>
<p>&nbsp;&nbsp;&nbsp;&nbsp;strBucket = strBucket.substr(idx+1);<br>&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;// remember the last one<br>&nbsp;&nbsp;&nbsp;docid = strBucket;<br>&nbsp;&nbsp;&nbsp;doccnt = 0;<br>&nbsp;&nbsp;&nbsp;map&lt;string,int&gt;::iterator it = mapRstDoc.find(docid);<br>&nbsp;&nbsp;&nbsp;if ( it != mapRstDoc.end() )<br>&nbsp;&nbsp;&nbsp;{<br>&nbsp;&nbsp;&nbsp;&nbsp;doccnt = (*it).second + 1;<br>&nbsp;&nbsp;&nbsp;&nbsp;mapRstDoc.erase(it);<br>&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;&nbsp;mapRstDoc.insert( pair&lt;string,int&gt;(docid,doccnt) );<br>&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;// sort by term frequency<br>&nbsp;&nbsp;multimap&lt;int,string,greater&lt;int&gt; &gt; newRstDoc;<br>&nbsp;&nbsp;map&lt;string,int&gt;::iterator it0 = mapRstDoc.begin();<br>&nbsp;&nbsp;for ( ; it0 != mapRstDoc.end(); ++it0 ){<br>&nbsp;&nbsp;&nbsp;newRstDoc.insert( pair&lt;int,string&gt;((*it0).second,(*it0).first) );<br>&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;multimap&lt;int,string,greater&lt;int&gt; &gt;::iterator itNewRstDoc = newRstDoc.begin();<br>&nbsp;&nbsp;setRelevantRst.clear();<br>&nbsp;&nbsp;for ( ; itNewRstDoc != newRstDoc.end(); ++itNewRstDoc ){<br>&nbsp;&nbsp;&nbsp;string docid = (*itNewRstDoc).second;</p>
<p>&nbsp;&nbsp;&nbsp;if (bFirst==true) {<br>&nbsp;&nbsp;&nbsp;&nbsp;setRelevantRst.insert(docid);<br>&nbsp;&nbsp;&nbsp;&nbsp;continue;<br>&nbsp;&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;&nbsp;if ( setSRst.find(docid) != setSRst.end() ){&nbsp;<br>&nbsp;&nbsp;&nbsp;&nbsp;setRelevantRst.insert(docid);<br>&nbsp;&nbsp;&nbsp;}<br>&nbsp;&nbsp;}</p>
<p>&nbsp;&nbsp;//cout &lt;&lt; "setRelevantRst.size(): " &lt;&lt; setRelevantRst.size() &lt;&lt; "&lt;BR&gt;";<br>&nbsp;&nbsp;bFirst = false;<br>&nbsp;}<br>&nbsp;return true;<br>}</p>
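<p>Stripped of the bookkeeping, GetRelevantRst keeps only the documents that occur in the bucket of every query term, i.e. it intersects the posting lists. The sketch below shows that core step with <code>std::set_intersection</code>. It is an illustration under assumptions, not the TSE implementation: the names <code>Buckets</code>, <code>docsOf</code> and <code>relevantDocs</code> are invented here, and a bucket is assumed to be a space-separated string of document ids, as the parsing loop in the listing suggests.</p>

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Assumed layout: term -> "docid docid docid ..."
typedef std::map<std::string, std::string> Buckets;

// Hypothetical helper: parses one posting list into a set of document ids.
std::set<std::string> docsOf(const std::string& bucket) {
    std::set<std::string> ids;
    std::istringstream in(bucket);
    std::string id;
    while (in >> id) ids.insert(id);  // operator>> skips repeated spaces
    return ids;
}

// Returns the documents that appear in the bucket of every term.
std::set<std::string> relevantDocs(const std::vector<std::string>& terms,
                                   const Buckets& buckets) {
    std::set<std::string> result;
    bool first = true;
    for (std::vector<std::string>::const_iterator t = terms.begin();
         t != terms.end(); ++t) {
        std::set<std::string> docs;  // stays empty for an unknown term
        Buckets::const_iterator it = buckets.find(*t);
        if (it != buckets.end()) docs = docsOf(it->second);
        if (first) {
            result = docs;  // the first term seeds the result
            first = false;
        } else {
            std::set<std::string> both;
            std::set_intersection(result.begin(), result.end(),
                                  docs.begin(), docs.end(),
                                  std::inserter(both, both.begin()));
            result.swap(both);
        }
    }
    return result;
}
```

<p>Note that the listing also counts how often each document id occurs and orders the ids by that count in a multimap, but the ids are then inserted into a <code>set&lt;string&gt;</code>, which orders its elements by key, so as far as this listing shows the frequency order does not survive into setRelevantRst.</p>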
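<p>What comes next is the display step. Everything so far only processed data in order to obtain setRelevantRst, the set of query results. I won't go into detail here, because what follows is much like what you would do in a scripting language such as PHP: format the result set and print it.<br>//TSESearch.cpp</p>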
<p>// start displaying the results<br>&nbsp;&nbsp;&nbsp; CDisplayRst iDisplayRst;<br>&nbsp;&nbsp;&nbsp; iDisplayRst.ShowTop();<br>&nbsp;<br>&nbsp;&nbsp;&nbsp; float used_msec = (end_tv.tv_sec-begin_tv.tv_sec)*1000<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; +((float)(end_tv.tv_usec-begin_tv.tv_usec))/(float)1000;<br>&nbsp;<br>&nbsp;&nbsp;&nbsp; iDisplayRst.ShowMiddle(iQuery.m_sQuery,used_msec,<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; setRelevantRst.size(), iQuery.m_iStart);<br>&nbsp;<br>&nbsp;&nbsp;&nbsp; iDisplayRst.ShowBelow(vecTerm,setRelevantRst,vecDocIdx,iQuery.m_iStart);</p>
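<p>The used_msec expression combines a whole-second difference with a microsecond difference. A small sketch of the same arithmetic as a stand-alone function is given below. It assumes a POSIX system and that begin_tv and end_tv are <code>struct timeval</code> values filled by <code>gettimeofday</code>, which the listing does not show; the name <code>elapsedMsec</code> is hypothetical.</p>

```cpp
#include <sys/time.h>  // POSIX: struct timeval, gettimeofday

// Hypothetical helper: milliseconds between two gettimeofday() samples.
double elapsedMsec(const timeval& begin, const timeval& end) {
    // tv_usec in `end` may be smaller than in `begin`; the signed
    // difference is then negative and corrects the whole-second term.
    const double sec  = static_cast<double>(end.tv_sec - begin.tv_sec);
    const double usec = static_cast<double>(end.tv_usec - begin.tv_usec);
    return sec * 1000.0 + usec / 1000.0;
}
```

<p>For example, a start at 1 s 900000 µs and an end at 2 s 100000 µs gives 1000 ms minus 800 ms, i.e. 200 ms. In use, one would call <code>gettimeofday(&amp;begin_tv, 0)</code> before the query and <code>gettimeofday(&amp;end_tv, 0)</code> after it.</p>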
<p><a href="http://blog.csdn.net/jrckkyy/archive/2008/06/03/2508524.aspx"></a>&nbsp;</p>
<div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/jrckkyy/" target="_blank">学者站在巨人的肩膀上</a> 2009-12-10 22:53</div>]]></description></item></channel></rss>