﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-凤之焚的博客</title><link>http://www.cppblog.com/phenix-burn/</link><description>静者,无澜也.净者,无贪也.无贪无澜者,海纳百川也!</description><language>zh-cn</language><lastBuildDate>Tue, 14 Apr 2026 23:09:05 GMT</lastBuildDate><pubDate>Tue, 14 Apr 2026 23:09:05 GMT</pubDate><ttl>60</ttl><item><title>获得Frame或IFrame中的IHTMLDocumnet2接口</title><link>http://www.cppblog.com/phenix-burn/archive/2006/09/05/12059.html</link><dc:creator>凤之焚</dc:creator><author>凤之焚</author><pubDate>Tue, 05 Sep 2006 13:07:00 GMT</pubDate><guid>http://www.cppblog.com/phenix-burn/archive/2006/09/05/12059.html</guid><wfw:comment>http://www.cppblog.com/phenix-burn/comments/12059.html</wfw:comment><comments>http://www.cppblog.com/phenix-burn/archive/2006/09/05/12059.html#Feedback</comments><slash:comments>3</slash:comments><wfw:commentRss>http://www.cppblog.com/phenix-burn/comments/commentRss/12059.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/phenix-burn/services/trackbacks/12059.html</trackback:ping><description><![CDATA[&nbsp;&nbsp;&nbsp;&nbsp; 摘要: 一种获得Frame或IFrame中的IHTMLDocumnet2的方法&nbsp;&nbsp;<a href='http://www.cppblog.com/phenix-burn/archive/2006/09/05/12059.html'>阅读全文</a><img src ="http://www.cppblog.com/phenix-burn/aggbug/12059.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/phenix-burn/" target="_blank">凤之焚</a> 2006-09-05 21:07 <a href="http://www.cppblog.com/phenix-burn/archive/2006/09/05/12059.html#Feedback" target="_blank" 
style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>Filtering Web Page Source Code</title><link>http://www.cppblog.com/phenix-burn/archive/2006/08/29/11824.html</link><dc:creator>凤之焚</dc:creator><author>凤之焚</author><pubDate>Tue, 29 Aug 2006 08:43:00 GMT</pubDate><guid>http://www.cppblog.com/phenix-burn/archive/2006/08/29/11824.html</guid><wfw:comment>http://www.cppblog.com/phenix-burn/comments/11824.html</wfw:comment><comments>http://www.cppblog.com/phenix-burn/archive/2006/08/29/11824.html#Feedback</comments><slash:comments>2</slash:comments><wfw:commentRss>http://www.cppblog.com/phenix-burn/comments/commentRss/11824.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/phenix-burn/services/trackbacks/11824.html</trackback:ping><description><![CDATA[<p>
<strong>
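<font size="3">This example filters web page source code using the MIME filter technique. Parts of this article are excerpted from <a href="http://blog.csdn.net/lion_wing/articles/534716.aspx">《HTML代码过滤技术》 ("HTML Code Filtering Techniques")</a>.</font>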
</strong>
</p>
<span lang="EN-US">
<font face="Times New Roman">
<font size="3">
<span>
<span style="font-size: 10.5pt; font-family: 'Times New Roman';">
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>To filter HTML code, you must register one or more MIME filters (pluggable MIME filters). A MIME filter is a COM object that must implement the IInternetProtocolSink and IInternetProtocol interfaces.</div>
<div>
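<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>Before implementing the MIME filter object, look at how the article "Pluggable Protocols Overview" describes the interface calls between the MIME filter and the web handler (the transaction handler, i.e. urlmon.dll). Note that urlmon.dll implements the IInternetProtocol and IInternetProtocolSink interfaces internally:</div>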
<div>&nbsp;</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">1.&nbsp;The web handler calls the MIME filter's IInternetProtocolRoot::Start method (IInternetProtocol derives from IInternetProtocolRoot);</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">2.&nbsp;The web handler calls the MIME filter's IInternetProtocolSink::ReportProgress method and then its IInternetProtocolSink::ReportData method;</div>
<div style="margin: 0cm 0cm 0pt 69pt; text-indent: -27pt;">3.<span style="font-family: 'Times New Roman'; font-style: normal; font-variant: normal; font-weight: normal; font-size: 7pt; line-height: normal; font-size-adjust: none; font-stretch: normal;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>The MIME filter calls the web handler's IInternetProtocol::Read method;</div>
<div style="margin: 0cm 0cm 0pt 42pt;">4.&nbsp;The MIME filter calls the web handler's IInternetProtocolSink::ReportData method;</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">5.&nbsp;The web handler calls the MIME filter's IInternetProtocol::Read method.</div>
<div>&nbsp;</div>
<div>Several methods are therefore important when implementing a MIME filter:</div>
<div>1. The IInternetProtocolRoot::Start method:</div>
<div>HRESULT Start(</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] LPCWSTR szUrl,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] IInternetProtocolSink *pOIProtSink,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] IInternetBindInfo *pOIBindInfo,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] DWORD grfPI,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] DWORD dwReserved</span>
</div>
<div>);</div>
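<div>For a MIME filter object, szUrl carries the MIME type (for a name space handler, this parameter is instead the URL about to be downloaded or parsed). If you need the URL, you can obtain it through the pOIBindInfo interface, as in this example:</div>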
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; LPOLESTR pwzUrl ;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ULONG uElFetched ;</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pOIBindInfo-&gt;GetBindString( BINDSTRING_URL , &amp;pwzUrl , 1 , &amp;uElFetched ); </span></div>
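<div>pOIProtSink is the IInternetProtocolSink interface provided by urlmon.dll. It is called later in the processing, so it must be saved.</div>
<div>grfPI is a combination of enumeration flags and must include PI_FILTER_MODE, which indicates that the object is running in filter mode.</div>
<div>dwReserved is a pointer to a PROTOCOLFILTERDATA structure. Its pProtocol member is the IInternetProtocol interface provided by urlmon.dll, which is also called later and must be saved. In fact, this interface can also be obtained by calling QueryInterface on pOIProtSink. Likewise, the pProtocolSink member of PROTOCOLFILTERDATA and pOIProtSink point to the same interface.</div>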
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>In the Start method, all we really have to do is save the IInternetProtocolSink</div>
<div>and IInternetProtocol interfaces provided by urlmon.dll.</div>
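<div>The bookkeeping described for Start can be sketched in portable C++. This is only an illustration under stated assumptions: Protocol, ProtocolSink, FilterData, kFilterMode and the Status values are hypothetical stand-ins for IInternetProtocol, IInternetProtocolSink, PROTOCOLFILTERDATA, PI_FILTER_MODE and HRESULT codes, not the real urlmon.dll declarations. A real COM filter would also call AddRef on the pointers it keeps.</div>

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the urlmon.dll types; names and values are illustrative only.
struct Protocol {};      // plays the role of IInternetProtocol
struct ProtocolSink {};  // plays the role of IInternetProtocolSink
struct FilterData {      // plays the role of PROTOCOLFILTERDATA
    Protocol* pProtocol;
    ProtocolSink* pProtocolSink;
};
constexpr std::uint32_t kFilterMode = 0x2;  // stand-in for PI_FILTER_MODE
enum Status { kOk, kInvalidArg };           // stand-ins for HRESULT values

class MimeFilter {
public:
    // Mirrors the shape of Start: verify the filter-mode flag, then keep both
    // upstream interfaces for the later ReportData and Read calls.
    Status Start(ProtocolSink* sink, std::uint32_t grfPI, const FilterData* data) {
        if ((grfPI & kFilterMode) == 0 || sink == nullptr ||
            data == nullptr || data->pProtocol == nullptr) {
            return kInvalidArg;
        }
        sink_ = sink;                 // real COM code would AddRef here
        protocol_ = data->pProtocol;  // and here
        return kOk;
    }
    ProtocolSink* sink() const { return sink_; }
    Protocol* protocol() const { return protocol_; }

private:
    ProtocolSink* sink_ = nullptr;
    Protocol* protocol_ = nullptr;
};
```

<div>Rejecting a call without the filter-mode flag is a design choice of this sketch; the article only states that the flag must be present.</div>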
<div>&nbsp;</div>
<div>2. The IInternetProtocolSink::ReportProgress method:</div>
<div>HRESULT ReportProgress(</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] ULONG ulStatusCode,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] LPCWSTR szStatusText </span>);</div>
<div>For a MIME filter, ulStatusCode is usually BINDSTATUS_CACHEFILENAMEAVAILABLE. In that case szStatusText is the path of the temporary cache file. Some pages are not written to the cache, however, so szStatusText may be an empty string.</div>
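<div>A minimal sketch of how a filter might record the cache file name while tolerating an empty one. FilterState, OnReportProgress and kCacheFileNameAvailable are hypothetical names for illustration; the constant is a stand-in and does not carry the real value of BINDSTATUS_CACHEFILENAMEAVAILABLE.</div>

```cpp
#include <cassert>
#include <string>

// Stand-in for BINDSTATUS_CACHEFILENAMEAVAILABLE; the value is illustrative.
constexpr unsigned kCacheFileNameAvailable = 1;

// Hypothetical per-request state kept by the filter.
struct FilterState {
    std::wstring cacheFileName;  // stays empty when the page is not cached
};

// Returns true only when a usable cache file name was recorded.
bool OnReportProgress(FilterState& state, unsigned statusCode,
                      const wchar_t* statusText) {
    if (statusCode != kCacheFileNameAvailable) {
        return false;  // other notifications are ignored in this sketch
    }
    if (statusText == nullptr || *statusText == L'\0') {
        return false;  // uncached page: leave the name empty
    }
    state.cacheFileName = statusText;
    return true;
}
```

<div>Leaving the name empty lets later code detect the uncached case and create its own cache entry.</div>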
<div>&nbsp;</div>
<div>3. The IInternetProtocolSink::ReportData method:</div>
<div>HRESULT ReportData(</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] DWORD grfBSCF,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] ULONG ulProgress,</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp; [in] ULONG ulProgressMax</span>
</div>
<div>);</div>
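<div>IE calls the MIME filter's ReportData method while a file is downloading and when the download completes. ulProgressMax is the total amount of data in the file and ulProgress is the amount downloaded so far. In theory, ulProgress should equal ulProgressMax once the whole file has been downloaded (in practice, when the page is small, the whole file may already have arrived even though ulProgress does not equal ulProgressMax). The grfBSCF parameter also reflects the download status. The web handler sometimes calls ReportData more than once.</div>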
<div>
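<span>&nbsp;&nbsp;&nbsp; ReportData</span> is a suitable place to filter or modify the page content. Here you can call Read to save the page content into your own buffer or stream and process it as needed (be sure to check the character encoding).</div>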
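<div style="text-indent: 21pt;">Finally, do not forget to call the web handler's IInternetProtocolSink::ReportData method to report the download status. Once notified, the web handler calls the MIME filter's IInternetProtocol::Read, at which point you can hand over the modified data.</div>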
<div>
<span>&nbsp;&nbsp;&nbsp; </span>The following code shows how to call the web handler's Read from within ReportData to save the data in advance:</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CString Ts("");</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char p[1024];</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HRESULT hr;</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
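<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ULONG Readtotal = 0;</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; do</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; memset(p,0,sizeof(p));</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Readtotal = 0; // reset so a failed Read cannot leave a stale count</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hr = UrlMonProtocol-&gt;Read(p, sizeof(p)-1, &amp;Readtotal);</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // keep exactly the bytes Read reported, whatever hr is</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CString pTemp(p, (int)Readtotal);</span>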
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Ts=Ts+pTemp;</span>
</div>
<div style="margin: 0cm 0cm 0pt 42pt; text-indent: 21pt;">}while((hr != S_FALSE) &amp;&amp; (hr != INET_E_DOWNLOAD_FAILURE) &amp;&amp; (hr != INET_E_DATA_NOT_AVAILABLE));</div>
<div>&nbsp;</div>
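<div>When Read succeeds it generally returns only S_OK or S_FALSE: S_OK means more data is available and S_FALSE means all the data has been read, so in principle the loop only needs to continue while hr == S_OK. Why, then, does the loop keep the data from each call without first checking if( hr == S_OK || hr == S_FALSE )? Because I found that in some cases Read returns another value yet still delivers part of the data, and the size of that part is the value stored in Readtotal. If that data is dropped, the page cannot be parsed correctly.</div>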
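<div>The point about never dropping a partial read can be sketched in portable C++. This is an illustration under stated assumptions, not the urlmon.dll API: FakeUpstream is a hypothetical source that delivers its payload in chunks and can report a failure status on the last chunk while still returning bytes, and the ReadStatus values are stand-ins for S_OK, S_FALSE and an error code.</div>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

enum ReadStatus { kMore, kDone, kError };  // stand-ins for S_OK, S_FALSE, a failure

// Hypothetical upstream source; it reports `finalStatus` together with the
// last chunk, which may be kError even though bytes are still delivered.
class FakeUpstream {
public:
    FakeUpstream(std::string payload, ReadStatus finalStatus)
        : payload_(std::move(payload)), finalStatus_(finalStatus) {}

    ReadStatus Read(char* buf, std::size_t cap, std::size_t* got) {
        assert(buf != nullptr && got != nullptr);
        const std::size_t n = std::min(cap, payload_.size() - pos_);
        std::memcpy(buf, payload_.data() + pos_, n);
        pos_ += n;
        *got = n;
        return pos_ < payload_.size() ? kMore : finalStatus_;
    }

private:
    std::string payload_;
    ReadStatus finalStatus_;
    std::size_t pos_ = 0;
};

// Drains the source. Every call's bytes are appended by explicit length, so
// embedded NUL bytes survive and data that arrives with an error status is kept.
std::string DrainUpstream(FakeUpstream& upstream) {
    std::string all;
    char buf[8];
    ReadStatus status;
    do {
        std::size_t got = 0;
        status = upstream.Read(buf, sizeof(buf), &got);
        all.append(buf, got);
    } while (status == kMore);
    return all;
}
```

<div>Appending by length rather than treating the buffer as a C string is a choice of this sketch; it avoids truncating content at the first zero byte.</div>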
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>The following code creates a temporary file:</div>
<div style="text-indent: 21pt;">if (CacheFileName == "")</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; TCHAR FName[512];</span>
</div>
<div>CreateUrlCacheEntry(OLE2T(Url), Ts.GetLength(), _T("htm"), FName, 0);</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CFile hFile;</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hFile.Open(FName, CFile::modeCreate|CFile::modeWrite);</span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; hFile.Write(Ts,Ts.GetLength());&nbsp;</span>
</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ReportProgress(BINDSTATUS_CACHEFILENAMEAVAILABLE, T2W(FName));</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } </span>
</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>Modify the page source:</div>
<div style="margin: 0cm 0cm 0pt 42pt; text-indent: 21pt;">Ts.Replace(_T("百度"),_T("千度"));</div>
<div>
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>Prepare the data for the browser:</div>
<div style="margin: 0cm 0cm 0pt 42pt; text-indent: 21pt;">TotalSize= Ts.GetLength() ;</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CreateStreamOnHGlobal(0, true, &amp;DataStream);</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; const char * pTs = Ts.GetBuffer(Ts.GetLength());</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ULONG cbWritten;</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DataStream-&gt;Write(pTs,Ts.GetLength(),&amp;cbWritten);</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Ts.ReleaseBuffer();</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; pTs = NULL;</span>
</div>
<div style="text-indent: 21pt;">&nbsp;</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ULARGE_INTEGER Dummy;</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; _LARGE_INTEGER zero;</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; zero.QuadPart =0;</span>
</div>
<div style="text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DataStream-&gt;Seek ( zero, STREAM_SEEK_SET, &amp;Dummy);</span>
</div>
<div style="text-indent: 21pt;">&nbsp;</div>
<div>4. The IInternetProtocol::Read method</div>
<div>
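<span>&nbsp;&nbsp;&nbsp; </span>The web handler calls this method to obtain the data the browser will parse. In the previous method, ReportData,</div>
<div>we already buffered all the data in a stream, so here we only need to return the stream's data to the web handler.</div>
<div>The following code shows a simple implementation of Read:</div>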
<div>
<span>&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; DataStream-&gt;Read(pv, cb, pcbRead);</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Written+=*pcbRead;</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (Written == TotalSize)</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return S_FALSE;</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else </span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; {</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return S_OK;</span>
</div>
<div style="margin: 0cm 0cm 0pt 21pt; text-indent: 21pt;">
<span>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }</span>
</div>
<div>
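<span>&nbsp;&nbsp;&nbsp; </span>Be sure to return S_FALSE once all the data has been read; otherwise Read may be called in an endless loop. With these methods handled, the work is essentially done. The remaining methods are very simple to implement; refer to the example mentioned above.<span>&nbsp;<br><br><a href="http://search.download.csdn.net/search/mimefilter"><strong><font size="3">Download the source code</font></strong></a></span></div>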
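<div>The completion rule for Read can be sketched in portable C++ as follows. FilteredData is a hypothetical class that plays the role of the buffered stream together with the Written and TotalSize counters; kMore and kDone are stand-ins for S_OK and S_FALSE. As in the code above, the call that hands over the last bytes already reports completion.</div>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

enum ReadStatus { kMore, kDone };  // stand-ins for S_OK and S_FALSE

// Hypothetical holder for the already-filtered page bytes.
class FilteredData {
public:
    explicit FilteredData(std::string bytes) : bytes_(std::move(bytes)) {}

    // Copies up to `cap` bytes to the caller and reports kDone as soon as
    // everything has been handed over, so the caller stops asking.
    ReadStatus Read(char* out, std::size_t cap, std::size_t* got) {
        assert(out != nullptr && got != nullptr);
        const std::size_t n = std::min(cap, bytes_.size() - written_);
        std::memcpy(out, bytes_.data() + written_, n);
        written_ += n;
        *got = n;
        return written_ == bytes_.size() ? kDone : kMore;
    }

private:
    std::string bytes_;
    std::size_t written_ = 0;  // total bytes delivered so far
};
```

<div>Calling Read again after completion returns kDone with zero bytes, so a caller that polls once more still terminates.</div>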
</span>
</span>
</font>
</font>
</span> <img src ="http://www.cppblog.com/phenix-burn/aggbug/11824.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/phenix-burn/" target="_blank">凤之焚</a> 2006-08-29 16:43 <a href="http://www.cppblog.com/phenix-burn/archive/2006/08/29/11824.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>