brent's hut

render whole html page

Ask:
Hi,
I'm trying to render the contents of an HTML page hosted in a web
browser control so that I can save the result as an image.

There are a number of ways to do this. For example, you can use any of
IHTMLElementRender, IViewObject or WM_PRINT. The problem with these
methods is that you only receive the portion of the page's content
that is visible in the web browser. If the page is bigger than
the dimensions of the web browser, the hidden or "scrollable" parts
will not get rendered by any of the above methods.

You could resize the browser to fit the entire page, but that is not
feasible while a user is using the browser. A second
approach is to load the page into a second, hidden browser which, again,
is resized to fit the entire page. The problem with this approach is
that I can't load the document again, since its appearance could
change. I want to render exactly what's in the user's browser. AFAIK
there is no easy way to exactly clone an MSHTML document.

Does anyone (Igor?) have any clues how these "hidden" areas could be
rendered? Any help much appreciated.

Regards,
Christoffer
 
Answer:
I took a couple of ideas from Code Project and pieced them together to
do just this. I render the client area into a small bitmap, blit it
into a final, larger (page-size) bitmap and scroll the control to get at
another area. This was simpler than trying to get it to render directly
into the correct area of the page-size bitmap.

This was written for a browser control that was NOT seen by the user so
I didn't care where the final scroll position was.

1. From the IHTMLDocument2 interface call get_body to get the
IHTMLElement interface.
2. Get the IHTMLElement2 interface (pBody2 in the code).
3. Call get_scrollHeight and get_scrollWidth, get_clientWidth,
get_clientHeight.
4. Get the IHTMLElementRender interface (pRenderer in the code).
5. Create a bitmap the size of the client area and select it into a
clientDC.
6. Create a bitmap the size of the scroll area and select it into a
pageDC.
7. Use some code like the following to scroll and render the page:

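// Assumes pBody2, pRenderer, clientDC, pageDC, hr and the four sizes
// (scrollWidth, scrollHeight, clientWidth, clientHeight) from steps 1-6.
long x = scrollWidth;    // deliberately past the end; IE clamps it
long lastX = -1;
bool doneX = false;
while (!doneX)
{
    // Request column x, then read back where IE actually scrolled to.
    pBody2->put_scrollLeft(x);
    pBody2->get_scrollLeft(&x);
    if (-1 == lastX)
        lastX = x + clientWidth;     // first pass: right edge of the page
    long y = scrollHeight;
    long lastY = -1;
    bool doneY = false;
    while (!doneY)
    {
        pBody2->put_scrollTop(y);
        pBody2->get_scrollTop(&y);
        if (-1 == lastY)
            lastY = y + clientHeight;    // first pass: bottom edge of the page
        // Render the visible area, then copy it (skipping the 2-pixel
        // edge) to its position in the page-size bitmap.
        hr = pRenderer->DrawToDC(clientDC);
        BitBlt(pageDC, x, y, lastX - x, lastY - y, clientDC, 2, 2, SRCCOPY);
        doneY = (0 == y);
        lastY = y;
        y -= clientHeight - 4;    // step up by one tile, less 2 pixels per edge
    }
    doneX = (0 == x);
    lastX = x;
    x -= clientWidth - 4;    // step left by one tile, less 2 pixels per edge
}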

The pageDC bitmap now holds the full image of the page and you can save
it or do anything else you want. You'll note that what I'm doing is
trying to scroll too far and letting IE clamp the scroll position. Also
note that I clip a 2-pixel edge when blitting. This value comes from
actual testing and accounts for the border of the control.
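
To see which scroll offsets the loop above visits along one axis, here is
a small portable sketch. It is only a simulation: clampScroll and
tileOffsets are hypothetical names, and clampScroll assumes the browser
limits a requested offset to the range 0 .. (scroll size - client size),
which is what the put_scrollLeft/get_scrollLeft pair is relied on for.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for put_scrollLeft followed by get_scrollLeft:
// assumes the browser clamps the request to [0, scrollExtent - clientExtent].
long clampScroll(long requested, long scrollExtent, long clientExtent) {
    long maxOffset = std::max(0L, scrollExtent - clientExtent);
    return std::min(std::max(requested, 0L), maxOffset);
}

// Returns the offsets visited along one axis, from the far edge back to 0.
// `overlap` is the total number of pixels dropped per tile (2 per edge).
std::vector<long> tileOffsets(long scrollExtent, long clientExtent,
                              long overlap = 4) {
    std::vector<long> offsets;
    if (clientExtent <= overlap)
        return offsets;               // the step would never make progress
    long pos = scrollExtent;          // ask for too much on purpose
    for (;;) {
        pos = clampScroll(pos, scrollExtent, clientExtent);
        offsets.push_back(pos);
        if (pos == 0)
            break;                    // reached the top/left edge
        pos -= clientExtent - overlap;
    }
    return offsets;
}
```

For a 1000-pixel page in a 400-pixel client area this visits 600, 204
and 0. The guard for a client area of 4 pixels or less is an addition of
the sketch; the original loop has no such check.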

Hope this helps,

David Stidolph
Austin, TX

Ask:
Thanks for the solution, but it will unfortunately not work in my case
because the control is used by a user and can therefore not be scrolled
programmatically.

My focus now is on trying to clone the MSHTML document and then load
the clone in another, hidden browser control which I can resize to fit
the entire page and then do the rendering. The way I'm currently
cloning the document is by saving the HTML to disk and then replacing all
references (images, .js, .css, ...) in the HTML document with local
ones, which I have downloaded from the originating server (will do this
as a last resort), taken from the cache or (in the cases where it's
possible) copied directly from the MSHTML document. When that is done I
have a local copy which I can browse to. Does anyone know of a better way
to clone an MSHTML document?
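
The reference-replacement step described above could look roughly like
the following sketch. It is not the poster's code: rewriteReferences and
localCopies are hypothetical names, only double-quoted src and href
attributes are handled, and a regular expression is a simplification
that a real implementation would likely replace with a proper HTML
parser or the MSHTML DOM.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>

// Rewrites src="..." and href="..." values that have an entry in
// `localCopies` (remote URL -> local path). Everything else is copied
// through unchanged.
std::string rewriteReferences(
    const std::string& html,
    const std::map<std::string, std::string>& localCopies) {
    static const std::regex attr(R"re((src|href)\s*=\s*"([^"]*)")re",
                                 std::regex::icase);
    std::string out;
    std::size_t last = 0;   // end of the previous match in `html`
    for (std::sregex_iterator it(html.begin(), html.end(), attr), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        const std::size_t pos = static_cast<std::size_t>(m.position(0));
        out.append(html, last, pos - last);      // text before the match
        auto found = localCopies.find(m[2].str());
        if (found != localCopies.end())
            out += m[1].str() + "=\"" + found->second + "\"";
        else
            out += m.str(0);                     // no local copy: keep as-is
        last = pos + static_cast<std::size_t>(m.length(0));
    }
    out.append(html, last, std::string::npos);   // trailing text
    return out;
}
```

References inside stylesheets or scripts (for example url(...) in CSS)
are not covered by this sketch and would need their own handling.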

Answer:
How about using a LockWindowUpdate call before the snapshot and
then unlocking it after? That way, the user never knows the WB
scrolled.
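
A minimal sketch of that suggestion, assuming an RAII wrapper so the
window is always unlocked again. ScopedWindowUpdateLock and
captureWhileLocked are hypothetical names; LockWindowUpdate is the real
Win32 call, and passing a null handle releases the lock. Whether
DrawToDC still renders correctly while the window is locked has not been
verified here. The non-Windows branch exists only so the sketch compiles
elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins so the sketch also compiles off Windows; they do nothing real.
using HWND = void*;
using BOOL = int;
inline BOOL LockWindowUpdate(HWND) { return 1; }
#endif

// Hypothetical helper: locks drawing in one window and always unlocks on
// scope exit, even if the capture code returns early or throws.
class ScopedWindowUpdateLock {
public:
    explicit ScopedWindowUpdateLock(HWND hwnd)
        : locked_(LockWindowUpdate(hwnd) != 0) {}
    ~ScopedWindowUpdateLock() {
        if (locked_)
            LockWindowUpdate(nullptr);   // a null handle releases the lock
    }
    ScopedWindowUpdateLock(const ScopedWindowUpdateLock&) = delete;
    ScopedWindowUpdateLock& operator=(const ScopedWindowUpdateLock&) = delete;
    bool locked() const { return locked_; }

private:
    bool locked_;
};

// Runs `capture` (for example the scroll-and-blit loop) while the window
// is locked. Returns false if the lock could not be taken.
template <typename CaptureFn>
bool captureWhileLocked(HWND hwnd, CaptureFn capture) {
    ScopedWindowUpdateLock lock(hwnd);
    if (!lock.locked())
        return false;   // only one window can be locked at a time
    capture();
    return true;
}
```

The capture code would also need to save scrollLeft and scrollTop
beforehand and restore them before the lock is released, otherwise the
user's view jumps once drawing resumes. Note that Microsoft's
documentation describes LockWindowUpdate as intended for drag feedback
rather than general redraw suppression and points to WM_SETREDRAW for
the latter, so treat this as an experiment rather than a guaranteed fix.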

posted on 2007-10-11 23:57 by brent. Category: C++Web

