C++博客 (C++ Blog) - 提琴协奏 - Posts by category - Design

C++ library series -- in an MFC multithreaded environment, how to safely quit worker threads started with AfxBeginThread

flagman — Sun, 11 Dec 2011 12:35:00 GMT

In an MFC environment, threads should normally be launched with AfxBeginThread so that they take part in MFC's multithreading mechanism, in which the framework uses data structures such as AFX_MODULE_STATE to maintain per-thread information. Everything works as long as the threads launched with AfxBeginThread quit before the main thread, which is responsible for initializing the C run-time. If the main thread quits before any thread launched with AfxBeginThread, however, the application crashes.

The crash occurs because _afxThreadData (CThreadSlotData* _afxThreadData, a global defined in AFXTLS.cpp) has already been destroyed: when the main thread quits, it invokes the functions that clean up global data structures, _afxThreadData among them.

Consequently, a careful developer should prepare for this case (the main thread quitting before a worker thread).

A reasonable solution is to ensure that all other threads quit before the main thread does.

.h file

/////////////////////////////////////////////////////////////////////////////
// CSafeEnterLeaveThread thread

class CSafeEnterLeaveThread : public CWinThread
{
    DECLARE_DYNCREATE(CSafeEnterLeaveThread)
protected:
    CSafeEnterLeaveThread(); // protected constructor used by dynamic creation

// Attributes
public:

// Operations
public:

// Overrides
    // ClassWizard generated virtual function overrides
    //{{AFX_VIRTUAL(CSafeEnterLeaveThread)
public:
    virtual BOOL InitInstance();
    virtual int ExitInstance();
    //}}AFX_VIRTUAL

// Implementation
protected:
    virtual ~CSafeEnterLeaveThread();

    // Generated message map functions
    //{{AFX_MSG(CSafeEnterLeaveThread)
    // NOTE - the ClassWizard will add and remove member functions here.
    //}}AFX_MSG

    DECLARE_MESSAGE_MAP()
};

.cpp file

/////////////////////////////////////////////////////////////////////////////
// CSafeEnterLeaveThread

IMPLEMENT_DYNCREATE(CSafeEnterLeaveThread, CWinThread)

CSafeEnterLeaveThread::CSafeEnterLeaveThread()
{
}

CSafeEnterLeaveThread::~CSafeEnterLeaveThread()
{
}

BOOL CSafeEnterLeaveThread::InitInstance()
{
    // Per-thread initialization: record this thread's handle so that the
    // main thread can wait for it before exiting.
    ASSERT(this->m_hThread);
    CMainApp::RegisterMFCThread(this->m_hThread);
    return TRUE;
}

int CSafeEnterLeaveThread::ExitInstance()
{
    // Per-thread cleanup: this thread is leaving, so drop its handle.
    ASSERT(this->m_hThread);
    CMainApp::UnRegisterMFCThread(this->m_hThread);
    return CWinThread::ExitInstance();
}

BEGIN_MESSAGE_MAP(CSafeEnterLeaveThread, CWinThread)
    //{{AFX_MSG_MAP(CSafeEnterLeaveThread)
    // NOTE - the ClassWizard will add and remove mapping macros here.
    //}}AFX_MSG_MAP
END_MESSAGE_MAP()

And in CMainApp, the application class:

#include <set>
#include <afxmt.h>   // CCriticalSection

std::set<HANDLE> g_ThreadHandleSet;
HANDLE g_ThreadHandleArray[MAXIMUM_WAIT_OBJECTS];
CCriticalSection g_csGlobalData;

// Called by the main thread before it exits: blocks until the registered
// worker threads have terminated.
void CMainApp::CheckAllOtherMFCThreadsLeave()
{
    // Copy the handles under the lock, then wait without holding it so that
    // exiting threads can still call UnRegisterMFCThread().
    g_csGlobalData.Lock();
    DWORD count = 0;
    std::set<HANDLE>::iterator it;
    for (it = g_ThreadHandleSet.begin();
         it != g_ThreadHandleSet.end() && count < MAXIMUM_WAIT_OBJECTS;
         ++it, ++count)
    {
        g_ThreadHandleArray[count] = *it;
    }
    g_csGlobalData.Unlock();

    if (count == 0) return;

    // WaitForMultipleObjects accepts at most MAXIMUM_WAIT_OBJECTS handles.
    ::WaitForMultipleObjects(count, g_ThreadHandleArray, TRUE, INFINITE);
}

void CMainApp::CleanupGlobalData()
{
    g_csGlobalData.Lock();
    g_ThreadHandleSet.clear();   // clear() removes the entries; empty() only tests
    g_csGlobalData.Unlock();
}

BOOL CMainApp::RegisterMFCThread(HANDLE hThread)
{
    if (hThread == NULL) return FALSE;

    g_csGlobalData.Lock();
    if (g_ThreadHandleSet.find(hThread) == g_ThreadHandleSet.end())
        g_ThreadHandleSet.insert(hThread);
    g_csGlobalData.Unlock();

    return TRUE;
}

void CMainApp::UnRegisterMFCThread(HANDLE hThread)
{
    if (hThread == NULL) return;

    g_csGlobalData.Lock();
    if (g_ThreadHandleSet.find(hThread) != g_ThreadHandleSet.end())
        g_ThreadHandleSet.erase(hThread);
    g_csGlobalData.Unlock();
}
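The rule behind this code (no worker may outlive the main thread) can also be sketched without MFC. The following is a minimal portable illustration using std::thread, not the implementation above; ThreadRegistry, Launch, JoinAll, and RunWorkersAndWait are hypothetical names introduced only for this sketch.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical registry: every worker is recorded when it is launched, and
// the main thread joins all of them before it is allowed to exit.
class ThreadRegistry {
public:
    // Start a worker and remember it.
    void Launch(std::function<void()> work) {
        std::lock_guard<std::mutex> lock(mutex_);
        workers_.emplace_back(std::move(work));
    }

    // Block until every registered worker has finished; returns how many
    // workers were joined.
    std::size_t JoinAll() {
        std::vector<std::thread> pending;
        {
            // Take the list under the lock, then join without holding it.
            std::lock_guard<std::mutex> lock(mutex_);
            pending.swap(workers_);
        }
        for (std::thread& t : pending) {
            if (t.joinable()) t.join();
        }
        return pending.size();
    }

private:
    std::mutex mutex_;
    std::vector<std::thread> workers_;
};

// Example run: launches n workers that each bump a counter, waits for all
// of them, and returns the final count.
int RunWorkersAndWait(int n) {
    ThreadRegistry registry;
    std::atomic<int> done{0};
    for (int i = 0; i < n; ++i) {
        registry.Launch([&done] { ++done; });
    }
    registry.JoinAll();   // the "main thread" waits here before exiting
    return done.load();
}
```

Joining a thread object waits for the whole thread to end, including its cleanup, which is the property the main thread needs before it tears down global state.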

flagman 2011-12-11 20:35

How does the operating system find the corresponding code from an HWND handle?

flagman — Mon, 04 Apr 2011 06:16:00 GMT

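[Someone wrote:]

: Say I have a class CMyButton, and I now hold a handle to it.
: How does the compiler find the CMyButton code from this handle?

[Someone else wrote:]
: This has nothing to do with the OS or the compiler; it is the library's doing.
: According to an article I read, MFC uses one big map; I have not verified this.
: A book on GDI implemented this mapping with the extra bytes in WNDCLASS.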

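In an MFC application, every MFC thread (it must be a thread started the MFC way) maintains a mapping between MFC objects and HWNDs. The whole MFC framework uses this mechanism to associate application-level C++ objects with system-level native window objects. Because the mapping is maintained per thread and the threads' maps are independent of one another, it is best to keep all UI-window work in a single thread, usually the process's main thread. Otherwise an MFC object and its HWND may fail to be associated, and problems of this kind are hard to spot.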
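The per-thread nature of this mapping can be illustrated with a small portable sketch. It is not MFC's code (MFC's actual handle map is more involved); FakeHwnd, Window, Attach, FromHandle, and FoundOnOtherThread are stand-ins assumed for illustration.

```cpp
#include <cstdint>
#include <thread>
#include <unordered_map>

// Stand-ins for illustration only: a fake window handle type and a C++
// wrapper object. Neither is the real HWND or CWnd.
using FakeHwnd = std::uintptr_t;
struct Window { int id; };

// One map per thread: entries added on one thread are invisible to every
// other thread.
thread_local std::unordered_map<FakeHwnd, Window*> t_handleMap;

void Attach(FakeHwnd h, Window* w) { t_handleMap[h] = w; }

// Returns the wrapper registered on the *calling* thread, or nullptr.
Window* FromHandle(FakeHwnd h) {
    auto it = t_handleMap.find(h);
    return it == t_handleMap.end() ? nullptr : it->second;
}

// Looks the handle up from a freshly started thread and reports whether it
// was found there.
bool FoundOnOtherThread(FakeHwnd h) {
    bool found = false;
    std::thread t([&found, h] { found = (FromHandle(h) != nullptr); });
    t.join();
    return found;
}
```

A handle attached on one thread resolves there but not on a second thread, which is the failure mode that keeping all UI work on one thread avoids.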

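As for the extra bytes field of the WNDCLASS structure: in the days before application frameworks, when programs were written directly against the Win32 API, it gave every window class (a "class" here is not a C++ class but a data structure used by Windows when a window is defined) room for some extra custom data. That space is typically used to hold user data related to the window class, usually a pointer to some memory region; when the program operates on a window belonging to that class, it can reach the associated custom data through this value (or pointer). Many later frameworks also use the extra bytes to store their own data structures associated with the window class. Judging by current trends, the chance of using WNDCLASS and its extra bytes directly is slim, but to do native development well it is still worth knowing many of the low-level implementation details, both for optimizing structure and performance and for debugging when things go wrong. Whether it is Winform/WPF, WTL, or cross-platform QT/WxWindows and other newer mechanisms, frameworks, or class libraries, anything built on the Windows platform rests on this most basic and most central foundation, the Win32 API.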

flagman 2011-04-04 14:16

CLR series -- Exploring SSCLI [1]

flagman — Mon, 13 Dec 2010 01:02:00 GMT

Fusion is one of the most important features of the CLI runtime implementation.

Within Fusion, or any other component or module, how is the execution engine instance retrieved, and how is the engine created?

UtilExecutionEngine is implemented as a COM object: it supports QueryInterface/AddRef/Release and is exposed via the IExecutionEngine interface.

With SELF_NO_HOST defined,
BYTE g_ExecutionEngineInstance[sizeof(UtilExecutionEngine)];
g_ExecutionEngineInstance holds the singleton instance of the current execution engine.

Otherwise, without SELF_NO_HOST, the 'sscoree' DLL is loaded and the runtime tries to get the exported function named 'IEE' from it. This DLL is the well-known shim; in the .NET CLR the corresponding module is named 'mscoree'. If 'IEE' cannot be found in that DLL, the runtime tries to locate another exported function, 'LoadLibraryShim', uses it to load the 'mscorwks' module, and then tries to locate the exported 'IEE' function there.
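A byte array sized with sizeof(T) is a common way to hold a singleton that is built in place on first use. The sketch below shows that general technique with placement new; it is an illustration under assumptions, not Rotor's source. Engine, g_engineStorage, and GetEngine are hypothetical names.

```cpp
#include <new>

// Hypothetical engine type standing in for UtilExecutionEngine.
struct Engine {
    int refCount = 0;
    int AddRef() { return ++refCount; }
};

// Raw static storage sized and aligned for one Engine. No constructor runs
// at program start; the object is built on first use.
alignas(Engine) static unsigned char g_engineStorage[sizeof(Engine)];
static Engine* g_engine = nullptr;

// Returns the process-wide instance, constructing it in place the first
// time. Not thread-safe; a real implementation would need synchronization.
Engine* GetEngine() {
    if (g_engine == nullptr) {
        g_engine = new (g_engineStorage) Engine();
    }
    return g_engine;
}
```

Because the storage is static and the object is never destroyed, the instance stays usable for the whole life of the process, with no dependence on static destruction order.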

Evidently Rotor implements its own execution engine but also leaves room for a third-party execution engine. The .NET CLR is a natural candidate: Rotor might load mscorwks.dll for this purpose.

PAL and PALAPI: HeapAlloc, for example, a well-known Win32 API, is implemented as a PALAPI (defined in Heap.c) so that CLI/Rotor can be ported smoothly to other operating systems such as FreeBSD and Mac OS.

CRT routines are also reimplemented; memcpy, for example, is implemented as GCSafeMemCpy.

Many functions contain macros such as SCAN_IGNORE_FAULT, STATIC_CONTRACT_NOTHROW, and STATIC_CONTRACT_NOTRIGGER. They exist so that static analysis tools can scan and analyse the code and figure out potential issues.

From the viewpoint of the CLI execution model, the act of compiling high-level type descriptions is separated from the act of turning those type descriptions into processor-specific code and memory structures (the step where the JIT comes in).

This execution model, in other words the well-known 'managed execution', defers the loading, verification, and compilation of components until the runtime really needs them. Type loading is the key trigger that engages the CLI tool chain at run time. Deferred compilation (which leads to JIT), linking, and loading give better portability across target platforms and readiness for version changes. The whole deferred process is driven by well-defined metadata and policy, which makes it a robust basis for building a virtual execution environment.

At the top of this CLI tool chain, Fusion is responsible for finding and binding the assemblies that an assembly references. Fusion also plays another important role, that of loader, and part of that functionality is implemented in the PEAssembly and ClassLoader classes, for example ClassLoader::LoadTypeHandleForTypeKey.

For types in the CLI virtual execution environment, Rotor defines four kinds of elements for internal use:
ELEMENT_TYPE_CLASS for ordinary classes and generic instantiations (including value types);
ELEMENT_TYPE_ARRAY and ELEMENT_TYPE_SZARRAY for array types;
ELEMENT_TYPE_PTR and ELEMENT_TYPE_BYREF for pointer types;
ELEMENT_TYPE_FNPTR for function pointer types.

Every type is assigned a unique ulong-typed token, which is used to look the type up in m_TypeDefToMethodTableMap (a linear mapping from TypeDef token to MethodTable *) maintained by the current module. If an entry is found there, the pointer to the type's method table is retrieved; otherwise the lookup goes to the loader module, where the method table should exist when the type is JIT-loaded rather than launched from an NGEN image.
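The two-step lookup can be sketched as follows. This is a simplified illustration rather than Rotor's code: Module, Lookup, and ResolveType are hypothetical, and the sketch assumes the usual metadata token layout in which the low 24 bits of a TypeDef token (0x02000000 | RID) are the row index.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct MethodTable { const char* name; };
using TypeToken = std::uint32_t;

// Hypothetical module holding a linear (array-indexed) token map.
struct Module {
    // Indexed by the row id (RID) part of the token; nullptr = not loaded.
    std::vector<MethodTable*> typeDefToMethodTable;

    MethodTable* Lookup(TypeToken token) const {
        std::size_t rid = token & 0x00FFFFFFu;   // strip the table-type byte
        return rid < typeDefToMethodTable.size() ? typeDefToMethodTable[rid]
                                                 : nullptr;
    }
};

// Try the current module first, then fall back to the loader module.
MethodTable* ResolveType(const Module& current, const Module& loader,
                         TypeToken token) {
    if (MethodTable* mt = current.Lookup(token)) return mt;
    return loader.Lookup(token);
}
```

An array indexed by RID keeps the common case to one bounds check and one load, which is presumably why a linear map is used instead of a hash table.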

All unresolved types are kept in a hash table, PendingTypeLoadTable. At run time, only the types that are actually needed, together with their dependencies such as parent types, are loaded; once loaded, such a type is complete and ready for execution, while the remaining unresolved types stay in that hash table.

flagman 2010-12-13 09:02

Why C++ class member functions were not designed to be "all virtual" as in Java

flagman — Mon, 13 Dec 2010 00:57:00 GMT

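The design of programming languages raises many interesting questions. For example, why were C++ class member functions not designed to be "all virtual", as in Java?

1) From the design of the language itself:
Efficiency was certainly one of the main concerns when C++ was designed. For example, to avoid unnecessary VTable overhead, ATL uses templates and static casts to simulate dynamic binding in its support for COM features. As for compatibility with C, the VTable is not much of a problem, because C can simulate it with an array of function pointers.

2) From the class hierarchies commonly seen in most applications:
Besides the interface set that the whole hierarchy exposes uniformly (that is, the virtual functions), each level of the hierarchy has a large number of other helper member functions (usually far more of them than virtual functions), and there is no need at all to make these virtual.

3) From other languages:
Even C#, a newer virtual-machine language (Java counts as an older one), defines rules for implementing, overriding, overloading, or newly introducing member methods that are stricter and more explicit than those of C++. This is an important reflection on the design thinking of both C++ and Java.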

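4) From where the language is used:
Our discussion almost always carries an important implicit premise: C++ is being used in user mode. Relax that constraint and use C++ in kernel mode, and the picture changes completely.
To cite the views of this document, http://www.microsoft.com/china/whdc/driver/kernel/KMcode.mspx
First, resources that are so cheap in user mode that they hardly need consideration are very expensive in the kernel; a kernel stack, for instance, is typically only 3 pages.

When paging is not possible in the kernel, all code and data about to be executed must be resident in physical memory, and keeping a few extra virtual tables and virtual table pointers resident is then quite expensive. In addition, the way the compiler generates code for virtual functions, templates, and so on makes it hard for developers to determine where all the code needed to execute a function is located, so they cannot directly control the sections in which that code is placed (personally, I think the placement of code and data might be controlled centrally through pragma directives such as code_seg/data_seg). The code may therefore already have been paged out when it is needed.

All language constructs involving class hierarchies, templates, exceptions, and the like may be unsafe in kernel mode. It is best to restrict the use of classes to POD classes. Returning to our topic of virtual functions, this means that class designs in kernel mode have no virtual functions.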
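The technique that point 1) attributes to ATL, simulating dynamic binding with templates and a static cast, is commonly known as the curiously recurring template pattern. A minimal sketch follows; the class names are hypothetical and this is not ATL code.

```cpp
// Static polymorphism: the base class template casts `this` to the derived
// type, so the call is resolved at compile time and no virtual table is
// needed.
template <typename Derived>
class ShapeBase {
public:
    int DescribeSides() const {
        return static_cast<const Derived*>(this)->Sides();
    }
};

class Triangle : public ShapeBase<Triangle> {
public:
    int Sides() const { return 3; }
};

class Square : public ShapeBase<Square> {
public:
    int Sides() const { return 4; }
};

// A plain virtual hierarchy for comparison: each object carries a hidden
// pointer to a virtual table.
class VirtualShape {
public:
    virtual ~VirtualShape() = default;
    virtual int Sides() const = 0;
};
```

The price of the static version is that Triangle and Square share no common base type, so they cannot be stored behind one pointer type and dispatched at run time.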

flagman 2010-12-13 08:57

A system cache problem: physical memory consumed far exceeds the physical memory processes actually occupy

flagman — Sat, 11 Dec 2010 03:19:00 GMT

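[Someone wrote:]
: A server runs Windows Server 2008 R2 with 16 GB of RAM installed and 16 GB of virtual memory configured. Recently, while running a large-scale computation program written in C#, I found that a large part of physical memory was being consumed for no apparent reason. Resource Monitor shows the program using less than 5 GB of physical memory, yet total physical memory consumption is close to 10 GB and only 6 GB remains available. As it runs ...
: No other program uses much memory. This program performs a lot of disk IO and calls GC.Collect() from time to time to clean up unused memory promptly. The programs used in this experiment all have basically the same structure and all call the GC from time to time, but the others use memory normally; only this program ends up occupying memory that is ... what it actually uses ...
: Why is so much extra memory consumed for no apparent reason? Thanks, everyone.

This is neither a bug in the application nor a memory leak in the system.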

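Resource Monitor currently reports several statistics about physical memory: Available, Cached, Total, and Installed. "Cached" is the amount of memory already used as data buffers by the file system, the network, and other subsystems, and it includes a huge number of data pages resident in physical memory. This consumption is not attributed to any process shown in the process list, which is why the following inequality holds:

(sum of the physical memory used by all processes in the process list) + (available physical memory) < (total physical memory)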

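Judging from the described behavior of this large-scale computation program, the cause can basically be put down to two points:
1) the application's large data set resident in physical memory gives the parser.exe process a huge working set;
2) heavy, frequent IO causes a large amount of physical memory to be taken by the system cache.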

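For 1), note that GC.Collect() does force a garbage collection, but objects with finalizers are not reclaimed until their finalizers have run, and the CLR decides dynamically when memory freed on the managed heap is handed back to the operating system. A call to GC.Collect() therefore does not necessarily shrink the process's working set right away.

So, to have managed memory released promptly, and physical memory released in turn so that the process's working set shrinks, one can use AppDomain, the .NET construct for partitioning, acquiring, and releasing resources, which is conceptually close to a lightweight process. The resources acquired in an AppDomain, including managed memory, the assemblies loaded into it, CCWs, and so on, are released promptly (or have their reference counts decremented) when the AppDomain is unloaded.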

For 2), revisit the existing design, implementation, and model, and consider whether scattered IO operations can be merged, for example changing
for(long i=0; i < Count; ++i)
{
...
objIO.Operation(Data[i], 1);
...
}
to
for(long i=0; i < Count; ++i)
{
...
...
}
objIO.Operation(Data, Count);
This should help improve the application's IO efficiency and the utilization of the system cache.
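The effect of merging calls can be made concrete with a small sketch that counts how often the underlying sink is hit. CountingSink is a hypothetical stand-in for objIO; a real device would also be affected by buffering, which this sketch ignores.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical IO sink that counts how many times it is called and how
// many bytes it receives.
struct CountingSink {
    std::size_t calls = 0;
    std::size_t bytes = 0;
    void Write(const char* data, std::size_t n) {
        (void)data;   // a real sink would hand the data to the device
        ++calls;
        bytes += n;
    }
};

// One call per element, like the first loop above.
void WriteOneByOne(CountingSink& sink, const std::vector<char>& data) {
    for (std::size_t i = 0; i < data.size(); ++i) sink.Write(&data[i], 1);
}

// One call for the whole buffer, like the second version above.
void WriteBatched(CountingSink& sink, const std::vector<char>& data) {
    if (!data.empty()) sink.Write(data.data(), data.size());
}
```

Both versions deliver the same bytes; only the number of calls differs, and each call that reaches the operating system carries a fixed cost.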

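Still on 2): the system cache grows steadily as the computation proceeds, until the system can no longer obtain physical memory and cannot continue to run. Even merging IO operations in the application code to reduce the number of IOs, as proposed above, probably will not fix the excessive amount of physical memory taken by the system cache. The problem is essentially one that has existed in Windows itself from the NT era until now, and it stems from how the Cache Manager and Memory Manager kernel-mode components of the Windows kernel are implemented.

Under the current implementation of Cc (short for Cache Manager; in the Windows Research Kernel sources, functions in the Cache Manager modules all carry the Cc prefix, such as CcCopyRead and CcFlushCache, and the Memory Manager is likewise abbreviated Mm), all file system access, local or over the network, is first cached by Cc, which maps the relevant pages. With frequent IO, the number of pages cached by Cc grows rapidly, while how much physical memory the cached pages occupy is decided by the Memory Manager in the Windows kernel. On 64-bit platforms the system cache can currently reach 1 TB, so it is entirely possible for 8 GB of cache to be allocated while this application runs. The trouble is that the system cache then takes too much physical memory, other processes and the kernel itself cannot obtain enough, and the system eventually "hangs".

For this problem Microsoft provides the "Microsoft Windows Dynamic Cache Service" tool to control the size of the system cache's working set (that is, how much of it is resident in physical memory). The tool is mainly a wrapper around SetSystemFileCacheSize and can set upper and lower limits on the system cache size.

But this is only a stopgap. Although the application can use the Dynamic Cache Service to set and limit the size of the system cache, it is very hard to decide what that size should be. Too small, and all IO performance suffers badly and Cc might as well not exist; too large (equivalent to unlimited), and the system cache again takes too much physical memory and the hang reappears.

Fundamentally, then, this problem should be solved by a complete and consistent adjustment across the whole Windows kernel, including Cc and Mm. Judging by the current implementation that would be a major change, and reportedly the improvement may be considered for release in Win7.

Microsoft Windows Dynamic Cache Service download: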
http://www.microsoft.com/downloads/en/details.aspx?FamilyID=e24ade0a-5efe-43c8-b9c3-5d0ecb2f39af&displaylang=en

An introduction to Microsoft Windows Dynamic Cache Service:
http://blogs.msdn.com/b/ntdebugging/archive/2009/02/06/microsoft-windows-dynamic-cache-service.aspx

flagman 2010-12-11 11:19

Thoughts on system API design

flagman — Wed, 01 Dec 2010 13:28:00 GMT

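I have recently been thinking about some of the considerations that go into system API design.

[A forum user wrote:]
: Is that address the same address, then? My current understanding is this. Suppose there is a huge amount of real memory. Windows first takes the upper 2 GB for itself, for the various kernel objects. This 2 GB is shared with every process, but processes cannot access it directly, only through functions that Windows provides.
: Each process is then given 2 GB of memory. Objects the process creates for itself go into its own 2 GB; kernel objects it creates go into the shared upper 2 GB.
: So if different processes could access the upper 2 GB, any process accessing the same high address would in fact be accessing the same object. But when accessing addresses in the lower 2 GB, different processes refer to different objects.

Asking, from different processes, for the actual address of the same kernel object (whether linear or physical) is meaningless: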

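First, kernel objects can be accessed directly only by routines running in kernel mode. A Windows API called in everyday code, such as CreateFile (note that the call starts out in user mode), generally has a corresponding native function or routine in ntdll.dll. The system then switches to kernel mode and calls the actual kernel function (NtCreateFile). Only at that point is the kernel object's actual address accessed; a Handle for the kernel object is then created for the current process and returned to the caller, and execution switches back to user mode. So a user-mode program needs to know, and can only know, the Handle that corresponds to the kernel object in the current process in order to operate on it.

Second, this design protects the OS's core data structures (including, of course, the kernel objects under discussion). If user-mode programs could easily obtain the actual addresses of kernel data structures, the security and stability of the whole OS would obviously be in serious trouble: one mistaken user-mode operation could easily crash the whole OS. With this layer of protection, only the current process crashes, not the whole system.

Following on from this point, the design of kernel objects also lets the OS itself evolve smoothly. From Windows 3.0 to 95/98, from NT to Win2k/XP, and on to today's Vista/Win7, Windows has changed and improved enormously and has adopted countless new techniques and methods. Yet its basic application programming interface, the Windows API we all know, has not changed much. A lot of program code from Win 3.0, a 16-bit OS, will run briskly on the latest OS with only slight modification, provided it was designed and coded properly in the first place. What made this all-important backward compatibility possible? In my view, it was abstracting the major functions of the operating system into kernel objects and exposing them through an extremely solid API.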

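This is object orientation at a higher level: it encapsulates implementation details and system complexity simply and elegantly. You just call CreateFile to create a file, a pipe, or a mailslot, without worrying whether the OS is Windows 3.0 or Win7; nor do you need to care whether the Handle you get back, and the kernel object it refers to, is the Windows 3.0 implementation or the Win7 one.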
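The handle idea can be sketched in a few lines. This is a simplified illustration, not how Windows implements its handle tables (which are per process and live in kernel memory); ObjectManager, Create, QueryName, and Close are hypothetical names.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical "kernel side": objects live here and never leave. Callers
// see only an opaque integer handle.
class ObjectManager {
public:
    using Handle = std::uint32_t;

    Handle Create(const std::string& name) {
        Handle h = next_++;
        table_[h] = std::make_unique<Object>(Object{name});
        return h;
    }

    // Every operation goes through the handle; a bad handle fails cleanly
    // instead of dereferencing a stray pointer.
    bool QueryName(Handle h, std::string* out) const {
        auto it = table_.find(h);
        if (it == table_.end()) return false;
        *out = it->second->name;
        return true;
    }

    bool Close(Handle h) { return table_.erase(h) == 1; }

private:
    // Private layout: it can change between versions without breaking
    // callers, because callers never see it.
    struct Object { std::string name; };
    std::unordered_map<Handle, std::unique_ptr<Object>> table_;
    Handle next_ = 4;   // arbitrary starting value
};
```

Because callers hold only integers, the Object structure can be reorganized freely, and using a stale handle yields an error return rather than memory corruption.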

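Almost everything exciting on Windows is built on this API, which is abstracted and exposed through the kernel object concept. COM/OLE, the stunning ABI and IPC specification of twenty years ago, roots many of its design ideas in the kernel object design: COM object reference counting and kernel object reference counting; IUnknown and the Windows Handle (the former points to a binary-compatible component object, the latter refers or indirectly points to a kernel object, and both are uniform abstract representations of some complex concept); and so on.

.NET, ten years ago, was originally introduced as an upgrade of COM, wrapping the implementation complexity of COM/OLE inside the virtual machine platform, the CLR. From SSCLI, the shared-source implementation of this virtual machine, we can see that a great deal of COM machinery plays a decisive role in the concrete implementation of .NET. Many symbols in these VMs carry a COR prefix or suffix. What does COR stand for? Common Object Runtime. So the design idea of CLR/SSCLI is likewise to present the OS in the form of a virtual machine and to expose functionality to applications through common objects.

To sum up:
OS kernel object API: system-level object abstraction, thirty years ago;
COM/OLE: binary-component-level object abstraction, twenty years ago;
.NET/CLR: virtual-machine-platform-level object abstraction, ten years ago.

Writing this prompted some other thoughts. The software industry has long been fervent about object orientation, especially at the language level, from C++/Java/C# to Python/JScript and beyond.

But have we truly taken object orientation to heart at the level of fundamental design philosophy?

If the task of designing this Windows API were put before us today, would we choose the kernel object/Handle approach, or expose functionality by pointing directly at the OS's internal data structures?

There really is a lot we can learn from the design of this thirty-year-old API.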

flagman 2010-12-01 21:28