
Building Hybrid Systems with Boost.Python


Author: David Abrahams
Contact: dave@boost-consulting.com
Organization: Boost Consulting
Date: 2003-03-19
Author: Ralf W. Grosse-Kunstleve
Copyright: Copyright David Abrahams and Ralf W. Grosse-Kunstleve 2003. All rights reserved
Translation: 王志勇, Slowness Chen, 金庆
Translation updated: 2008-05-29



Abstract


Boost.Python is an open source C++ library which provides a concise IDL-like interface for binding C++ classes and functions to Python. Leveraging the full power of C++ compile-time introspection and of recently developed metaprogramming techniques, this is achieved entirely in pure C++, without introducing a new syntax. Boost.Python's rich set of features and high-level interface make it possible to engineer packages from the ground up as hybrid systems, giving programmers easy and coherent access to both the efficient compile-time polymorphism of C++ and the extremely convenient run-time polymorphism of Python.




Introduction


Python and C++ are in many ways as different as two languages could be: while C++ is usually compiled to machine-code, Python is interpreted. Python's dynamic type system is often cited as the foundation of its flexibility, while in C++ static typing is the cornerstone of its efficiency. C++ has an intricate and difficult compile-time meta-language, while in Python, practically everything happens at runtime.


Yet for many programmers, these very differences mean that Python and C++ complement one another perfectly. Performance bottlenecks in Python programs can be rewritten in C++ for maximal speed, and authors of powerful C++ libraries choose Python as a middleware language for its flexible system integration capabilities. Furthermore, the surface differences mask some strong similarities:


  • 'C'-family control structures (if, while, for...)
  • Support for object-orientation, functional programming, and generic programming (these are both multi-paradigm programming languages).
  • Comprehensive operator overloading facilities, recognizing the importance of syntactic variability for readability and expressivity.
  • High-level concepts such as collections and iterators.
  • High-level encapsulation facilities (C++: namespaces, Python: modules) to support the design of re-usable libraries.
  • Exception-handling for effective management of error conditions.
  • C++ idioms in common use, such as handle/body classes and reference-counted smart pointers, mirror Python reference semantics.

Given Python's rich 'C' interoperability API, it should in principle be possible to expose C++ type and function interfaces to Python with an analogous interface to their C++ counterparts. However, the facilities provided by Python alone for integration with C++ are relatively meager. Compared to C++ and Python, 'C' has only very rudimentary abstraction facilities, and support for exception-handling is completely missing. 'C' extension module writers are required to manually manage Python reference counts, which is both annoyingly tedious and extremely error-prone. Traditional extension modules also tend to contain a great deal of boilerplate code repetition which makes them difficult to maintain, especially when wrapping an evolving API.


These limitations have led to the development of a variety of wrapping systems. SWIG is probably the most popular package for the integration of C/C++ and Python. A more recent development is SIP, which was specifically designed for interfacing Python with the Qt graphical user interface library. Both SWIG and SIP introduce their own specialized languages for customizing inter-language bindings. This has certain advantages, but having to deal with three different languages (Python, C/C++ and the interface language) also introduces practical and mental difficulties. The CXX package demonstrates an interesting alternative. It shows that at least some parts of Python's 'C' API can be wrapped and presented through a much more user-friendly C++ interface. However, unlike SWIG and SIP, CXX does not include support for wrapping C++ classes as new Python types.

The features and goals of Boost.Python overlap significantly with many of these other systems. That said, Boost.Python attempts to maximize convenience and flexibility without introducing a separate wrapping language. Instead, it presents the user with a high-level C++ interface for wrapping C++ classes and functions, managing much of the complexity behind-the-scenes with static metaprogramming. Boost.Python also goes beyond the scope of earlier systems by providing:


  • Support for C++ virtual functions that can be overridden in Python.
  • Comprehensive lifetime management facilities for low-level C++ pointers and references.
  • Support for organizing extensions as Python packages, with a central registry for inter-language type conversions.
  • A safe and convenient mechanism for tying into Python's powerful serialization engine (pickle).
  • Coherence with the rules for handling C++ lvalues and rvalues that can only come from a deep understanding of both the Python and C++ type systems.

The key insight that sparked the development of Boost.Python is that much of the boilerplate code in traditional extension modules could be eliminated using C++ compile-time introspection. Each argument of a wrapped C++ function must be extracted from a Python object using a procedure that depends on the argument type. Similarly the function's return type determines how the return value will be converted from C++ to Python. Of course argument and return types are part of each function's type, and this is exactly the source from which Boost.Python deduces most of the information required.
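As a loose analogy only, the Python sketch below picks argument and result conversions from a function's declared signature. It works at run time through annotations and inspect.signature, whereas Boost.Python makes the equivalent deduction at compile time from the C++ function type. The names wrap, converters and wrapped_greet are hypothetical and exist only for this illustration.

```python
import inspect

def wrap(func, converters):
    """Return a wrapper whose conversions are chosen from func's signature.

    `converters` is a hypothetical table mapping a declared type to a
    callable that converts an incoming object to that type.
    """
    signature = inspect.signature(func)
    # One extraction procedure per parameter, selected by its declared type.
    extractors = [converters[p.annotation] for p in signature.parameters.values()]
    # The declared return type selects the result conversion.
    convert_result = converters[signature.return_annotation]

    def wrapper(*args):
        if len(args) != len(extractors):
            raise TypeError("expected %d arguments, got %d"
                            % (len(extractors), len(args)))
        converted = [extract(arg) for extract, arg in zip(extractors, args)]
        return convert_result(func(*converted))

    return wrapper

def greet(x: int) -> str:
    msgs = ("hello", "Boost.Python", "world!")
    return msgs[x]

# Nothing about greet's parameters is repeated here; it is read from greet itself.
wrapped_greet = wrap(greet, {int: int, str: str})
```

The caller supplies only the conversion table; which converter applies to which argument is taken from the function being wrapped.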


This approach leads to user guided wrapping: as much information is extracted directly from the source code to be wrapped as is possible within the framework of pure C++, and some additional information is supplied explicitly by the user. Mostly the guidance is mechanical and little real intervention is required. Because the interface specification is written in the same full-featured language as the code being exposed, the user has unprecedented power available when she does need to take control.

Boost.Python Design Goals


The primary goal of Boost.Python is to allow users to expose C++ classes and functions to Python using nothing more than a C++ compiler. In broad strokes, the user experience should be one of directly manipulating C++ objects from Python.


However, it's also important not to translate all interfaces too literally: the idioms of each language must be respected. For example, though C++ and Python both have an iterator concept, they are expressed very differently. Boost.Python has to be able to bridge the interface gap.


It must be possible to insulate Python users from crashes resulting from trivial misuses of C++ interfaces, such as accessing already-deleted objects. By the same token the library should insulate C++ users from the low-level Python 'C' API, replacing error-prone 'C' interfaces like manual reference-count management and raw PyObject pointers with more-robust alternatives.


Support for component-based development is crucial, so that C++ types exposed in one extension module can be passed to functions exposed in another without loss of crucial information like C++ inheritance relationships.


Finally, all wrapping must be non-intrusive, without modifying or even seeing the original C++ source code. Existing C++ libraries have to be wrappable by third parties who only have access to header files and binaries.


Hello Boost.Python World


And now for a preview of Boost.Python, and how it improves on the raw facilities offered by Python. Here's a function we might want to expose:


#include <stdexcept>  // std::range_error

char const* greet(unsigned x) {
   static char const* const msgs[] = { "hello", "Boost.Python", "world!" };

   if (x > 2)
       throw std::range_error("greet: index out of range");

   return msgs[x];
}

To wrap this function in standard C++ using the Python 'C' API, we'd need something like this:

extern "C" { // all Python interactions use 'C' linkage and calling convention
    // Wrapper to handle argument/result conversion and checking.
    // A METH_VARARGS function receives (self, args).
    PyObject* greet_wrap(PyObject* self, PyObject* args) {
        int x;
        if (PyArg_ParseTuple(args, "i", &x)) {  // extract/check arguments
            char const* result = greet(x);      // invoke wrapped function
            return PyString_FromString(result); // convert result to Python
        }
        return 0;                               // error occurred
    }

    // Table of wrapped functions to be exposed by the module
    static PyMethodDef methods[] = {
        { "greet", greet_wrap, METH_VARARGS, "return one of 3 parts of a greeting" }
        , { NULL, NULL, 0, NULL } // sentinel
    };

    // module initialization function (Python 2 looks for init<module name>)
    DL_EXPORT(void) inithello() {
        (void) Py_InitModule("hello", methods); // add the methods to the module
    }
}

Now here's the wrapping code we'd use to expose it with Boost.Python:


#include <boost/python.hpp>
using namespace boost::python;
BOOST_PYTHON_MODULE(hello) {
    def("greet", greet, "return one of 3 parts of a greeting");
}

and here it is in action:


>>> import hello
>>> for x in range(3):
...     print hello.greet(x)
hello
Boost.Python
world!

Aside from the fact that the 'C' API version is much more verbose, it's worth noting a few things that it doesn't handle correctly:

  • The original function accepts an unsigned integer, and the Python 'C' API only gives us a way of extracting signed integers. The Boost.Python version will raise a Python exception if we try to pass a negative number to hello.greet, but the other one will proceed to do whatever the C++ implementation does when converting a negative integer to unsigned (usually wrapping to some very large number), and pass the incorrect translation on to the wrapped function.

  • That brings us to the second problem: if the C++ greet() function is called with a number greater than 2, it will throw an exception. Typically, if a C++ exception propagates across the boundary with code generated by a 'C' compiler, it will cause a crash. As you can see in the first version, there's no C++ scaffolding there to prevent this from happening. Functions wrapped by Boost.Python automatically include an exception-handling layer which protects Python users by translating unhandled C++ exceptions into a corresponding Python exception.

  • A slightly more-subtle limitation is that the argument conversion used in the Python 'C' API case can only get that integer x in one way. PyArg_ParseTuple can't convert Python long objects (arbitrary-precision integers) which happen to fit in an unsigned int but not in a signed long, nor will it ever handle a wrapped C++ class with a user-defined implicit operator unsigned int() conversion. Boost.Python's dynamic type conversion registry allows users to add arbitrary conversion methods.
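The first point can be illustrated without building an extension module. The Python sketch below contrasts an unchecked reinterpretation of a negative number as unsigned with a checked conversion that raises instead. It assumes a 32-bit unsigned int, and both function names are hypothetical.

```python
UINT_MAX = 0xFFFFFFFF  # assumption: unsigned int is 32 bits wide

def unchecked_to_unsigned(value):
    """Mimic a C-style signed-to-unsigned conversion: wrap modulo 2**32."""
    return value & UINT_MAX

def checked_to_unsigned(value):
    """Refuse values an unsigned int cannot hold, as a checked converter would."""
    if not 0 <= value <= UINT_MAX:
        raise OverflowError("%d does not fit in an unsigned int" % value)
    return value
```

With the unchecked version, -1 silently becomes 4294967295 before the wrapped function ever sees it; the checked version reports the problem at the language boundary.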

Library Overview


This section outlines some of the library's major features. Except as necessary to avoid confusion, details of library implementation are omitted.


Exposing Classes


C++ classes and structs are exposed with a similarly-terse interface. Given:


struct World {
    void set(std::string msg) { this->msg = msg; }
    std::string greet() { return msg; }
    std::string msg;
};

The following code will expose it in our extension module:


#include <boost/python.hpp>
BOOST_PYTHON_MODULE(hello) {
    class_<World>("World")
        .def("greet", &World::greet)
        .def("set", &World::set);
}

Although this code has a certain pythonic familiarity, people sometimes find the syntax a bit confusing because it doesn't look like most of the C++ code they're used to. All the same, this is just standard C++. Because of their flexible syntax and operator overloading, C++ and Python are great for defining domain-specific (sub)languages (DSLs), and that's what we've done in Boost.Python. To break it down:


class_<World>("World")

constructs an unnamed object of type class_<World> and passes "World" to its constructor. This creates a new-style Python class called World in the extension module, and associates it with the C++ type World in the Boost.Python type conversion registry. We might have also written:


class_<World> w("World");

but that would've been more verbose, since we'd have to name w again to invoke its def() member function:


w.def("greet", &World::greet)

There's nothing special about the location of the dot for member access in the original example: C++ allows any amount of whitespace on either side of a token, and placing the dot at the beginning of each line allows us to chain as many successive calls to member functions as we like with a uniform syntax. The other key fact that allows chaining is that class_<> member functions all return a reference to *this.


So the example is equivalent to:


class_<World> w("World");
w.def("greet", &World::greet);
w.def("set", &World::set);

It's occasionally useful to be able to break down the components of a Boost.Python class wrapper in this way, but the rest of this article will stick to the terse syntax.


For completeness, here's the wrapped class in use:


>>> import hello
>>> planet = hello.World()
>>> planet.set('howdy')
>>> planet.greet()
'howdy'



Constructors


Since our World class is just a plain struct, it has an implicit no-argument (nullary) constructor. Boost.Python exposes the nullary constructor by default, which is why we were able to write:


>>> planet = hello.World()

However, well-designed classes in any language may require constructor arguments in order to establish their invariants. Unlike Python, where __init__ is just a specially-named method, in C++ constructors cannot be handled like ordinary member functions. In particular, we can't take their address: &World::World is an error. The library provides a different interface for specifying constructors. Given:


struct World {
    World(std::string msg); // added constructor
    // ... other members as before
};

we can modify our wrapping code as follows:


class_<World>("World", init<std::string>())

Of course, a C++ class may have additional constructors, and we can expose those as well by passing more instances of init<...> to def():


class_<World>("World", init<std::string>())
    .def(init<double, double>())

Boost.Python allows wrapped functions, member functions, and constructors to be overloaded to mirror C++ overloading.


Data Members and Properties


Any publicly-accessible data members in a C++ class can be easily exposed as either readonly or readwrite attributes:


class_<World>("World", init<std::string>())
    .def_readonly("msg", &World::msg)

and can be used directly in Python:


>>> planet = hello.World('howdy')
>>> planet.msg
'howdy'

This does not result in adding attributes to the World instance __dict__, which can result in substantial memory savings when wrapping large data structures. In fact, no instance __dict__ will be created at all unless attributes are explicitly added from Python. Boost.Python owes this capability to the new Python 2.2 type system, in particular the descriptor interface and property type.
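The descriptor mechanism can be seen in pure Python as well. The sketch below, written for Python 3, combines a property (a descriptor stored on the class) with __slots__ so that instances carry no __dict__. The class is a hypothetical stand-in and makes no claim about how Boost.Python lays out wrapped objects internally.

```python
class World:
    __slots__ = ("_msg",)  # fixed storage; no per-instance __dict__

    def __init__(self, msg):
        self._msg = msg

    @property
    def msg(self):
        """Read-only attribute, resolved through a descriptor on the class."""
        return self._msg

planet = World("howdy")
```

Here planet.msg reads through the property object held by the class, assigning to planet.msg raises AttributeError, and the instance itself has no __dict__ to grow.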

In C++, publicly-accessible data members are considered a sign of poor design because they break encapsulation, and style guides usually dictate the use of "getter" and "setter" functions instead. In Python, however, __getattr__, __setattr__, and since 2.2, property mean that attribute access is just one more well-encapsulated syntactic tool at the programmer's disposal. Boost.Python bridges this idiomatic gap by making Python property creation directly available to users. If msg were private, we could still expose it as an attribute in Python as follows:

class_<World>("World", init<std::string>())
    .add_property("msg", &World::greet, &World::set)

The example above mirrors the familiar usage of properties in Python 2.2+:

>>> class World(object):
...     def __init__(self, msg):
...         self.__msg = msg
...     def greet(self):
...         return self.__msg
...     def set(self, msg):
...         self.__msg = msg
...     msg = property(greet, set)

Operator Overloading


The ability to write arithmetic operators for user-defined types has been a major factor in the success of both languages for numerical computation, and the success of packages like NumPy attests to the power of exposing operators in extension modules. Boost.Python provides a concise mechanism for wrapping operator overloads. The example below shows a fragment from a wrapper for the Boost rational number library:


class_<rational<int> >("rational_int")
  .def(init<int, int>()) // constructor, e.g. rational_int(3,4)
  .def("numerator", &rational<int>::numerator)
  .def("denominator", &rational<int>::denominator)
  .def(-self)        // __neg__ (unary minus)
  .def(self + self)  // __add__ (homogeneous)
  .def(self * self)  // __mul__
  .def(self + int()) // __add__ (heterogeneous)
  .def(int() + self) // __radd__

The magic is performed using a simplified application of "expression templates" [VELD1995], a technique originally developed for optimization of high-performance matrix algebra expressions. The essence is that instead of performing the computation immediately, operators are overloaded to construct a type representing the computation. In matrix algebra, dramatic optimizations are often available when the structure of an entire expression can be taken into account, rather than evaluating each operation "greedily". Boost.Python uses the same technique to build an appropriate Python method object based on expressions involving self.
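The idea of recording an operator instead of evaluating it can be mimicked in Python. The sketch below is a run-time analogy only: Python has no compile-time types to build, so the placeholder returns a small description object. OperatorSpec, SelfPlaceholder, self_, expose and the Rational wrapper are all hypothetical names, and Fraction merely stands in for an underlying value that already supports arithmetic.

```python
import operator
from fractions import Fraction

class OperatorSpec:
    """A recorded operator: the special-method name and how to compute it."""
    def __init__(self, name, compute):
        self.name = name
        self.compute = compute

class SelfPlaceholder:
    """Operators applied to this object are recorded, not evaluated."""
    def __neg__(self):
        return OperatorSpec("__neg__", operator.neg)
    def __add__(self, other):
        return OperatorSpec("__add__", operator.add)
    def __radd__(self, other):
        # Reflected form: the wrapped value is the right-hand operand.
        return OperatorSpec("__radd__", lambda right, left: left + right)

self_ = SelfPlaceholder()

class Rational:
    """Hypothetical wrapper around a value that already supports arithmetic."""
    def __init__(self, value):
        self.value = value

def expose(cls, spec):
    """Install the recorded operator on cls as a real special method."""
    def method(*operands):
        raw = [o.value if isinstance(o, cls) else o for o in operands]
        return cls(spec.compute(*raw))
    setattr(cls, spec.name, method)

expose(Rational, -self_)         # __neg__
expose(Rational, self_ + self_)  # __add__
expose(Rational, int() + self_)  # __radd__

half = Rational(Fraction(1, 2))
```

Each expression passed to expose is never computed as arithmetic; it only tells expose which special method to create, which is the role the self expressions play in the wrapper fragment above.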



Inheritance


C++ inheritance relationships can be represented to Boost.Python by adding an optional bases<...> argument to the class_<...> template parameter list as follows:


class_<Derived, bases<Base1,Base2> >("Derived")

This has two effects:


  1. When the class_<...> is created, Python type objects corresponding to Base1 and Base2 are looked up in Boost.Python's registry, and are used as bases for the new Python Derived type object, so methods exposed for the Python Base1 and Base2 types are automatically members of the Derived type. Because the registry is global, this works correctly even if Derived is exposed in a different module from either of its bases.
  2. C++ conversions from Derived to its bases are added to the Boost.Python registry. Thus wrapped C++ methods expecting (a pointer or reference to) an object of either base type can be called with an object wrapping a Derived instance. Wrapped member functions of class T are treated as though they have an implicit first argument of T&, so these conversions are necessary to allow the base class methods to be called for derived objects.

Of course it's possible to derive new Python classes from wrapped C++ class instances. Because Boost.Python uses the new-style class system, that works very much as for the Python built-in types. There is one significant detail in which it differs: the built-in types generally establish their invariants in their __new__ function, so that derived classes do not need to call __init__ on the base class before invoking its methods:


>>> class L(list):
...      def __init__(self):
...          pass
>>> L().reverse()

Because C++ object construction is a one-step operation, C++ instance data cannot be constructed until the arguments are available, in the __init__ function:


>>> class D(SomeBoostPythonClass):
...      def __init__(self):
...          pass
>>> D().some_boost_python_method()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: bad argument type for built-in operation

This happened because Boost.Python couldn't find instance data of type SomeBoostPythonClass within the D instance; D's __init__ function masked construction of the base class. It could be corrected by either removing D's __init__ function or having it call SomeBoostPythonClass.__init__(...) explicitly.
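The masking effect and the second remedy can be reproduced in pure Python. In the sketch below, Wrapped is a hypothetical stand-in for a wrapped class that builds its state in __init__; because it is an ordinary Python class, the failure shows up as AttributeError rather than the TypeError in the traceback above.

```python
class Wrapped:
    """Hypothetical stand-in for a class whose state is built in __init__."""
    def __init__(self):
        self._cpp_instance = "constructed"

    def method(self):
        # Fails with AttributeError if __init__ never ran.
        return self._cpp_instance

class Broken(Wrapped):
    def __init__(self):
        pass  # masks Wrapped.__init__, so no instance data is created

class Fixed(Wrapped):
    def __init__(self):
        Wrapped.__init__(self)  # construct the base part explicitly
```

Calling Fixed().method() succeeds, while Broken().method() fails because the base initializer never ran.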


Virtual Functions


Deriving new types in Python from extension classes is not very interesting unless they can be used polymorphically from C++. In other words, Python method implementations should appear to override the implementation of C++ virtual functions when called through base class pointers/references from C++. Since the only way to alter the behavior of a virtual function is to override it in a derived class, the user must build a special derived class to dispatch a polymorphic class' virtual functions:


// interface to wrap:
class Base {
public:
    virtual int f(std::string x) { return 42; }
    virtual ~Base();
};

// Takes a non-const reference because f is not a const member function
int calls_f(Base& b, std::string x) { return b.f(x); }

// Wrapping Code

// Dispatcher class
struct BaseWrap : Base {
    // Store a pointer to the Python object
    BaseWrap(PyObject* self_) : self(self_) {}
    PyObject* self;

    // Default implementation, for when f is not overridden
    int f_default(std::string x) { return this->Base::f(x); }
    // Dispatch implementation
    int f(std::string x) { return call_method<int>(self, "f", x); }
};

// Registration, inside the extension module's BOOST_PYTHON_MODULE body:
    def("calls_f", calls_f);
    class_<Base, BaseWrap>("Base")
        .def("f", &Base::f, &BaseWrap::f_default);

Now here's some Python code which demonstrates:


>>> class Derived(Base):
...     def f(self, s):
...          return len(s)
>>> calls_f(Base(), 'foo')
42
>>> calls_f(Derived(), 'forty-two')
9

Things to notice about the dispatcher class:


  • The key element which allows overriding in Python is the call_method invocation, which uses the same global type conversion registry as the C++ function wrapping does to convert its arguments from C++ to Python and its return type from Python to C++.
  • Any constructor signatures you wish to wrap must be replicated with an initial PyObject* argument.
  • The dispatcher must store this argument so that it can be used to invoke call_method.
  • The f_default member function is needed when the function being exposed is not pure virtual; there's no other way Base::f can be called on an object of type BaseWrap, since it overrides f.

Deeper Reflection on the Horizon?


Admittedly, this formula is tedious to repeat, especially on a project with many polymorphic classes. That it is necessary reflects some limitations in C++'s compile-time introspection capabilities: there's no way to enumerate the members of a class and find out which are virtual functions. At least one very promising project has been started to write a front-end which can generate these dispatchers (and other wrapping code) automatically from C++ headers.


Pyste is being developed by Bruno da Silva de Oliveira. It builds on GCC_XML, which generates an XML version of GCC's internal program representation. Since GCC is a highly-conformant C++ compiler, this ensures correct handling of the most-sophisticated template code and full access to the underlying type system. In keeping with the Boost.Python philosophy, a Pyste interface description is neither intrusive on the code being wrapped, nor expressed in some unfamiliar language: instead it is a 100% pure Python script. If Pyste is successful it will mark a move away from wrapping everything directly in C++ for many of our users. It will also allow us the choice to shift some of the metaprogram code from C++ to Python. We expect that soon, not only our users but the Boost.Python developers themselves will be "thinking hybrid" about their own code.

(Translator's note: Pyste is no longer maintained; the newer tool is Py++.)



Serialization


Serialization is the process of converting objects in memory to a form that can be stored on disk or sent over a network connection. The serialized object (most often a plain string) can be retrieved and converted back to the original object. A good serialization system will automatically convert entire object hierarchies. Python's standard pickle module is just such a system. It leverages the language's strong runtime introspection facilities for serializing practically arbitrary user-defined objects. With a few simple and unintrusive provisions this powerful machinery can be extended to also work for wrapped C++ objects. Here is an example:


#include <string>

struct World {
    World(std::string a_msg) : msg(a_msg) {}
    std::string greet() const { return msg; }
    std::string msg;
};

#include <boost/python.hpp>
using namespace boost::python;

struct World_picklers : pickle_suite {
  static tuple
  getinitargs(World const& w) { return make_tuple(w.greet()); }
};

BOOST_PYTHON_MODULE(hello) {
    class_<World>("World", init<std::string>())
        .def("greet", &World::greet)
        .def_pickle(World_picklers());  // register the pickle suite
}

Now let's create a World object and put it to rest on disk:


>>> import hello
>>> import pickle
>>> a_world = hello.World("howdy")
>>> pickle.dump(a_world, open("my_world", "w"))

In a potentially different script on a potentially different computer with a potentially different operating system:


>>> import pickle
>>> resurrected_world = pickle.load(open("my_world", "r"))
>>> resurrected_world.greet()
'howdy'

Of course the cPickle module can also be used for faster processing.


Boost.Python's pickle_suite fully supports the pickle protocol defined in the standard Python documentation. Like a __getinitargs__ function in Python, the pickle_suite's getinitargs() is responsible for creating the argument tuple that will be used to reconstruct the pickled object. The other elements of the Python pickling protocol, __getstate__ and __setstate__, can be optionally provided via C++ getstate and setstate functions. C++'s static type system allows the library to ensure at compile-time that nonsensical combinations of functions (e.g. getstate without setstate) are not used.
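To make the role of the argument tuple concrete, here is a pure-Python sketch of the same idea. It uses __reduce__, which Python 3's pickle supports, as a stand-in for getinitargs; the rebuild helper is hypothetical and only imitates the reconstruction step an unpickler performs.

```python
class World:
    def __init__(self, msg):
        self.msg = msg

    def greet(self):
        return self.msg

    def __reduce__(self):
        # (callable, argument tuple): the arguments needed to rebuild the object
        return (World, (self.greet(),))

def rebuild(obj):
    """Reconstruct obj by calling the saved factory with the saved arguments."""
    factory, args = obj.__reduce__()
    return factory(*args)

original = World("howdy")
clone = rebuild(original)
```

When the class lives in an importable module, pickle.loads(pickle.dumps(original)) goes through the same __reduce__ result.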

Enabling serialization of more complex C++ objects requires a little more work than is shown in the example above. Fortunately the object interface (see next section) greatly helps in keeping the code manageable.


Object interface


Experienced 'C' language extension module authors will be familiar with the ubiquitous PyObject*, manual reference-counting, and the need to remember which API calls return "new" (owned) references or "borrowed" (raw) references. These constraints are not just cumbersome but also a major source of errors, especially in the presence of exceptions.


Boost.Python provides a class object which automates reference counting and provides conversion to Python from C++ objects of arbitrary type. This significantly reduces the learning effort for prospective extension module writers.


Creating an object from any other type is extremely simple:


object s("hello, world");  // s manages a Python string

object has templated interactions with all other types, with automatic to-python conversions. It happens so naturally that it's easily overlooked:


object ten_Os = 10 * s[4]; // -> "oooooooooo"

In the example above, 4 and 10 are converted to Python objects before the indexing and multiplication operations are invoked.


The extract<T> class template can be used to convert Python objects to C++ types:


double x = extract<double>(o);

If a conversion in either direction cannot be performed, an appropriate exception is thrown at runtime.


The object type is accompanied by a set of derived types that mirror the Python built-in types such as list, dict, tuple, etc. as much as possible. This enables convenient manipulation of these high-level types from C++:


dict d;
d["some"] = "thing";
d["lucky_number"] = 13;
list l = d.keys();

This almost looks and works like regular Python code, but it is pure C++. Of course we can wrap C++ functions which accept or return object instances.
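For comparison, the Python statements that the C++ snippets in this section mirror might look as follows. This is Python 3 syntax, and the variable names are chosen only for illustration.

```python
s = "hello, world"
ten_os = 10 * s[4]        # s[4] is "o", repeated ten times

d = {}
d["some"] = "thing"
d["lucky_number"] = 13
keys = list(d.keys())     # insertion order is preserved in Python 3.7+
```

Apart from the type declarations, the two versions line up statement for statement.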


Thinking hybrid


Because of the practical and mental difficulties of combiningprogramming languages, it is common to settle a single language at theoutset of any development effort.  For many applications, performanceconsiderations dictate the use of a compiled language for the corealgorithms.  Unfortunately, due to the complexity of the static typesystem, the price we pay for runtime performance is often asignificant increase in development time.  Experience shows thatwriting maintainable C++ code usually takes longer and requires farmore hard-earned working experience than developing comparable Pythoncode.  Even when developers are comfortable working exclusively incompiled languages, they often augment their systems by some type ofad hoc scripting layer for the benefit of their users without everavailing themselves of the same advantages.


Boost.Python enables us to think hybrid.  Python can be used forrapidly prototyping a new application; its ease of use and the largepool of standard libraries give us a head start on the way to aworking system.  If necessary, the working code can be used todiscover rate-limiting hotspots.  To maximize performance these canbe reimplemented in C++, together with the Boost.Python bindingsneeded to tie them back into the existing higher-level procedure.


Of course, this top-down approach is less attractive if it is clear from the start that many algorithms will eventually have to be implemented in C++.  Fortunately Boost.Python also enables us to pursue a bottom-up approach.  We have used this approach very successfully in the development of a toolbox for scientific applications.  The toolbox started out mainly as a library of C++ classes with Boost.Python bindings, and for a while the growth was mainly concentrated on the C++ parts.  However, as the toolbox is becoming more complete, more and more newly added functionality can be implemented in Python.



This figure shows the estimated ratio of newly added C++ and Python code over time as new algorithms are implemented.  We expect this ratio to level out near 70% Python.  Being able to solve new problems mostly in Python rather than a more difficult statically typed language is the return on our investment in Boost.Python.  The ability to access all of our code from Python allows a broader group of developers to use it in the rapid development of new applications.


Development history


The first version of Boost.Python was developed in 2000 by Dave Abrahams at Dragon Systems, where he was privileged to have Tim Peters as a guide to "The Zen of Python".  One of Dave's jobs was to develop a Python-based natural language processing system.  Since it was eventually going to be targeting embedded hardware, it was always assumed that the compute-intensive core would be rewritten in C++ to optimize speed and memory footprint [1].  The project also wanted to test all of its C++ code using Python test scripts [2].  The only tool we knew of for binding C++ and Python was SWIG, and at the time its handling of C++ was weak.  It would be false to claim any deep insight into the possible advantages of Boost.Python's approach at this point.  Dave's interest and expertise in fancy C++ template tricks had just reached the point where he could do some real damage, and Boost.Python emerged as it did because it filled a need and because it seemed like a cool thing to try.


This early version was aimed at many of the same basic goals we've described in this paper, differing most noticeably by having a slightly more cumbersome syntax and by lack of special support for operator overloading, pickling, and component-based development.  These last three features were quickly added by Ullrich Koethe and Ralf Grosse-Kunstleve [3], and other enthusiastic contributors arrived on the scene to contribute enhancements like support for nested modules and static member functions.


By early 2001 development had stabilized and few new features were being added.  However, a disturbing new fact came to light: Ralf had begun testing Boost.Python on pre-release versions of a compiler using the EDG front-end, and the mechanism at the core of Boost.Python responsible for handling conversions between Python and C++ types was failing to compile.  As it turned out, we had been exploiting a very common bug in the implementation of all the C++ compilers we had tested.  We knew that as C++ compilers rapidly became more standards-compliant, the library would begin failing on more platforms.  Unfortunately, because the mechanism was so central to the functioning of the library, fixing the problem looked very difficult.


Fortunately, later that year Lawrence Berkeley and later Lawrence Livermore National Labs contracted with Boost Consulting for support and development of Boost.Python, and there was a new opportunity to address fundamental issues and ensure a future for the library.  A redesign effort began with the low-level type conversion architecture, building in standards-compliance and support for component-based development (in contrast to version 1, where conversions had to be explicitly imported and exported across module boundaries).  A new analysis of the relationship between the Python and C++ objects was done, resulting in more intuitive handling for C++ lvalues and rvalues.


The emergence of a powerful new type system in Python 2.2 made the choice of whether to maintain compatibility with Python 1.5.2 easy: the opportunity to throw away a great deal of elaborate code for emulating classic Python classes alone was too good to pass up.  In addition, Python iterators and descriptors provided crucial and elegant tools for representing similar C++ constructs.  The development of the generalized object interface allowed us to further shield C++ programmers from the dangers and syntactic burdens of the Python 'C' API.  A great number of other features including C++ exception translation, improved support for overloaded functions, and most significantly, CallPolicies for handling pointers and references, were added during this period.


In October 2002, version 2 of Boost.Python was released.  Development since then has concentrated on improved support for C++ runtime polymorphism and smart pointers.  Peter Dimov's ingenious boost::shared_ptr design in particular has allowed us to give the hybrid developer a consistent interface for moving objects back and forth across the language barrier without loss of information.  At first, we were concerned that the sophistication and complexity of the Boost.Python v2 implementation might discourage contributors, but the emergence of Pyste and several other significant feature contributions have laid those fears to rest.  Daily questions on the Python C++-sig and a backlog of desired improvements show that the library is getting used.  To us, the future looks bright.




Boost.Python achieves seamless interoperability between two rich and complementary language environments.  Because it leverages template metaprogramming to introspect about types and functions, the user never has to learn a third syntax: the interface definitions are written in concise and maintainable C++.  Also, the wrapping system doesn't have to parse C++ headers or represent the type system: the compiler does that work for us.


Computationally intensive tasks play to the strengths of C++ and are often impossible to implement efficiently in pure Python, while jobs like serialization that are trivial in Python can be very difficult in pure C++.  Given the luxury of building a hybrid software system from the ground up, we can approach design with new confidence and power.
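To make the serialization remark concrete, here is a small sketch using Python's standard pickle module. The record dictionary is a hypothetical stand-in for application state; nothing here is specific to Boost.Python.

```python
import pickle

# Hypothetical nested record standing in for application state.
record = {"name": "lattice", "dims": (3, 4), "values": [1.5, 2.5, 4.0]}

blob = pickle.dumps(record)    # serialize the whole structure to bytes
restored = pickle.loads(blob)  # rebuild an equal, independent object
```

Two calls cover a nested structure of mixed types, with no per-type serialization code to write.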




[VELD1995] T. Veldhuizen, "Expression Templates," C++ Report, Vol. 7 No. 5, June 1995, pp. 26-31. http://osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html



[1] In retrospect, it seems that "thinking hybrid" from the ground up might have been better for the NLP system: the natural component boundaries defined by the pure Python prototype turned out to be inappropriate for getting the desired performance and memory footprint out of the C++ core, which eventually caused some redesign overhead on the Python side when the core was moved to C++.


[2] We also have some reservations about driving all C++ testing through a Python interface, unless that's the only way it will be ultimately used.  Any transition across language boundaries with such different object models can inevitably mask bugs.


[3] These features were expressed very differently in v1 of Boost.Python.


Posted on 2008-05-29 13:11 by 金庆. Categories: C/C++, Python



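# re: Building Hybrid Systems with Boost.Python 2008-05-30 14:50 金庆
For one thing, Python is simple to use; for another, it is popular.

# re: Building Hybrid Systems with Boost.Python 2008-06-02 17:38 FongLuo
Over the past few days I have also been looking at Boost.Python, and I found that Boost 1.35 no longer supports VC6. VC6 is the IDE my company uses at this stage, and there is no chance of switching in the short term. Our next product will very likely use ARM as the host. For now I can only keep both Python and Lua as candidates and choose once the project is settled; after all, we cannot invest effort in two languages at once.

# re: Building Hybrid Systems with Boost.Python 2008-06-03 09:34 金庆
Developing on ARM should have nothing to do with VC6. As for the choice between Python and Lua, I look forward to the final verdict from you and your company, for everyone's reference.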

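# re: Building Hybrid Systems with Boost.Python 2009-10-13 19:55 liigo
I had looked into Boost.Python a little before, but it did not leave much of an impression. After reading this article, I find that it really is clean and concise to use, and I am impressed.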

# re: Building Hybrid Systems with Boost.Python 2012-07-05 22:18 SCUHANK
Hello, may I ask you about some problems I have run into while using the Boost.Python module? Thanks.

# re: Building Hybrid Systems with Boost.Python 2013-04-04 02:53 lgl
How do I build Boost with VS2008? The build I get has no Python support; it reports that pyconfig.h cannot be found. How should I set things up so that it builds correctly with Python support?

# re: Building Hybrid Systems with Boost.Python 2013-12-16 16:40 金庆
You need to install Python first.
