Zero Lee's Column

The Return Value Optimization[2]

Without any optimization, the compiler-generated (pseudo) code for Complex_Add() is:

void Complex_Add(Complex& __tempResult, const Complex& c1, const Complex& c2)
{
    struct Complex retVal;
    retVal.Complex::Complex();              // construct retVal
    retVal.real = c1.real + c2.real;
    retVal.imag = c1.imag + c2.imag;
    __tempResult.Complex::Complex(retVal);  // copy-construct __tempResult from retVal
    retVal.Complex::~Complex();             // destroy retVal
    return;
}
The compiler can optimize Complex_Add() by eliminating the local object retVal and replacing it with __tempResult. This is the Return Value Optimization:
void Complex_Add(Complex& __tempResult, const Complex& c1, const Complex& c2)
{
    __tempResult.Complex::Complex();        // construct __tempResult
    __tempResult.real = c1.real + c2.real;
    __tempResult.imag = c1.imag + c2.imag;
    return;
}
The RVO eliminated the local retVal object and therefore saved us a constructor call (the copy into __tempResult) as well as a destructor call.
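One way to observe this on your own compiler is to instrument the class and count the calls. The following is a minimal sketch, not the original authors' test code: the Complex definition, the Counters struct, g_counts, add() and copies_for_one_add() are all hypothetical names introduced here for illustration. The counts you see depend on the compiler, the language standard, and the optimization flags.

```cpp
#include <iostream>

// Hypothetical call counters (not part of the original text).
struct Counters {
    int ctor = 0;  // ordinary constructor calls
    int copy = 0;  // copy-constructor calls
    int dtor = 0;  // destructor calls
};

static Counters g_counts;

// Assumed shape of Complex: public real/imag, as the listings above imply.
class Complex {
public:
    Complex(double r = 0.0, double i = 0.0) : real(r), imag(i) { ++g_counts.ctor; }
    Complex(const Complex& other) : real(other.real), imag(other.imag) { ++g_counts.copy; }
    ~Complex() { ++g_counts.dtor; }

    double real;
    double imag;
};

// Returns a named local by value, like retVal in the pseudocode.
Complex add(const Complex& a, const Complex& b) {
    Complex retVal;
    retVal.real = a.real + b.real;
    retVal.imag = a.imag + b.imag;
    return retVal;
}

// Performs one addition and returns the number of copy-constructor
// calls it caused. Zero means the copy was elided.
int copies_for_one_add() {
    Complex a(1, 0);
    Complex b(2, 0);
    g_counts = Counters();  // reset after constructing the operands
    Complex c = add(a, b);
    std::cout << "sum = (" << c.real << ", " << c.imag << "), "
              << "copies = " << g_counts.copy << "\n";
    return g_counts.copy;
}
```

Calling copies_for_one_add() from main() prints the copy count for a single addition. Comparing the count with and without optimization enabled gives a direct view of whether the local object was eliminated.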
To get a numerical feel for all this efficiency discussion, we measured the impact of RVO on execution speed. We coded two versions of operator +(), one of which was optimized and the other not. The measured code consisted of a million loop iterations:
int main()
{
    Complex a(1, 0);
    Complex b(2, 0);
    Complex c;
    // begin timing here
    for (int i = 1000000; i > 0; --i) {
        c = a + b;
    }
    // stop timing here
}
The second version, without RVO, executed in 1.89 seconds. The first version, with RVO applied, was much faster: 1.30 seconds.
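The timing points in the loop above can be filled in with the standard clock facilities. This is a rough sketch under stated assumptions, not the harness used for the measurements quoted here: the Complex definition and the time_additions() helper are hypothetical, and std::chrono postdates the original text.

```cpp
#include <chrono>

// Assumed shape of Complex: public real/imag and a two-argument constructor.
struct Complex {
    Complex(double r = 0.0, double i = 0.0) : real(r), imag(i) {}
    double real;
    double imag;
};

Complex operator+(const Complex& a, const Complex& b) {
    return Complex(a.real + b.real, a.imag + b.imag);
}

// Hypothetical helper: runs `iterations` additions, stores the last sum
// in `out`, and returns the elapsed wall-clock time in seconds.
double time_additions(int iterations, Complex& out) {
    Complex a(1, 0);
    Complex b(2, 0);
    Complex c;

    const auto start = std::chrono::steady_clock::now();  // begin timing
    for (int i = iterations; i > 0; --i) {
        c = a + b;
    }
    const auto stop = std::chrono::steady_clock::now();   // stop timing

    out = c;  // hand the result back so the loop has an observable effect
    return std::chrono::duration<double>(stop - start).count();
}
```

An optimizing compiler may simplify or remove a loop this trivial, so absolute numbers from such a harness should be read with caution. Comparing two builds of the same harness is more informative than any single figure.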

Compiler optimizations, naturally, must preserve the correctness of the original computation. In the case of the RVO, this is not always easy. Since the RVO is not mandatory, the compiler may decline to perform it on complicated functions. For example, if the function has multiple return statements returning objects of different names, RVO will not be applied. You must return the same named object to have a chance at the RVO.
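The difference between the two shapes of function can be sketched as follows. Both functions are hypothetical illustrations (pick_two_names and pick_one_name do not appear in the original text), and the Complex definition is assumed. Whether the second form is actually optimized still depends on the compiler.

```cpp
// Assumed shape of Complex: public real/imag and a two-argument constructor.
struct Complex {
    Complex(double r = 0.0, double i = 0.0) : real(r), imag(i) {}
    double real;
    double imag;
};

// Two return statements, two differently named objects. The compiler
// cannot build both x and y in the caller's result slot, so it has
// little chance of eliminating the local object here.
Complex pick_two_names(bool first) {
    Complex x(1.0, 0.0);
    Complex y(2.0, 0.0);
    if (first) {
        return x;
    }
    return y;
}

// Every path returns the same named object, so the compiler is free to
// construct `result` directly in the caller's result slot.
Complex pick_one_name(bool first) {
    Complex result(2.0, 0.0);
    if (first) {
        result.real = 1.0;
    }
    return result;
}
```

Both functions compute the same values; only the second is written in a form that gives the optimization a chance.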
One compiler we tested refused to apply the RVO to this particular version of operator +:
Complex operator+(const Complex& a, const Complex& b)
{
    // operator + version 1
    Complex retVal;
    retVal.real = a.real + b.real;
    retVal.imag = a.imag + b.imag;
    return retVal;
}
It did, however, apply the RVO to this version:
Complex operator+(const Complex& a, const Complex& b)
{
    // operator + version 2
    double r = a.real + b.real;
    double i = a.imag + b.imag;
    return Complex(r, i);
}
We speculated that the difference may lie in the fact that Version 1 used a named variable (retVal) as a return value whereas Version 2 used an unnamed one. Version 2 constructed its return value in the return statement itself and never gave it a name. It may be that this particular compiler implementation chose to avoid optimizing away named variables.
Our speculation was boosted by some additional evidence. We tested two more versions of operator +:
Complex operator+(const Complex& a, const Complex& b)
{
    // operator + version 3
    Complex retVal(a.real + b.real, a.imag + b.imag);
    return retVal;
}

and

Complex operator+(const Complex& a, const Complex& b)
{
    // operator + version 4
    return Complex(a.real + b.real, a.imag + b.imag);
}
As speculated, the RVO was applied to Version 4 but not to Version 3.
In addition, you must define a copy constructor to "turn on" the Return Value Optimization. If the class involved does not have a copy constructor defined, the RVO is quietly turned off.

Key Points:
[1] If you must return an object by value, the Return Value Optimization will help performance by eliminating the need for creation and destruction of a local object.

[2] The application of the RVO is up to the discretion of the compiler implementation. You need to consult your compiler documentation or experiment to find if and when RVO is applied.

[3] You will have a better shot at RVO by deploying the computational constructor.
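This section does not define the term "computational constructor". One plausible reading, assumed here, is a constructor whose only job is to perform the computation, so that operator+ can return an unnamed object built in a single step, as in Versions 2 and 4 above. The sketch below is an illustration under that assumption, not a confirmed reproduction of the intended technique; the two-Complex constructor is hypothetical.

```cpp
// Assumed shape of Complex: public real/imag, as in the listings above.
class Complex {
public:
    Complex(double r = 0.0, double i = 0.0) : real(r), imag(i) {}

    // Hypothetical "computational constructor": builds the sum of a and b
    // directly instead of filling in a default-constructed object.
    Complex(const Complex& a, const Complex& b)
        : real(a.real + b.real), imag(a.imag + b.imag) {}

    double real;
    double imag;
};

// The return statement constructs an unnamed object, which is the form
// the tested compiler was willing to optimize.
Complex operator+(const Complex& a, const Complex& b) {
    return Complex(a, b);
}
```

With this shape, `Complex c = a + b;` leaves no named local in operator+ for the compiler to eliminate. A stricter design might make the two-argument constructor private and declare operator+ a friend, so that it cannot be mistaken for a general-purpose constructor.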

Posted on 2006-11-13 19:36 by Zero Lee. Category: C++ Performance

