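A Practical Guide to STL

Author: Jeff Bogan
Original: http://www.codeproject.com/vcpp/stl/PracticalGuideStl.asp
Translator: Winter
Translator's note (Winter): This is a very good article. 周翔 has translated it before, but I felt that parts of that translation were not quite right, especially some of the terminology, so I have translated it again here.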

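1 Introduction

STL (the Standard Template Library) is a good skill for anyone programming in C++ today. I must warn you that it takes some getting used to: the learning curve is fairly steep, and many of the names it uses are not intuitive (perhaps because all the good names had already been used up). But once you have learned STL, it will pay you back many times over. Compared with the MFC containers, STL is more flexible and more powerful.

Its advantages are:

  1. You can sort and search with ease.
  2. It is safer and easier to debug.
  3. You can understand code written by Unix programmers (see note 1).
  4. It is another skill to add to your resume.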

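2 Background

This guide is written so that the reader can get a running start in this challenging part of computer science without wading through the endless jargon and tedious rules that STL'ers have created for their own amusement.

3 Using the Code

The code in this document is mainly meant to show practical ways of using STL.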

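4 Definitions

  • Template - A macro for a class (and for structures, data types, and functions). Sometimes called a cookie cutter. Also formally known as a generic: a template of a class is called a generic class and, likewise, a template of a function is called a generic function.
  • STL - The Standard Template Library: templates written by a bunch of smart people, now part of the standard C++ language and used by everyone.
  • Container - A class that holds data. STL containers include vector, set, map, multimap, and deque.
  • vector - Your basic array template. It is a container.
  • Iterator - A fancy word for a pointer to an element inside an STL container. It does other things too.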

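5 Hello World Program

I always wanted to write one, and here is my golden 24-karat opportunity: a hello world program. This program copies a character string into a vector of characters and then displays the string one character at a time. A vector is your basic garden-variety array-like template. Probably about half of the STL containers you meet in practice are vectors, so if you grasp this program, you are halfway to understanding STL.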

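// Program: Vector Demo 1
// Purpose: To demonstrate the STL vector

// #include "stdafx.h" - include this if you use precompiled headers (see note 2)
#include <vector>    // STL vector header. Note that there is no ".h"
#include <iostream>  // needed for cout
using namespace std; // make sure the namespace is std

const char* szHW = "Hello World";
// As everyone knows, this is a character array terminated by a null character

int main(int argc, char* argv[])
{
  vector <char> vec;  // a vector of characters (the STL equivalent of an array)

  // Define an iterator for a vector of characters
  vector <char>::iterator vi;

  // Initialize the vector: walk the string and put each character into
  // the vector until the terminating null character is reached
  const char* cptr = szHW;  // start address of the Hello World string
  while (*cptr != '\0')
  {  vec.push_back(*cptr);  cptr++;  }
  // push_back places the data at the back of the vector

  // Print each character stored in the STL array to the screen
  for (vi=vec.begin(); vi!=vec.end(); vi++)
  // This is the standard form of a loop in STL - it usually uses "!=" rather than "<",
  // because some containers do not define "<" for their iterators.
  // begin() returns an iterator to the first element; end() returns one just past the last
  {  cout << *vi;  }  // use the indirection operator (*) to get the data out of the iterator
  cout << endl;  // output finished, print "\n"

  return 0;
}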

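push_back is the standard function for putting data into a vector or deque. insert is a similar function that works with all containers, but it is more complicated to use. end() actually refers to the position one past the last element, which is what lets the loop work properly: the iterator it returns points just beyond the limits of the data. It is like an ordinary array loop such as for (i=0; i<6; i++) {ar[i] = i;} - ar[6] does not exist, but the loop never reaches it, so it causes no problem.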

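6 STL Annoyance #1: Initialization

One of the first annoyances of STL is initializing a container with data. It is much more cumbersome than initializing a C/C++ array. You can only do it element by element, or first initialize a regular array and then copy it into the container. I think one would normally do this: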

// Program: Initialization Demo
// Purpose: To demonstrate initialization of STL vectors

#include <cstring>  // same as <string.h>
#include <vector>
using namespace std;

int ar[10] = {  12, 45, 234, 64, 12, 35, 63, 23, 12, 55  };
const char* str = "Hello World";

int main(int argc, char* argv[])
{
  vector <int> vec1(ar, ar+10);              // copies the ten ints
  vector <char> vec2(str, str+strlen(str));  // copies the characters, without the terminating null
  return 0;
}

In programming, there are many ways to do the same job. Another way to fill a vector is to use the more familiar square brackets, for example:

// Program: Vector Demo 2
// Purpose: To demonstrate STL vectors with
// counters and square brackets

#include <cstring>
#include <vector>
#include <iostream>
using namespace std;

const char* szHW = "Hello World";
int main(int argc, char* argv[])
{
  vector <char> vec(strlen(szHW));
  // The argument sets the initial number of elements
  size_t i, k = 0;
  const char* cptr = szHW;
  while (*cptr != '\0')
  {  vec[k] = *cptr;  cptr++;  k++;  }
  for (i=0; i<vec.size(); i++)
  {  cout << vec[i];  }
  cout << endl;
  return 0;
}

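This example is cleaner, but it gives up iterators, needs an extra integer as an index, and requires you to state explicitly in the program how many elements the vector should hold.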

7 Namespaces

A concept that goes hand in hand with STL is the namespace. STL is defined in the std namespace. There are three ways to specify it:

  1. Use the whole namespace with the using keyword, at the top of the file but below the included headers:

using namespace std;

This is the simplest approach and the best one for simple projects. The price is that every name in std becomes visible, so a name you define yourself can clash with a library name (I think you go to heck for doing this).

  2. Specify each and every template before use (like prototyping):

using std::cout; using std::endl; using std::flush; using std::set; using std::inserter;

This is slightly more tedious, although it is a good mnemonic for the functions that will be used, and you can interlace other namespaces easily.

  3. EVERY time you use a template from the std namespace, use the std scope specifier:

typedef std::vector<std::string> VEC_STR;

This is tedious but the best way if you are mixing and matching lots of namespaces. Some STL zealots will always use this and call anyone evil who does not. Some people will create macros to simplify matters.

In addition, you can put using namespace std within any scope, for example, at the top of a function or within a control loop.

8 Some Tips

To avoid an annoying warning (C4786) in debug builds with Visual C++, use the following compiler pragma:

#pragma warning(disable: 4786)

Another gotcha: with compilers that predate C++11, you must put a space between consecutive closing angle brackets, because >> is parsed as the bit shift operator. So:

vector <list<int>> veclis;

will give an error. Instead, write:

vector <list<int> > veclis;

to avoid compilation errors.

9 Another Container - The set

This is the explanation lifted from the MS help file of the set: "The template class describes an object that controls a varying-length sequence of elements of type const Key. Each element serves as both a sort key and a value. The sequence is represented in a way that permits lookup, insertion, and removal of an arbitrary element with a number of operations proportional to the logarithm of the number of elements in the sequence (logarithmic time). Moreover, inserting an element invalidates no iterators, and removing an element invalidates only those iterators that point at the removed element."

An alternate, more practical, definition is: a set is a container that holds only unique values. This is useful when you need to record which values have occurred. It is kept sorted in an order that is specified when the set is instantiated. If you need to store data as key/value pairs, then a map is a better choice. A set is typically implemented as a balanced binary tree, so inserting, removing, or finding a value takes logarithmic time; it is faster than a vector for insertion and removal in the middle, but slower for adding to the end.

An example program would be:

// Program: Set Demo
// Purpose: To demonstrate STL sets

#include <string>
#include <set>
#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{
  set <string> strset;
  set <string>::iterator si;
  strset.insert("cantaloupes");
  strset.insert("apple");
  strset.insert("orange");
  strset.insert("banana");
  strset.insert("grapes");
  strset.insert("grapes");  // A duplicate: the set ignores it and keeps the existing element
  for (si=strset.begin(); si!=strset.end(); si++)
  {  cout << *si << " ";  }
  cout << endl;
  return 0;
}

// Output: apple banana cantaloupes grapes orange

If you want to become an STL fanatic, you can also replace the output loop in the program with the following line (it needs the <algorithm> and <iterator> headers):

copy(strset.begin(), strset.end(), ostream_iterator<string>(cout, " "));

While instructive, I personally find this less clear and more prone to error. If you see it, now you know what it does.

10 All the STL Containers

Containers pre-date templates and are computer science concepts that have been incorporated into STL. The following are the seven containers implemented in STL.

  • vector - Your standard safe array. It grows at the back only.
  • deque - Functionally the same as a vector, but internally different. It can grow at both the front and the back.
  • list - Can only be traversed one step at a time. If you are already familiar with the concept of a list, an STL list is doubly linked (each node contains a pointer to both the previous and the next value).
  • set - Contains unique values that are kept sorted.
  • map - A sorted set of paired values: a key, on which sorts and searches occur, and the value that is retrieved from the container. E.g. instead of ar[43] = "overripe", a map lets you write ar["banana"] = "overripe". So looking up a bit of information keyed on a full name is easily done.
  • multiset - The same as a set, but the values need not be unique.
  • multimap - The same as a map, but the keys need not be unique.

Note: If you are reading the MFC help then you will also come across an efficiency statement for each container, e.g. (log n * n) insertion time. Unless you are dealing with a very large number of values, you can ignore this. If you start to get a noticeable lag or are dealing with time-critical code, then you should learn more about the efficiency of the various containers.

11 How to Use a Map with some Class

The map is a template that uses a key to obtain a value.

Another issue is that you will want to use your own classes instead of built-in data types such as the int that has been used up to now. To create a class that is "template-ready", you must ensure that the class contains certain member functions and operators. The basics are:

  • default constructor (empty, usually)
  • copy constructor
  • overloaded "="

You would overload more operators as required in a specific template, for example, if you plan to have a class that is a key in a map you would have to overload relational operators. But that is another story.

// Program: Map Own Class
// Purpose: To demonstrate a map of classes

#include <string>
#include <iostream>
#include <map>
using namespace std;

class CStudent
{
public :
  int nStudentID;
  int nAge;
public :
  // Default Constructor - sets the members to zero
  CStudent() : nStudentID(0), nAge(0) { }
  // Full constructor
  CStudent(int nSID, int nA) {  nStudentID=nSID; nAge=nA;  }
  // Copy constructor
  CStudent(const CStudent& ob) {  nStudentID=ob.nStudentID; nAge=ob.nAge;  }
  // Overload =
  CStudent& operator = (const CStudent& ob)
  {  nStudentID=ob.nStudentID; nAge=ob.nAge; return *this;  }
};

int main(int argc, char* argv[])
{
  map <string, CStudent> mapStudent;

  mapStudent["Joe Lennon"] = CStudent(103547, 22);
  mapStudent["Phil McCartney"] = CStudent(100723, 22);
  mapStudent["Raoul Starr"] = CStudent(107350, 24);
  mapStudent["Gordon Hamilton"] = CStudent(102330, 22);

  // Access via the name
  cout << "The Student number for Joe Lennon is " << (mapStudent["Joe Lennon"].nStudentID) << endl;

  return 0;
}

12 TYPEDEF

If you like to use typedef, this is an example:

typedef set <int> SET_INT;
typedef SET_INT::iterator SET_INT_ITER;

One convention is to make them upper case with underscores.

13 ANSI / ISO string

ANSI/ISO strings are commonly used within STL containers. It is your standard string class, widely praised except for its deficiency of no format statement. You must instead use << and the iostream codes (dec, width, etc.) to string together your string.

Use c_str() to retrieve a character pointer, when necessary.

14 Iterators

I said that iterators are pointers, but there is more. They look like pointers and act like pointers, but they are usually small classes in which the indirection operator (unary *) and -> have been overloaded to return a value from the container. It is a bad idea to store them for any length of time, as they can become invalid after a value has been added to or removed from a container. They are something like handles in this regard. The plain iterator can be altered so that the container is traversed in different ways:

  • iterator - For vector and deque you can use any of +=, --, -=, ++, and all the comparison operators <, <=, >, >=, ==, !=. For list, set, and map (and their multi variants) you can only step one element at a time, with ++ or --; += and the ordering comparisons are not available.
  • reverse_iterator - If you want to step backwards instead of forwards through a container, replace iterator with reverse_iterator, begin() with rbegin(), and end() with rend(); ++ will then traverse backwards.
  • const_iterator - A forward iterator that returns a const value. Use this if you want to make it clear that it points to a read-only value.
  • const_reverse_iterator - A reverse iterator that returns a const value.

15 Sorting Order in Sets and Maps

Templates have other parameters besides the type of value. You can also pass callback functions (known as predicates - a predicate is a function of one or two arguments that returns a bool value). For example, say you want a set of strings that is automatically sorted in ascending order. You would simply create a set class in this way:

set <string, less<string> > set1

less is another template (a generic function object) which is used to order values as they are placed into the container. Ascending order is also what you get if you leave the second argument out. If you wanted the set to be sorted in descending order, you would write:

set <string, greater<string> > set1

There are many other cases in which you must pass a predicate as a parameter to an STL class or, as described below, to an algorithm.

16 STL Annoyance #2 - Long Error Messages

The templated names get expanded for the compiler, so when the compiler chokes on something, it spits out extremely long error messages that are difficult to read. I have found no good way around this. The best is to develop the ability to find and focus on the end of the error message, where the explanation is located. Another related annoyance: if you double click on the template error, it will take you to the point in the code within the template code, which is also difficult to read. Sometimes, it is best just to carefully re-examine your code, and ignore the error messages completely.

17 Algorithms

Algorithms are functions that apply to containers. This is where the real power of STL starts to show up. You can learn a few function names that apply to most of the template containers. You can sort, search, manipulate, and swap with the greatest of ease. They always take a range within which the algorithm performs. E.g.: sort(vec.begin()+1, vec.end()-1) sorts everything but the first and last values.

The container itself is not passed to the algorithm, just two iterators from the container that bookend a range. In this way, algorithms are not restricted by containers directly, but by the kinds of iterators that the specific algorithm requires. In addition, many times you will also pass the name of a specially prepared function (those aforementioned predicates) as an argument. You can even pass plain old values.

Example of algorithms in play:

 

// Program: Test Score
// Purpose: To demonstrate the use of algorithm
// with respect to a vector of test scores

#include <algorithm>  // If you want to use an algorithm this is the header used.
#include <numeric>    // (For accumulate)
#include <vector>
#include <iostream>
using namespace std;

int testscore[] = {67, 56, 24, 78, 99, 87, 56};

// predicate that evaluates a passed test
bool passed_test(int n)
{
  return (n >= 60);
}

// predicate that evaluates a failed test
bool failed_test(int n)
{
  return (n < 60);
}

int main(int argc, char* argv[])
{
  int total;
  // Initialize a vector with the data in the testscore array
  vector <int> vecTestScore(testscore, testscore + sizeof(testscore) / sizeof(int));
  vector <int>::iterator vi;

  // Sort and display the vector
  sort(vecTestScore.begin(), vecTestScore.end());
  cout << "Sorted Test Scores:" << endl;
  for (vi=vecTestScore.begin(); vi != vecTestScore.end(); vi++)
  {  cout << *vi << ", ";  }
  cout << endl;

  // Display statistics

  // min_element returns an iterator to the
  // element that is the minimum value in the range
  // Therefore the * operator must be used to extract the value
  vi = min_element(vecTestScore.begin(), vecTestScore.end());
  cout << "The lowest score was " << *vi << "." << endl;

  // Same with max_element
  vi = max_element(vecTestScore.begin(), vecTestScore.end());
  cout << "The highest score was " << *vi << "." << endl;

  // Use a predicate function to determine the number who passed
  cout << count_if(vecTestScore.begin(), vecTestScore.end(), passed_test)
       << " out of " << vecTestScore.size() << " students passed the test" << endl;

  // and who failed
  cout << count_if(vecTestScore.begin(), vecTestScore.end(), failed_test)
       << " out of " << vecTestScore.size() << " students failed the test" << endl;

  // Sum the scores
  total = accumulate(vecTestScore.begin(), vecTestScore.end(), 0);
  // Then display the Average (integer division)
  cout << "Average score was " << (total / (int)(vecTestScore.size())) << endl;

  return 0;
}
 

 

18 See you later, Allocator

Allocators are the objects a container uses to obtain its memory. They are mysterious behind-the-scenes creatures, really only of concern if you are doing high-level memory optimization, and are best considered to be black boxes. Usually, you never even specify them, as they are default parameters that are generally not tinkered with. It is best to know what they are, though, in case they show up on one of those employment tests.

19 Derive or Embed Templates

Any way that you use a regular class, you can use an STL class.

It can be embedded:

class CParam
{
  string name;
  string unit;
  vector <double> vecData;  // element type assumed here
};

or used as a base class:

class CParam : public vector <double>  // element type assumed here
{
  string name;
  string unit;
};

Derivation should be used with some caution, since the STL containers do not have virtual destructors. It is up to you to choose the form that fits your programming style.

20 Templates within Templates

To create a more complex data structure, you can nest a template within a template. It is best to typedef beforehand on the internal template as you will certainly need to use the inner template again.

 

// Program: Vector of Vectors Demo
// Purpose: To demonstrate nested STL containers

#include <iostream>
#include <vector>

using namespace std;

typedef vector <int> VEC_INT;

int inp[2][2] = {{1, 1}, {2, 0}};  // Regular 2x2 array to place into the template

int main(int argc, char* argv[])
{
  int i, j;
  vector <VEC_INT> vecvec;
  // if you want to do this in all one step it looks like this
  // vector <vector <int> > vecvec;

  // Fill it in with the array
  VEC_INT v0(inp[0], inp[0]+2);  // passing two pointers
                                 // for the range of values to be copied to the vector
  VEC_INT v1(inp[1], inp[1]+2);

  vecvec.push_back(v0);
  vecvec.push_back(v1);

  for (i=0; i<2; i++)
  {
    for (j=0; j<2; j++)
    {  cout << vecvec[i][j] << " ";  }
    cout << endl;
  }
  return 0;
}

// Output:
// 1 1
// 2 0

Although cumbersome to initialize, once completed and filled in, you have a 2 dimensional array that is indefinitely expandable (until memory space runs out). The same can be done for any combination of containers, as the situation requires.

21 Conclusion

STL is useful, but not without its annoyances. As the Chinese say: if you learn it, you will be like a tiger with the claws of a lion.

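22 Translator's Notes

[ Note 1 ] STL is cross-platform; this remark is aimed at readers who work on Windows.
[ Note 2 ] When using VC with precompiled headers enabled, every cpp file must begin by including the header "stdafx.h".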

-- WinterWen - 25 Jun 2005

Posted on 2006-04-25 22:41 by 杨粼波. Category: Article Collection

