Subsections


10. 标准库概览 Brief Tour of the Standard Library


10.1 操作系统概览 Operating System Interface

The os module provides dozens of functions for interacting with the operating system:

os 模块提供了不少与操作系统相关联的函数。

>>> import os
>>> os.system('time 0:02')
0
>>> os.getcwd()      # Return the current working directory
'C:\\Python24'
>>> os.chdir('/server/accesslogs')

Be sure to use the "import os" style instead of "from os import *". This will keep os.open() from shadowing the builtin open() function which operates much differently.

应该用 "import os" 风格而非 "from os import *"。这样可以保证随操作系统不同而有所变化的 os.open() 不会覆盖内置函数 open()

The builtin dir() and help() functions are useful as interactive aids for working with large modules like os:

在使用一些像 os 这样的大型模块时内置的 dir()help() 函数非常有用。

>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>

For daily file and directory management tasks, the shutil module provides a higher level interface that is easier to use:

针对日常的文件和目录管理任务,shutil 模块提供了一个易于使用的高级接口。

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
>>> shutil.move('/build/executables', 'installdir')


10.2 文件通配符 File Wildcards

The glob module provides a function for making file lists from directory wildcard searches:

glob 模块提供了一个函数用于从目录通配符搜索中生成文件列表。

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']


10.3 命令行参数 Command Line Arguments

Common utility scripts often invoke processing command line arguments. These arguments are stored in the sys module's argv attribute as a list. For instance the following output results from running "python demo.py one two three" at the command line:

通用工具脚本经常调用命令行参数。这些命令行参数以链表形式存储于 sys 模块的 argv 变量。例如在命令行中执行 "python demo.py one two three" 后可以得到以下输出结果:

>>> import sys
>>> print sys.argv
['demo.py', 'one', 'two', 'three']

The getopt module processes sys.argv using the conventions of the Unix getopt() function. More powerful and flexible command line processing is provided by the optparse module.

getopt 模块使用 Unix getopt() 函处理 sys.argv。更多的复杂命令行处理由 optparse 模块提供。


10.4 错误输出重定向和程序终止 Error Output Redirection and Program Termination

The sys module also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages to make them visible even when stdout has been redirected:

sys 还有 stdinstdoutstderr 属性,即使在 stdout 被重定向时,后者也可以用于显示警告和错误信息。

>>> sys.stderr.write('Warning, log file not found starting a new one')
Warning, log file not found starting a new one

The most direct way to terminate a script is to use "sys.exit()".

大多脚本的定向终止都使用 "sys.exit()"。


10.5 字符串正则匹配 String Pattern Matching

The re module provides regular expression tools for advanced string processing. For complex matching and manipulation, regular expressions offer succinct, optimized solutions:

re 模块为高级字符串处理提供了正则表达式工具。对于复杂的匹配和处理,正则表达式提供了简洁、优化的解决方案。

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'

When only simple capabilities are needed, string methods are preferred because they are easier to read and debug:

如果只需要简单的功能,应该首先考虑字符串方法,因为它们非常简单,易于阅读和调试。

>>> 'tea for too'.replace('too', 'two')
'tea for two'


10.6 数学 Mathematics

The math module gives access to the underlying C library functions for floating point math:

math 模块为浮点运算提供了对底层C函数库的访问。

>>> import math
>>> math.cos(math.pi / 4.0)
0.70710678118654757
>>> math.log(1024, 2)
10.0

The random module provides tools for making random selections:

random 提供了生成随机数的工具。

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(xrange(100), 10)   # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()    # random float
0.17970987693706186
>>> random.randrange(6)    # random integer chosen from range(6)
4


10.7 互联网访问 Internet Access

There are a number of modules for accessing the internet and processing internet protocols. Two of the simplest are urllib2 for retrieving data from urls and smtplib for sending mail:

有几个模块用于访问互联网以及处理网络通信协议。其中最简单的两个是用于处理从 urls 接收的数据的 urllib2 以及用于发送电子邮件的 smtplib

>>> import urllib2
>>> for line in urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
...     if 'EST' in line:      # look for Eastern Standard Time
...         print line

<BR>Nov. 25, 09:43:32 PM EST

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@tmp.org', 'jceasar@tmp.org',
"""To: jceasar@tmp.org
From: soothsayer@tmp.org

Beware the Ides of March.
""")
>>> server.quit()


10.8 日期和时间 Dates and Times

The datetime module supplies classes for manipulating dates and times in both simple and complex ways. While date and time arithmetic is supported, the focus of the implementation is on efficient member extraction for output formatting and manipulation. The module also supports objects that are time zone aware.

datetime 模块为日期和时间处理同时提供了简单和复杂的方法。支持日期和时间算法的同时,实现的重点放在更有效的处理和格式化输出。该模块还支持时区处理。

# dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y or %d%b %Y is a %A on the %d day of %B")
'12-02-03 or 02Dec 2003 is a Tuesday on the 02 day of December'

# dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368


10.9 数据压缩 Data Compression

Common data archiving and compression formats are directly supported by modules including: zlib, gzip, bz2, zipfile, and tarfile.

以下模块直接支持通用的数据打包和压缩格式:

zlibgzipbz2zipfile, 以及 tarfile

>>> import zlib
>>> s = 'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
'witch which has which witches wrist watch'
>>> zlib.crc32(t)
-1438085031


10.10 性能度量 Performance Measurement

Some Python users develop a deep interest in knowing the relative performance between different approaches to the same problem. Python provides a measurement tool that answers those questions immediately.

有些用户对了解解决同一问题的不同方法之间的性能差异很感兴趣。Python 提供了一个度量工具,为这些问题提供了直接答案。

For example, it may be tempting to use the tuple packing and unpacking feature instead of the traditional approach to swapping arguments. The timeit module quickly demonstrates that the traditional approach is faster:

例如,使用元组封装和拆封来交换元素看起来要比使用传统的方法要诱人的多。timeit 证明了传统的方法更快一些。

>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.60864915603680925
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.8625194857439773

In contrast to timeit's fine level of granularity, the profile and pstats modules provide tools for identifying time critical sections in larger blocks of code.

相对于 timeit 的细粒度,profilepstats 模块提供了针对更大代码块的时间度量工具。


10.11 质量控制 Quality Control

One approach for developing high quality software is to write tests for each function as it is developed and to run those tests frequently during the development process.

开发高质量软件的方法之一是为每一个函数开发测试代码,并且在开发过程中经常进行测试。

The doctest module provides a tool for scanning a module and validating tests embedded in a program's docstrings. Test construction is as simple as cutting-and-pasting a typical call along with its results into the docstring. This improves the documentation by providing the user with an example and it allows the doctest module to make sure the code remains true to the documentation:

doctest 模块提供了一个工具,扫描模块并根据程序中内嵌的文档字符串执行测试。测试构造如同简单的将它的输出结果剪切并粘贴到文档字符串中。通过用户提供的例子,它发展了文档,允许 doctest 模块确认代码的结果是否与文档一致。

def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print average([20, 30, 70])
    40.0
    """
    return sum(values, 0.0) / len(values)

import doctest
doctest.testmod()   # automatically validate the embedded tests

The unittest module is not as effortless as the doctest module, but it allows a more comprehensive set of tests to be maintained in a separate file:

unittest 模块不像 doctest 模块那么容易使用,不过它可以在一个独立的文件里提供一个更全面的测试集。

import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests


10.12 Batteries Included

Python has a ``batteries included'' philosophy. This is best seen through the sophisticated and robust capabilities of its larger packages. For example:

Python 体现了“batteries included”哲学。Python 可以通过更大的包的来得到应付各种复杂情况的强大能力,从这一点我们可以看出该思想的应用。例如:

* The xmlrpclib and SimpleXMLRPCServer modules make implementing remote procedure calls into an almost trivial task. Despite the names, no direct knowledge or handling of XML is needed.

* xmlrpclibSimpleXMLRPCServer 模块实现了在琐碎的任务中调用远程过程。尽管有这样的名字,其实用户不需要直接处理 XML ,也不需要这方面的知识。

* The email package is a library for managing email messages, including MIME and other RFC 2822-based message documents. Unlike smtplib and poplib which actually send and receive messages, the email package has a complete toolset for building or decoding complex message structures (including attachments) and for implementing internet encoding and header protocols.

* email 包是一个邮件消息管理库,可以处理 MIME 或其它基于 RFC 2822 的消息文档。不同于实际发送和接收消息的 smtplibpoplib 模块,email 包有一个用于构建或解析复杂消息结构(包括附件)以及实现互联网编码和头协议的完整工具集。

* The xml.dom and xml.sax packages provide robust support for parsing this popular data interchange format. Likewise, the csv module supports direct reads and writes in a common database format. Together, these modules and packages greatly simplify data interchange between python applications and other tools.

* xml.domxml.sax 包为流行的信息交换格式提供了强大的支持。同样, csv 模块支持在通用数据库格式中直接读写。综合起来,这些模块和包大大简化了 Python 应用程序和其它工具之间的数据交换。

* Internationalization is supported by a number of modules including gettext, locale, and the codecs package.

* 国际化由 gettextlocalecodecs 包支持。