<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>C++博客-beautykingdom-Essay Category-Compiling Theorem</title><link>http://www.cppblog.com/beautykingdom/category/7624.html</link><description /><language>zh-cn</language><lastBuildDate>Mon, 21 Jul 2008 02:12:25 GMT</lastBuildDate><pubDate>Mon, 21 Jul 2008 02:12:25 GMT</pubDate><ttl>60</ttl><item><title>Compilers &lt;repost&gt;</title><link>http://www.cppblog.com/beautykingdom/archive/2008/07/21/56722.html</link><dc:creator>chatler</dc:creator><author>chatler</author><pubDate>Mon, 21 Jul 2008 00:57:00 GMT</pubDate><guid>http://www.cppblog.com/beautykingdom/archive/2008/07/21/56722.html</guid><wfw:comment>http://www.cppblog.com/beautykingdom/comments/56722.html</wfw:comment><comments>http://www.cppblog.com/beautykingdom/archive/2008/07/21/56722.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.cppblog.com/beautykingdom/comments/commentRss/56722.html</wfw:commentRss><trackback:ping>http://www.cppblog.com/beautykingdom/services/trackbacks/56722.html</trackback:ping><description><![CDATA[<div style="text-indent: 9pt;"><span style="font-size: 9pt; color: maroon;">A compiler is a program that translates a high-level programming language, which is convenient for people to write, read, and maintain, into a low-level machine language that a computer can recognize and run. A compiler takes a source program as input and produces an equivalent program in the target language. The source program is usually written in a high-level language such as </span><span style="font-size: 9pt; color: maroon;">Pascal</span><span style="font-size: 9pt; color: maroon;"> or </span><span style="font-size: 9pt; 
color: maroon;">C++</span><span style="font-size: 9pt; color: maroon;">, while the target language is assembly language or the object code of the target machine, sometimes also called machine code.</span><span style="font-size: 9pt; color: maroon;"><br><br></span><span style="font-size: 9pt; color: maroon;">The main workflow of a modern compiler is as follows:</span><span style="font-size: 9pt; color: maroon;"><br></span><span style="font-size: 9pt; color: maroon;">source code &#8594; preprocessor &#8594; compiler &#8594; assembler &#8594; object code &#8594; linker &#8594; executable</span><span style="font-size: 9pt; color: maroon;"> <br>&nbsp;<br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">How a compiler works</span></strong><span style="font-size: 9pt; color: maroon;"><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">Translation normally goes from source code (usually in a high-level language) to target code (usually in a low-level language or machine language) that a computer or virtual machine can execute directly. However, there are also compilers that go from a low-level language to a high-level one; among these, a compiler that regenerates high-level source from low-level code that was itself produced from a high-level language is called a decompiler. There are also compilers that translate one high-level language into another, and compilers that emit intermediate code requiring further processing (also known as cascading).</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">A typical compiler outputs object files, which consist of machine code together with the names and addresses of entry points and external calls. A set of object files, which need not come from the same compiler as long as the compilers involved use the same output format, can be linked together into an executable program that the user can run directly.</span><span style="font-size: 9pt; color: maroon;"><br><br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">Types of compilers</span></strong><span style="font-size: 9pt; color: maroon;"><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">A compiler may generate target code intended to run in the same environment (the platform, meaning the computer and operating system) as the compiler itself; such a compiler is called a &#8220;native&#8221; compiler. A compiler may also generate target code intended to run on a different platform; such a compiler is called a cross compiler. Cross compilers are very useful when bringing up a new hardware platform. A &#8220;source-to-source compiler&#8221; takes a high-level language as input and also outputs a high-level language. For example, an automatic parallelizing compiler often takes a program in a high-level language as input, transforms the code, and annotates it with parallel code annotations (such as OpenMP) or with language constructs (such as FORTRAN's DOALL statements).</span><span style="font-size: 9pt; color: maroon;"><br><br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">Preprocessor</span></strong><span style="font-size: 9pt; color: maroon;"><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">The preprocessor completes the source program by substituting in predefined fragments of code.</span><span style="font-size: 9pt; color: maroon;"><br><br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">Compiler front end (</span></strong><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: 
maroon;">frontend</span></strong><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">)</span></strong><span style="font-size: 9pt; color: maroon;"><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">The front end is mainly responsible for parsing the input source program, with the lexical analyzer and the syntax analyzer (parser) working together. The lexical analyzer picks out the &#8216;words&#8217; (tokens) of the source program, and the parser assembles these separate tokens, according to a predefined grammar, into meaningful expressions, statements, functions, and so on. For example, given &#8220;</span><span style="font-size: 9pt; color: maroon;">a = b + c;</span><span style="font-size: 9pt; color: maroon;">&#8221;, the lexical analyzer sees the tokens &#8220;</span><span style="font-size: 9pt; color: maroon;">a, =, b, +, c, ;</span><span style="font-size: 9pt; color: maroon;">&#8221;, and the parser, following the grammar, first assembles them into the expression &#8220;</span><span style="font-size: 9pt; color: maroon;">b + c</span><span style="font-size: 9pt; color: maroon;">&#8221; and then into the statement &#8220;</span><span style="font-size: 9pt; color: maroon;">a = b + c;</span><span style="font-size: 9pt; color: maroon;">&#8221;. The front end is also responsible for semantic checking, for example checking whether the variables taking part in an operation have the same type, and for simple error handling. Its final result is often an abstract syntax tree (AST), which the back end can then optimize and process further.</span><span style="font-size: 9pt; color: 
maroon;"><br><br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">Compiler back end (backend)</span></strong><span style="font-size: 9pt; color: maroon;"><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">The compiler back end is mainly responsible for analyzing and optimizing the intermediate representation (IR) and for generating machine code (code generation).</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">Generally speaking, every compiler analysis, optimization, or transformation falls into one of two classes: intraprocedural (within a function) or interprocedural (across functions). Interprocedural analyses and optimizations are clearly more precise, but they take longer to complete.</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">Compiler analysis operates on the intermediate code generated and handed over by the front end. Modern optimizing compilers often represent a program with several levels of intermediate code. High-level IR is close to the form of the input source program; it is language dependent and retains more global information as well as the structure of the source program. Middle-level IR is independent of the input language, and low-level IR resembles machine language. Each analysis or optimization is performed on the level of IR that suits it best.</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">Common compiler analyses include the function call tree, the control flow graph, and, built on top of these, variable define-use and use-define chains (d-u/u-d chains), alias analysis, pointer analysis, data dependence analysis, and so on.</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">The results of these program analyses are prerequisites for compiler optimization and program transformation (compiler transformation). Common optimizations and transformations include function inlining, dead code elimination, loop normalization, loop unrolling, loop fusion and loop fission, array padding, and so on. The goals of optimization and transformation are to reduce code size, to improve the utilization of memory and cache, and to reduce the frequency of disk reads and writes and of accesses to network data. More advanced optimizations can even turn serial code into parallelized, multi-threaded code.</span><span style="font-size: 9pt; color: maroon;"><br><br>&nbsp;</span><span style="font-size: 9pt; color: maroon;">Machine code generation is the process of converting the optimized and transformed intermediate code into machine instructions. Modern compilers mostly adopt the strategy of generating assembly code rather than directly generating object code in binary form (</span><span 
style="font-size: 9pt; color: maroon;">binary object code</span><span style="font-size: 9pt; color: maroon;">). Even in the code generation stage, an advanced compiler still has a great deal of analysis, optimization, and transformation to do, for example how to allocate registers (register allocation), how to choose suitable machine instructions (instruction selection), how to combine several instructions into one, and so on.</span><span style="font-size: 9pt; color: maroon;"><br><br><br></span><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;">Compiled languages versus interpreted languages</span></strong><strong><span style="background: #d9d9d9 none repeat scroll 0% 50%; font-size: 9pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; color: maroon;"><br>&nbsp;</span></strong><span style="font-size: 9pt; color: maroon;">Many people divide high-level programming languages into two classes: compiled languages and interpreted languages. In practice, however, most of these languages can be implemented with either a compiler or an interpreter, and the classification really reflects the most common way the language is implemented. (Some interpreted languages, though, are hard to implement with a compiler, for example those that allow code to be changed while it runs.)</span></div><br><br><div align=right><a style="text-decoration:none;" href="http://www.cppblog.com/beautykingdom/" target="_blank">chatler</a> 2008-07-21 08:57 <a href="http://www.cppblog.com/beautykingdom/archive/2008/07/21/56722.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>