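While browsing sf.net recently I came across UCC, a C compiler written by a Chinese developer. It is a real classic, and the author is very low-key: no name or contact details were left behind.
A Google search turned up only the short introduction below: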

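Introduction from: http://bbs.ustc.edu.cn/cgi-bin/bbscon?bn=CSArch&fn=M48291327
You have taken a semester of compiler principles, yet you are still unsure how to implement a real compiler;
or you have learned some good optimization algorithms, or have ideas of your own, and want to try them out in gcc, only to find that gcc
is so large that you do not know where to start.
If you have ever felt this way, ucc may be worth a try.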

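ucc is a compiler that conforms to the ANSI C89 standard, written in about 15,000 lines of C. It currently supports Linux and Windows
on x86, and it can compile itself correctly and run successfully. Its main features: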

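1. The code structure is clear and easy to follow, with fairly detailed Chinese documentation describing the implementation
2. It uses three-address code as its intermediate representation and builds a control-flow graph of basic blocks, which suits many optimization algorithms
3. Compilation is fast. The lexer, the parser and the target code generator are all hand-written (the code
   generator was originally going to be produced by a tool such as burg, but that might have made the code harder to understand,
   so in the end a simple code generator was written by hand)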
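
To make point 2 more concrete, here is a minimal sketch of how a list of three-address instructions might be split into basic blocks by finding "leaders". The types and names (`Instr`, `Opcode`, `find_leaders`) are hypothetical and are not taken from ucc's source; this only illustrates the general textbook technique.

```c
#include <assert.h>

/* Hypothetical three-address instruction: dst = src1 op src2, or a jump. */
typedef enum { OP_ADD, OP_MUL, OP_MOV, OP_JMP, OP_JZ, OP_RET } Opcode;

typedef struct {
    Opcode op;
    int dst, src1, src2;  /* operands as temporary/variable indices, -1 if unused */
    int target;           /* instruction index for OP_JMP / OP_JZ, -1 otherwise */
} Instr;

/* Mark leaders: the first instruction, every jump target, and every
 * instruction that follows a jump or return. Each leader starts a new
 * basic block. Returns the number of basic blocks found. */
int find_leaders(const Instr *code, int n, int *is_leader)
{
    int i, count = 0;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        is_leader[i] = 0;
    if (n > 0)
        is_leader[0] = 1;

    for (i = 0; i < n; i++) {
        int is_jump = (code[i].op == OP_JMP || code[i].op == OP_JZ);
        if (is_jump) {
            assert(code[i].target >= 0 && code[i].target < n);
            is_leader[code[i].target] = 1;
        }
        if ((is_jump || code[i].op == OP_RET) && i + 1 < n)
            is_leader[i + 1] = 1;
    }

    for (i = 0; i < n; i++)
        count += is_leader[i];
    return count;
}
```

Once the leaders are known, each block runs from one leader up to (but not including) the next, and the jump targets give the edges of the control-flow graph.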

You can download the package from
http://sourceforge.net/projects/ucc
and I hope it helps everyone who is learning about compilers.

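Since some people cannot reach sf.net, I have uploaded a copy here. One more classic codebase for studying compilers, haha.
Download: http://www.cppblog.com/Files/ngaut/ucc160.zip
I am also attaching a grammar for the C language, which appears to be for C89.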

<translation-unit> ::= {<external-declaration>}*

<external-declaration> ::= <function-definition>
| <declaration>

<function-definition> ::= {<declaration-specifier>}* <declarator> {<declaration>}* <compound-statement>

<declaration-specifier> ::= <storage-class-specifier>
| <type-specifier>
| <type-qualifier>

<storage-class-specifier> ::= "auto"
| "register"
| "static"
| "extern"
| "typedef"

<type-specifier> ::= "void"
| "char"
| "short"
| "int"
| "long"
| "float"
| "double"
| "signed"
| "unsigned"
| <struct-or-union-specifier>
| <enum-specifier>
| <typedef-name>

<struct-or-union-specifier> ::= <struct-or-union> <identifier> "{" {<struct-declaration>}+ "}"
| <struct-or-union> "{" {<struct-declaration>}+ "}"
| <struct-or-union> <identifier>

<struct-or-union> ::= "struct"
| "union"

<struct-declaration> ::= {<specifier-qualifier>}* <struct-declarator-list> ";"

<specifier-qualifier> ::= <type-specifier>
| <type-qualifier>

<struct-declarator-list> ::= <struct-declarator>
| <struct-declarator-list> "," <struct-declarator>

<struct-declarator> ::= <declarator>
| <declarator> ":" <constant-expression>
| ":" <constant-expression>

<declarator> ::= {<pointer>}? <direct-declarator>

<pointer> ::= "*" {<type-qualifier>}* {<pointer>}?

<type-qualifier> ::= "const"
| "volatile"

<direct-declarator> ::= <identifier>
| "(" <declarator> ")"
| <direct-declarator> "[" {<constant-expression>}? "]"
| <direct-declarator> "(" <parameter-type-list> ")"
| <direct-declarator> "(" {<identifier>}* ")"
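
The `<declarator>` and `<direct-declarator>` rules explain why C declarations read "inside out": the `[...]` and `(...)` suffixes belong to `<direct-declarator>`, so they bind more tightly than the `*` prefix from `<pointer>`, unless parentheses regroup them. A small illustrative sketch (all identifiers are made up for the example):

```c
#include <stddef.h>

/* "[3]" binds first: an array of 3 pointers to int. */
int *ptr_array[3];

/* Parentheses regroup: a pointer to an array of 3 ints. */
int (*array_ptr)[3];

/* "(...)" binds before "*": a function returning pointer to int. */
int *returns_ptr(int *x) { return x; }

/* Parentheses regroup: a pointer to a function taking and returning int*. */
int *(*func_ptr)(int *) = returns_ptr;

/* sizeof does not evaluate its operand here, so array_ptr may be null. */
size_t size_of_ptr_array(void)     { return sizeof ptr_array; }
size_t size_of_pointed_array(void) { return sizeof *array_ptr; }
```

Comparing the two sizes is a quick way to check which reading the compiler chose: the first is three pointers wide, the second three `int`s wide.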

<constant-expression> ::= <conditional-expression>

<conditional-expression> ::= <logical-or-expression>
| <logical-or-expression> "?" <expression> ":" <conditional-expression>

<logical-or-expression> ::= <logical-and-expression>
| <logical-or-expression> "||" <logical-and-expression>

<logical-and-expression> ::= <inclusive-or-expression>
| <logical-and-expression> "&&" <inclusive-or-expression>

<inclusive-or-expression> ::= <exclusive-or-expression>
| <inclusive-or-expression> "|" <exclusive-or-expression>

<exclusive-or-expression> ::= <and-expression>
| <exclusive-or-expression> "^" <and-expression>

<and-expression> ::= <equality-expression>
| <and-expression> "&" <equality-expression>

<equality-expression> ::= <relational-expression>
| <equality-expression> "==" <relational-expression>
| <equality-expression> "!=" <relational-expression>

<relational-expression> ::= <shift-expression>
| <relational-expression> "<" <shift-expression>
| <relational-expression> ">" <shift-expression>
| <relational-expression> "<=" <shift-expression>
| <relational-expression> ">=" <shift-expression>

<shift-expression> ::= <additive-expression>
| <shift-expression> "<<" <additive-expression>
| <shift-expression> ">>" <additive-expression>

<additive-expression> ::= <multiplicative-expression>
| <additive-expression> "+" <multiplicative-expression>
| <additive-expression> "-" <multiplicative-expression>

<multiplicative-expression> ::= <cast-expression>
| <multiplicative-expression> "*" <cast-expression>
| <multiplicative-expression> "/" <cast-expression>
| <multiplicative-expression> "%" <cast-expression>
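
The rules for `<additive-expression>` and `<multiplicative-expression>` above are left-recursive, so a hand-written recursive-descent parser cannot code them directly (the function would call itself without consuming input). The usual approach is to turn the recursion into a loop, which also yields left associativity. Below is a minimal sketch that evaluates instead of building a tree and only handles integer constants and parentheses; the function names are invented for illustration and are not ucc's.

```c
#include <ctype.h>

static const char *p;  /* current position in the input string */

static void skip_ws(void) { while (isspace((unsigned char)*p)) p++; }

static long parse_additive(void);

/* primary ::= integer-constant | "(" additive ")" */
static long parse_primary(void)
{
    long v = 0;
    skip_ws();
    if (*p == '(') {
        p++;
        v = parse_additive();
        skip_ws();
        if (*p == ')') p++;   /* sketch: a missing ")" is silently accepted */
        return v;
    }
    while (isdigit((unsigned char)*p))
        v = v * 10 + (*p++ - '0');
    return v;
}

/* multiplicative ::= primary { ("*" | "/" | "%") primary } */
static long parse_multiplicative(void)
{
    long v = parse_primary();
    for (;;) {
        long rhs;
        char op;
        skip_ws();
        op = *p;
        if (op != '*' && op != '/' && op != '%') return v;
        p++;
        rhs = parse_primary();
        if (op == '*') v *= rhs;
        else if (rhs == 0) v = 0;  /* sketch: no error reporting for division by zero */
        else if (op == '/') v /= rhs;
        else v %= rhs;
    }
}

/* additive ::= multiplicative { ("+" | "-") multiplicative } */
static long parse_additive(void)
{
    long v = parse_multiplicative();
    for (;;) {
        long rhs;
        char op;
        skip_ws();
        op = *p;
        if (op != '+' && op != '-') return v;
        p++;
        rhs = parse_multiplicative();
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Entry point: evaluate an expression held in a NUL-terminated string. */
long eval(const char *s) { p = s; return parse_additive(); }
```

Because each loop folds the new operand into the running value, `eval("10 - 4 - 3")` groups as `(10 - 4) - 3`, matching the left-recursive grammar.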

<cast-expression> ::= <unary-expression>
| "(" <type-name> ")" <cast-expression>

<unary-expression> ::= <postfix-expression>
| "++" <unary-expression>
| "--" <unary-expression>
| <unary-operator> <cast-expression>
| "sizeof" <unary-expression>
| "sizeof" "(" <type-name> ")"

<postfix-expression> ::= <primary-expression>
| <postfix-expression> "[" <expression> "]"
| <postfix-expression> "(" {<assignment-expression>}* ")"
| <postfix-expression> "." <identifier>
| <postfix-expression> "->" <identifier>
| <postfix-expression> "++"
| <postfix-expression> "--"

<primary-expression> ::= <identifier>
| <constant>
| <string>
| "(" <expression> ")"

<constant> ::= <integer-constant>
| <character-constant>
| <floating-constant>
| <enumeration-constant>

<expression> ::= <assignment-expression>
| <expression> "," <assignment-expression>

<assignment-expression> ::= <conditional-expression>
| <unary-expression> <assignment-operator> <assignment-expression>

<assignment-operator> ::= "="
| "*="
| "/="
| "%="
| "+="
| "-="
| "<<="
| ">>="
| "&="
| "^="
| "|="

<unary-operator> ::= "&"
| "*"
| "+"
| "-"
| "~"
| "!"

<type-name> ::= {<specifier-qualifier>}+ {<abstract-declarator>}?

<parameter-type-list> ::= <parameter-list>
| <parameter-list> "," "..."

<parameter-list> ::= <parameter-declaration>
| <parameter-list> "," <parameter-declaration>

<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
| {<declaration-specifier>}+ <abstract-declarator>
| {<declaration-specifier>}+

<abstract-declarator> ::= <pointer>
| <pointer> <direct-abstract-declarator>
| <direct-abstract-declarator>

<direct-abstract-declarator> ::= "(" <abstract-declarator> ")"
| {<direct-abstract-declarator>}? "[" {<constant-expression>}? "]"
| {<direct-abstract-declarator>}? "(" {<parameter-type-list>}? ")"

<enum-specifier> ::= "enum" <identifier> "{" <enumerator-list> "}"
| "enum" "{" <enumerator-list> "}"
| "enum" <identifier>

<enumerator-list> ::= <enumerator>
| <enumerator-list> "," <enumerator>

<enumerator> ::= <identifier>
| <identifier> "=" <constant-expression>

<typedef-name> ::= <identifier>

<declaration> ::= {<declaration-specifier>}+ {<init-declarator>}* ";"

<init-declarator> ::= <declarator>
| <declarator> "=" <initializer>

<initializer> ::= <assignment-expression>
| "{" <initializer-list> "}"
| "{" <initializer-list> "," "}"

<initializer-list> ::= <initializer>
| <initializer-list> "," <initializer>

<compound-statement> ::= "{" {<declaration>}* {<statement>}* "}"

<statement> ::= <labeled-statement>
| <expression-statement>
| <compound-statement>
| <selection-statement>
| <iteration-statement>
| <jump-statement>

<labeled-statement> ::= <identifier> ":" <statement>
| "case" <constant-expression> ":" <statement>
| "default" ":" <statement>

<expression-statement> ::= {<expression>}? ";"

<selection-statement> ::= "if" "(" <expression> ")" <statement>
| "if" "(" <expression> ")" <statement> "else" <statement>
| "switch" "(" <expression> ")" <statement>
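
The two `if` alternatives make this grammar ambiguous for nested `if`s (the "dangling else" problem): `if (a) if (b) s1; else s2;` can be derived in two ways. C resolves this by attaching each `else` to the nearest unmatched `if`, which a hand-written recursive-descent parser gets naturally by consuming an `else` as soon as it sees one. A minimal demonstration, with a hypothetical function name; some compilers may warn about the missing braces, which are left out on purpose:

```c
/* Returns 1 if the inner "then" branch runs, 2 if the else branch runs,
 * and 0 if neither runs. */
int classify(int a, int b)
{
    int r = 0;
    if (a)
        if (b)
            r = 1;
        else        /* pairs with "if (b)", not with "if (a)" */
            r = 2;
    return r;
}
```

If the `else` belonged to the outer `if`, `classify(0, 0)` would return 2; under the C rule it returns 0.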

<iteration-statement> ::= "while" "(" <expression> ")" <statement>
| "do" <statement> "while" "(" <expression> ")" ";"
| "for" "(" {<expression>}? ";" {<expression>}? ";" {<expression>}? ")" <statement>

<jump-statement> ::= "goto" <identifier> ";"
| "continue" ";"
| "break" ";"
| "return" {<expression>}? ";"

from http://www.cppblog.com/ngaut

This grammar was adapted from the one in section A13 of The C Programming Language, second edition, by Brian W. Kernighan and Dennis M. Ritchie (Englewood Cliffs, New Jersey: Prentice Hall PTR, 1988; ISBN 0-13-110362-8), pages 234-238.