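    Yacc and Bison build a compiler front end by generating source code (C by default; Bison can also emit C++) that then has to be compiled. Sometimes, however, we need to produce a front end dynamically at run time; an extreme example is a "grammar debugger". A debugger can hardly generate a .y file, run yacc on it, compile the result with gcc, execute the binary, and finally read the program's output back in. That is far too cumbersome, so we need to write a different program for generating the front end.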

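    With some free time today, I wrote a piece of code to test whether Syngram can generate a parser dynamically. Syngram was originally designed so that grammars could be written directly in C++, but today's experiment shows that it also copes well with dynamic input: dynamic grammars, dynamic error handlers, dynamic semantic actions, and so on. For an experiment in building a compiler generator, whether the program can handle every result Syngram produces deserves close attention.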

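    The program still uses the Algorithm architecture described in the previous article to process the grammar. It reads two files: a grammar description, and a text file to be parsed with that grammar. The program uses Syngram to parse the first file, converts its content at run time into a grammar description that Syngram accepts, builds a second parser from it, and then uses that parser to process the second file.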

    Here is the grammar accepted by a scientific calculator:
lexical
{
    num='\d+(\.\d+)?'
    ident='[a-zA-Z_]\w*'
    plus='\+'
    minus='\-'
    mul='\*'
    div='/'
    leftbrace='\('
    rightbrace='\)'
    comma=','
    discard='\s+'
}
rule
{
    factor=num                                      ;
    factor=[minus] factor                           ;
    factor=leftbrace exp rightbrace                 ;
    factor=ident[leftbrace param_list rightbrace]   ;
    term=factor                                     ;
    term=term (mul|div) factor                      ;
    exp=term                                        ;
    exp=exp (plus|minus) term                       ;
    param_list=exp [comma param_list]               ;
    program=exp                                     ;
}
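
    The token definitions in the lexical section are ordinary regular expressions. As a rough way to experiment with such patterns outside Syngram, the sketch below checks them with std::regex. This is an assumption for illustration only: Syngram has its own regular expression engine, and its dialect may differ from the ECMAScript one used here. The helper MatchesWhole is hypothetical. The checks also illustrate why the decimal point in a number pattern is usually escaped: an unescaped dot matches any character.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: returns true when the whole input matches the pattern.
// Uses std::regex (ECMAScript dialect), not Syngram's own regex engine.
bool MatchesWhole(const std::string& Pattern, const std::string& Input)
{
    return std::regex_match(Input, std::regex(Pattern));
}
```

    With this helper, the pattern \d+(\.\d+)? accepts "3.14" and rejects "3x14", while the same pattern with an unescaped dot also accepts "3x14".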

    If we change factor=leftbrace exp rightbrace to factor=leftbrace | exp | rightbrace, the grammar contains the invalid circular reference exp->term->factor->exp, and Syngram reports grammar errors:
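Rule <factor> has indirect left recursion of length 3.
Rule <exp> has indirect left recursion of length 3.
Rule <term> has indirect left recursion of length 3.
Rule <program>[0] cannot produce any string.
Rule <exp>[0] cannot produce any string.
Rule <exp>[1] cannot produce any string.
Rule <param_list>[0] cannot produce any string.
Rule <term>[0] cannot produce any string.
Rule <term>[1] cannot produce any string.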
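
    The "length 3" in the first three messages corresponds to the cycle exp->term->factor->exp. How Syngram finds such cycles is not shown in this article; the following is only a minimal sketch of one possible check, assuming the grammar has already been reduced to a graph that records which nonterminals can appear first in each rule. The names LeftGraph, IndirectLeftRecursionLength and CalculatorLeftGraph are hypothetical. Direct left recursion such as exp=exp (plus|minus) term is deliberately skipped, since the original grammar uses it and is accepted by Syngram.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>

// Leftmost-symbol graph: Graph[A] holds every nonterminal that can appear
// first in some production of A.
typedef std::map<std::string, std::set<std::string>> LeftGraph;

// Returns the length of the shortest indirect left-recursion cycle through
// Start (a cycle of at least two rules), or 0 when there is none.
// Implemented as a breadth-first search that ignores self-loops.
int IndirectLeftRecursionLength(const LeftGraph& Graph, const std::string& Start)
{
    std::map<std::string, int> Distance;
    std::queue<std::string> Pending;
    Distance[Start] = 0;
    Pending.push(Start);
    while (!Pending.empty())
    {
        std::string Current = Pending.front();
        Pending.pop();
        LeftGraph::const_iterator Found = Graph.find(Current);
        if (Found == Graph.end()) continue;
        for (const std::string& Next : Found->second)
        {
            if (Next == Current) continue;      // direct left recursion is not counted
            if (Next == Start) return Distance[Current] + 1;
            if (Distance.count(Next) == 0)
            {
                Distance[Next] = Distance[Current] + 1;
                Pending.push(Next);
            }
        }
    }
    return 0;
}

// Hand-written leftmost graph of the calculator grammar. When BrokenFactor is
// true, factor can also start with exp, as in factor=leftbrace | exp | rightbrace.
LeftGraph CalculatorLeftGraph(bool BrokenFactor)
{
    LeftGraph Graph;
    Graph["program"] = {"exp"};
    Graph["exp"] = {"exp", "term"};
    Graph["term"] = {"term", "factor"};
    Graph["param_list"] = {"exp"};
    Graph["factor"] = {"factor"};               // [minus] is optional in factor=[minus] factor
    if (BrokenFactor) Graph["factor"].insert("exp");
    return Graph;
}
```

    For the modified grammar this reports a cycle of length 3 for factor, exp and term, and no cycle for program; for the original grammar it reports none.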

    <term>[0] denotes the first production of term; the index is zero-based.

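    Now let it read an expression as a test:
(1+sin(2))**(sin(log(2,3))+4)
    This expression is clearly wrong because it contains two consecutive multiplication signs, so Syngram reports an error:
--------------------------------------------------------------------------------
(1+sin(2))**(sin(log(2,3))+4)
--------------------------------------------------------------------------------
Encountered token 9, type 5, state <term>[1=  term[0] ( mul[1| div[2] )* factor[3] .
    Token 9 (counting from zero) is the second multiplication sign. Since (1+sin(2)) is a factor and a factor is a term, the rule term=term (*|/) factor has advanced to the point where it expects factor, but the input there is another multiplication sign, so an error is reported.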
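
    To see which token the message refers to, one can split the expression by hand. The sketch below is a small hand-written tokenizer, not Syngram's lexer; Tokenize and its helpers are hypothetical and only know numbers, identifiers and one-character operators. Numbering its output from zero puts the two multiplication signs at positions 8 and 9, which suggests that the index in the error message is zero-based.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

static bool IsDigit(char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; }
static bool IsIdentStart(char c) { return std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_'; }
static bool IsIdentPart(char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_'; }

// Splits a calculator expression into numbers, identifiers and
// one-character tokens. Whitespace is skipped.
std::vector<std::string> Tokenize(const std::string& Input)
{
    std::vector<std::string> Tokens;
    std::size_t i = 0;
    while (i < Input.size())
    {
        if (std::isspace(static_cast<unsigned char>(Input[i])))
        {
            i++;                        // whitespace plays the role of "discard"
            continue;
        }
        std::size_t Start = i;
        if (IsDigit(Input[i]))
        {
            while (i < Input.size() && IsDigit(Input[i])) i++;
            if (i + 1 < Input.size() && Input[i] == '.' && IsDigit(Input[i + 1]))
            {
                i++;                    // optional fractional part
                while (i < Input.size() && IsDigit(Input[i])) i++;
            }
        }
        else if (IsIdentStart(Input[i]))
        {
            while (i < Input.size() && IsIdentPart(Input[i])) i++;
        }
        else
        {
            i++;                        // every other character is a one-character token
        }
        Tokens.push_back(Input.substr(Start, i - Start));
    }
    return Tokens;
}
```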

    Now remove one of the multiplication signs, and Syngram produces the parse result:
--------------------------------------------------------------------------------
(1+sin(2))*(sin(log(2,3))+4)
--------------------------------------------------------------------------------
program = {
  exp = {
    term = {
      term = {
        factor = {
          leftbrace = (
          exp = {
            exp = {
              term = {
                factor = {
                  num = 1
                }
              }
            }
            plus = +
            term = {
              factor = {
                ident = sin
                leftbrace = (
                param_list = {
                  exp = {
                    term = {
                      factor = {
                        num = 2
                      }
                    }
                  }
                }
                rightbrace = )
              }
            }
          }
          rightbrace = )
        }
      }
      mul = *
      factor = {
        leftbrace = (
        exp = {
          exp = {
            term = {
              factor = {
                ident = sin
                leftbrace = (
                param_list = {
                  exp = {
                    term = {
                      factor = {
                        ident = log
                        leftbrace = (
                        param_list = {
                          exp = {
                            term = {
                              factor = {
                                num = 2
                              }
                            }
                          }
                          comma = ,
                          param_list = {
                            exp = {
                              term = {
                                factor = {
                                  num = 3
                                }
                              }
                            }
                          }
                        }
                        rightbrace = )
                      }
                    }
                  }
                }
                rightbrace = )
              }
            }
          }
          plus = +
          term = {
            factor = {
              num = 4
            }
          }
        }
        rightbrace = )
      }
    }
  }
}


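    The output is this long because even the single number "1" is reached through exp->term->factor->num. What is shown here is only the derivation process of the grammar. In practice, a parser usually eliminates this redundant data with a family of inherited data structures: for example, an expression base class ExpressionBase typically has subclasses such as NumberExpression, BinaryOperatorExpression and MethodInvokeExpression. Such structures remove exactly this kind of complexity, where a number "1" has to pass through exp->term->factor->num.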
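
    As an illustration of that idea, here is a minimal sketch that converts a parse tree shaped like the dump above into a small class hierarchy. It is not part of the program in this article: ParseNode, Leaf, Rule and ToExpression are hypothetical, and only numbers, parentheses and binary operators are handled (unary minus and function calls, which would need something like MethodInvokeExpression, are left out and yield a null pointer).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// A parse-tree node: a rule or token name, plus either a token value
// (for leaves) or child nodes.
struct ParseNode
{
    std::string Name;
    std::string Value;
    std::vector<ParseNode> Children;
};

ParseNode Leaf(const std::string& Name, const std::string& Value)
{
    ParseNode Node;
    Node.Name = Name;
    Node.Value = Value;
    return Node;
}

ParseNode Rule(const std::string& Name, const std::vector<ParseNode>& Children)
{
    ParseNode Node;
    Node.Name = Name;
    Node.Children = Children;
    return Node;
}

struct ExpressionBase
{
    virtual ~ExpressionBase() {}
    virtual std::string ToString() const = 0;
};
typedef std::shared_ptr<ExpressionBase> ExpressionPtr;

struct NumberExpression : ExpressionBase
{
    std::string Text;
    std::string ToString() const override { return Text; }
};

struct BinaryOperatorExpression : ExpressionBase
{
    std::string Operator;
    ExpressionPtr Left, Right;
    std::string ToString() const override
    {
        return "(" + Left->ToString() + Operator + Right->ToString() + ")";
    }
};

// Unit productions (exp->term->factor->num) and parentheses create no node
// of their own; only numbers and binary operators survive in the result.
ExpressionPtr ToExpression(const ParseNode& Node)
{
    if (Node.Name == "num")
    {
        auto Number = std::make_shared<NumberExpression>();
        Number->Text = Node.Value;
        return Number;
    }
    if (Node.Children.size() == 1)
    {
        return ToExpression(Node.Children[0]);
    }
    if (Node.Children.size() == 3 && Node.Children[0].Name == "leftbrace")
    {
        return ToExpression(Node.Children[1]);
    }
    if (Node.Children.size() == 3)
    {
        auto Binary = std::make_shared<BinaryOperatorExpression>();
        Binary->Left = ToExpression(Node.Children[0]);
        Binary->Operator = Node.Children[1].Value;
        Binary->Right = ToExpression(Node.Children[2]);
        return Binary;
    }
    return nullptr;     // constructs not covered by this sketch
}
```

    The parse tree of 1+2 contains eight nodes (three of them for the number 1 alone), while the converted tree has only three: one BinaryOperatorExpression and two NumberExpression objects.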

    Here is the program's code. Part of it was already given in the previous article. First, the main function:

#include "..\..\..\Library\Platform\VL_Console.h"
#include "..\..\..\Library\Data\VL_Stream.h"
#include "..\..\..\Library\Data\VL_System.h"
#include "..\Utility\GrammarTree.h"
#include "..\Utility\GrammarAlgorithms.h"

using namespace vl;
using namespace vl::platform;
using namespace vl::stream;
using namespace vl::system;
using namespace compiler;

void vlmain(VL_Console& Con)
{
    Con.SetTitle(L"Compile Syntax/Semantic Builder");
    Con.SetTestMemoryLeaks(true);
    Con.SetPauseOnExit(true);

    // Read the grammar description ("计算器文法.txt" is the calculator grammar file).
    VUnicodeString TestDataPath=VFileName(Con.GetAppPath()).MakeAbsolute(L"..\\TestData\\").GetStrW();
    VUnicodeString GrammarText;
    VL_TextInput(new VL_FileInputStream(TestDataPath+L"计算器文法.txt"),true,vceBOM).Read(GrammarText);

    GrammarProvider Provider;
    GrammarDescription::Ptr gd=Provider.Parse(GrammarText);
    if(Provider.GetErrors().GetCount()==0)
    {
        {
            // Step 1: validate the names used in the grammar.
            GrammarValidateParam ValidateParam;
            ValidateParam.Structure=new GrammarStructure;
            if(!GrammarValidate().Apply(gd,ValidateParam))
            {
                for(VInt i=0;i<ValidateParam.Structure->Errors.GetCount();i++)
                {
                    GrammarValidateError Error=ValidateParam.Structure->Errors[i];
                    VUnicodeString Line;
                    if(Error.TokenError)
                    {
                        Line+=L"Lexical error, ";
                    }
                    else if(Error.RuleError)
                    {
                        Line+=L"Grammar error, ";
                    }
                    else
                    {
                        Line+=L"Unknown error, ";
                    }
                    Line+=L"at definition "+VUnicodeString(Error.Index)+L", ";
                    Line+=Error.Message;
                    Con.Write(Line+L"\r\n");
                }
                return;
            }
        }
        GrammarSimulateParam SimulateParam;
        {
            // Step 2: build the Syngram parser from the grammar description.
            SimulateParam.Simulator=new GrammarSimulator;
            GrammarSimulate Algorithm;
            Algorithm.Apply(gd,&SimulateParam);
            if(SimulateParam.ErrorMessages.GetCount())
            {
                for(VInt i=0;i<SimulateParam.ErrorMessages.GetCount();i++)
                {
                    Con.Write(SimulateParam.ErrorMessages[i]+L"\r\n");
                }
                return;
            }
        }
        {
            // Step 3: parse the input ("计算器表达式.txt" is the calculator expression file).
            Con.Write(GrammarToCode().Apply(gd,L""));
            Con.Write(L"--------------------------------------------------------------------------------");
            VUnicodeString InputExpression=L"";
            VL_TextInput(new VL_FileInputStream(TestDataPath+L"计算器表达式.txt"),true,vceBOM).Read(InputExpression);
            Con.Write(InputExpression+L"\r\n");
            VL_LexerFactoryPtr LexerResult;
            GrammarSimulatorNode::List SynerResult;
            SimulateParam.Simulator->Parse(InputExpression,LexerResult,SynerResult);
            if(SimulateParam.Simulator->ParseErrors.GetCount())
            {
                Con.Write(L"--------------------------------------------------------------------------------");
                for(VInt i=0;i<SimulateParam.Simulator->ParseErrors.GetCount();i++)
                {
                    Con.Write(SimulateParam.Simulator->ParseErrors[i].Message+L"\r\n");
                }
            }
            for(VInt i=0;i<SynerResult.GetCount();i++)
            {
                Con.Write(L"--------------------------------------------------------------------------------");
                Con.Write(SynerResult[i]->ToString(L""));
            }
        }
    }
    else
    {
        for(VInt i=0;i<Provider.GetErrors().GetCount();i++)
        {
            Con.Write(Provider.GetErrors()[i].Message+L"\r\n");
        }
    }
}

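    Next come four algorithms. They convert the grammar's syntax tree to a string (GrammarToString), convert the syntax tree back into the text of a grammar file (GrammarToCode), check the dependencies between grammar definitions (GrammarValidate), and check the logical consistency of the grammar (GrammarSimulate). This also shows that the Algorithm architecture from the previous article helps programmers think in terms of independent, self-contained algorithms, which greatly reduces the maintenance burden.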
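
    The Algorithm base classes themselves belong to the previous article and are not reproduced here. The following is therefore only a guess at their general shape, written with the standard library instead of the VL_ classes: an Apply function that dispatches on the node type and forwards to one Visit overload per node class. All names in it (IVisitor, Expression, Unit, Sequence, AlgorithmEx, ToCode) are hypothetical, and the real GrammarAlgorithmEx may work differently.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Unit;
struct Sequence;

// Type-erased visitor used for double dispatch.
struct IVisitor
{
    virtual ~IVisitor() {}
    virtual void VisitUnit(Unit* Obj) = 0;
    virtual void VisitSequence(Sequence* Obj) = 0;
};

struct Expression
{
    virtual ~Expression() {}
    virtual void Accept(IVisitor& Visitor) = 0;
};

struct Unit : Expression
{
    std::string Name;
    explicit Unit(const std::string& aName) : Name(aName) {}
    void Accept(IVisitor& Visitor) override { Visitor.VisitUnit(this); }
};

struct Sequence : Expression
{
    std::vector<std::shared_ptr<Expression>> Expressions;
    void Accept(IVisitor& Visitor) override { Visitor.VisitSequence(this); }
};

// An algorithm with a result type and a parameter type. Apply picks the
// Visit overload that matches the dynamic type of the node.
template<typename TResult, typename TParam>
class AlgorithmEx
{
    struct Dispatcher : IVisitor
    {
        AlgorithmEx* Owner;
        TParam Param;
        TResult Result;
        Dispatcher(AlgorithmEx* aOwner, TParam aParam) : Owner(aOwner), Param(aParam), Result() {}
        void VisitUnit(Unit* Obj) override { Result = Owner->Visit(Obj, Param); }
        void VisitSequence(Sequence* Obj) override { Result = Owner->Visit(Obj, Param); }
    };
public:
    virtual ~AlgorithmEx() {}
    TResult Apply(Expression* Obj, TParam Param)
    {
        Dispatcher Dispatch(this, Param);
        Obj->Accept(Dispatch);
        return Dispatch.Result;
    }
    virtual TResult Visit(Unit* Obj, TParam Param) = 0;
    virtual TResult Visit(Sequence* Obj, TParam Param) = 0;
};

// One self-contained algorithm: print an expression as grammar text.
class ToCode : public AlgorithmEx<std::string, std::string>
{
public:
    std::string Visit(Unit* Obj, std::string Prefix) override
    {
        return Prefix + Obj->Name;
    }
    std::string Visit(Sequence* Obj, std::string Prefix) override
    {
        std::string Result;
        for (std::size_t i = 0; i < Obj->Expressions.size(); i++)
        {
            if (i) Result += " ";
            Result += Apply(Obj->Expressions[i].get(), "");
        }
        return Prefix + "(" + Result + ")";
    }
};
```

    In a design like this, adding another algorithm means writing one more class with its own Visit overloads; the node classes and the existing algorithms stay untouched.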

    The header file:
#ifndef GRAMMARALGORITHM
#define GRAMMARALGORITHM

#include "GrammarTree.h"
#include "..\..\..\Library\Data\Data\VL_Data_Map.h"
#include "..\..\..\Library\Data\Grammar2\VL_SynTools.h"

namespace compiler
{
    using namespace grammar;

    class GrammarValidateError : public VL_Base
    {
    public:
        typedef VL_List<GrammarValidateError , false>    List;

        VBool                TokenError;
        VBool                RuleError;
        VInt                 Index;
        VUnicodeString       Message;
    };

/*********************************************************************************************************
GrammarToString
*********************************************************************************************************/

    class GrammarToString : public GrammarAlgorithmEx<VUnicodeString , VUnicodeString>
    {
    public:
        VUnicodeString            Visit(GrammarBranch* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarSequence* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarOptional* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarUnit* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarRule* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(LexicalDecl* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarDescription* Obj , VUnicodeString Prefix);
    };

/*********************************************************************************************************
GrammarToCode
*********************************************************************************************************/

    class GrammarToCode : public GrammarAlgorithmEx<VUnicodeString , VUnicodeString>
    {
    public:
        VUnicodeString            Visit(GrammarBranch* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarSequence* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarOptional* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarUnit* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarRule* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(LexicalDecl* Obj , VUnicodeString Prefix);
        VUnicodeString            Visit(GrammarDescription* Obj , VUnicodeString Prefix);
    };

/*********************************************************************************************************
GrammarValidate
*********************************************************************************************************/

    class GrammarStructure : public VL_Base
    {
    public:
        typedef VL_ListedMap<VUnicodeString , VL_AutoPtr<VL_List<VInt , true>>>    _MultiIntMap;
        typedef VL_ListedMap<VUnicodeString , VInt>                                _IntMap;
    public:
        _IntMap                       Tokens;
        _MultiIntMap                  Rules;
        GrammarValidateError::List    Errors;
    };
    class GrammarValidateParam : public VL_Base
    {
    public:
        VL_AutoPtr<GrammarStructure>    Structure;
        VInt                            Index;
    };
    class GrammarValidate : public GrammarAlgorithmEx<VBool , GrammarValidateParam>
    {
    public:
        VBool                    Visit(GrammarBranch* Obj , GrammarValidateParam Param);
        VBool                    Visit(GrammarSequence* Obj , GrammarValidateParam Param);
        VBool                    Visit(GrammarOptional* Obj , GrammarValidateParam Param);
        VBool                    Visit(GrammarUnit* Obj , GrammarValidateParam Param);
        VBool                    Visit(GrammarRule* Obj , GrammarValidateParam Param);
        VBool                    Visit(LexicalDecl* Obj , GrammarValidateParam Param);
        VBool                    Visit(GrammarDescription* Obj , GrammarValidateParam Param);
    };

/*********************************************************************************************************
GrammarSimulate
*********************************************************************************************************/

    class GrammarSimulatorNode : public VL_Base
    {
    public:
        typedef VL_AutoPtr<GrammarSimulatorNode>                Ptr;
        typedef VL_List<Ptr , false , GrammarSimulatorNode*>    List;

        VUnicodeString            TerminatorName;
        VUnicodeString            Value;
        List                      SubExpressions;

        VUnicodeString            ToString(VUnicodeString Prefix);
    };
    class GrammarSimulator : public VL_Base
    {
    public:
        typedef VL_SynRuleHandler<GrammarSimulatorNode::Ptr>               RuleTransformer;
        typedef RuleTransformer::Handler                                   RuleHandler;
        typedef VL_List<VL_AutoPtr<RuleHandler> , false , RuleHandler*>    RuleHandlerList;
        typedef VL_ListedMap<VInt , VUnicodeString>                        TokenIDMap;
    public:
        typedef VL_AutoPtr<GrammarSimulator>                    Ptr;
        typedef VL_AutoPtr<VL_LexerHandler>                     LexerHandler;

        VL_Lexer                  Lexer;
        VL_Syner                  Syner;
        RuleTransformer           Transformer;
        RuleHandlerList           Handlers;
        TokenIDMap                TokenIDs;
        GrammarError::List        ParseErrors;
        LexerHandler              LexerErrorHandler;

        GrammarSimulator();

        VInt                      Parse(VUnicodeString Input , VL_LexerFactoryPtr& LexerResult , GrammarSimulatorNode::List& Result);
    };
    class GrammarSimulateParam : public VL_Base
    {
        typedef VL_ListedMap<VUnicodeString , VSynTerm>    TermMap;
        typedef VL_List<VUnicodeString , false>            StringList;
    public:
        GrammarSimulator::Ptr     Simulator;
        TermMap                   Terms;
        VInt                      UsedStorageIndex;
        StringList                ErrorMessages;
    };
    class GrammarSimulate : public GrammarAlgorithmEx<VSynTerm , GrammarSimulateParam*>
    {

        VSynTerm                Visit(GrammarBranch* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(GrammarSequence* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(GrammarOptional* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(GrammarUnit* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(GrammarRule* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(LexicalDecl* Obj , GrammarSimulateParam* Param);
        VSynTerm                Visit(GrammarDescription* Obj , GrammarSimulateParam* Param);
    };
}

#endif

    The implementation:
#include "GrammarAlgorithms.h"
#include "..\..\..\Library\Data\Grammar2\VL_Regexp.h"

using namespace vl::grammar;

namespace compiler
{

/*********************************************************************************************************
GrammarToString
*********************************************************************************************************/

    VUnicodeString GrammarToString::Visit(GrammarBranch* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"branch {\r\n";
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            Result+=Apply(Obj->Expressions[i],Prefix+L"  ");
        }
        Result+=Prefix+L"}\r\n";
        return Result;
    }

    VUnicodeString GrammarToString::Visit(GrammarSequence* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"sequence {\r\n";
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            Result+=Apply(Obj->Expressions[i],Prefix+L"  ");
        }
        Result+=Prefix+L"}\r\n";
        return Result;
    }

    VUnicodeString GrammarToString::Visit(GrammarOptional* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"optional {\r\n";
        Result+=Apply(Obj->Expression,Prefix+L"  ");
        Result+=Prefix+L"}\r\n";
        return Result;
    }

    VUnicodeString GrammarToString::Visit(GrammarUnit* Obj , VUnicodeString Prefix)
    {
        return Prefix+Obj->Name+L"\r\n";
    }

    VUnicodeString GrammarToString::Visit(GrammarRule* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"rule {\r\n";
        Result+=Prefix+L"  "+Obj->Name+L"\r\n";
        Result+=Apply(Obj->Expression,Prefix+L"  ");
        Result+=Prefix+L"}\r\n";
        return Result;
    }

    VUnicodeString GrammarToString::Visit(LexicalDecl* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"lexical inference {\r\n";
        Result+=Prefix+L"  "+Obj->Name+L"\r\n";
        Result+=Prefix+L"  "+Obj->ProcessedRegex+L"\r\n";
        Result+=Prefix+L"}\r\n";
        return Result;
    }

    VUnicodeString GrammarToString::Visit(GrammarDescription* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"lexical inferences {\r\n";
        for(VInt i=0;i<Obj->Tokens.GetCount();i++)
        {
            Result+=Apply(Obj->Tokens[i],Prefix+L"  ");
        }
        Result+=Prefix+L"}\r\n";
        Result+=Prefix+L"syntax inferences {\r\n";
        for(VInt i=0;i<Obj->Rules.GetCount();i++)
        {
            Result+=Apply(Obj->Rules[i],Prefix+L"  ");
        }
        Result+=Prefix+L"}\r\n";
        return Result;
    }

/*********************************************************************************************************
GrammarToCode
*********************************************************************************************************/

    VUnicodeString GrammarToCode::Visit(GrammarBranch* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            if(i)Result+=L" | ";
            Result+=Apply(Obj->Expressions[i],L"");
        }
        return Prefix+L"("+Result+L")";
    }

    VUnicodeString GrammarToCode::Visit(GrammarSequence* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            if(i)Result+=L" ";
            Result+=Apply(Obj->Expressions[i],L"");
        }
        return Prefix+L"("+Result+L")";
    }

    VUnicodeString GrammarToCode::Visit(GrammarOptional* Obj , VUnicodeString Prefix)
    {
        return Prefix+L"["+Apply(Obj->Expression,L"")+L"]";
    }

    VUnicodeString GrammarToCode::Visit(GrammarUnit* Obj , VUnicodeString Prefix)
    {
        return Prefix+Obj->Name;
    }

    VUnicodeString GrammarToCode::Visit(GrammarRule* Obj , VUnicodeString Prefix)
    {
        return Prefix+Obj->Name+L" = "+Apply(Obj->Expression,L"");
    }

    VUnicodeString GrammarToCode::Visit(LexicalDecl* Obj , VUnicodeString Prefix)
    {
        return Prefix+Obj->Name+L" = "+Obj->RegularExpression;
    }

    VUnicodeString GrammarToCode::Visit(GrammarDescription* Obj , VUnicodeString Prefix)
    {
        VUnicodeString Result;
        Result+=Prefix+L"lexical\r\n{\r\n";
        for(VInt i=0;i<Obj->Tokens.GetCount();i++)
        {
            Result+=Apply(Obj->Tokens[i],L"  ")+L"\r\n";
        }
        Result+=Prefix+L"}\r\n";
        Result+=Prefix+L"syntax\r\n{\r\n";
        for(VInt i=0;i<Obj->Rules.GetCount();i++)
        {
            Result+=Apply(Obj->Rules[i],L"  ")+L"\r\n";
        }
        Result+=Prefix+L"}\r\n";
        return Result;
    }

/*********************************************************************************************************
GrammarValidate
*********************************************************************************************************/

    VBool GrammarValidate::Visit(GrammarBranch* Obj , GrammarValidateParam Param)
    {
        VBool Result=true;
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            Result=Apply(Obj->Expressions[i],Param) && Result;
        }
        return Result;
    }

    VBool GrammarValidate::Visit(GrammarSequence* Obj , GrammarValidateParam Param)
    {
        VBool Result=true;
        for(VInt i=0;i<Obj->Expressions.GetCount();i++)
        {
            Result=Apply(Obj->Expressions[i],Param) && Result;
        }
        return Result;
    }

    VBool GrammarValidate::Visit(GrammarOptional* Obj , GrammarValidateParam Param)
    {
        return Apply(Obj->Expression,Param);
    }

    VBool GrammarValidate::Visit(GrammarUnit* Obj , GrammarValidateParam Param)
    {
        if(Obj->Name==L"discard")
        {
            GrammarValidateError Error;
            Error.Index=Param.Index;
            Error.RuleError=true;
            Error.TokenError=false;
            Error.Message=L"The token \"discard\" cannot be used in a grammar rule; \"discard\" may only be used to define tokens that should be skipped.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
        else if(Param.Structure->Rules.Exists(Obj->Name) || Param.Structure->Tokens.Exists(Obj->Name))
        {
            return true;
        }
        else
        {
            GrammarValidateError Error;
            Error.Index=Param.Index;
            Error.RuleError=true;
            Error.TokenError=false;
            Error.Message=L"Identifier \""+Obj->Name+L"\" does not exist and cannot be used in a production.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
    }

    VBool GrammarValidate::Visit(GrammarRule* Obj , GrammarValidateParam Param)
    {
        if(Obj->Name==L"discard")
        {
            GrammarValidateError Error;
            Error.Index=Param.Index;
            Error.RuleError=true;
            Error.TokenError=false;
            Error.Message=L"\"discard\" may only be used to define tokens that should be skipped.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
        Param.Structure->Rules[Obj->Name]->Add(Param.Index);
        return Apply(Obj->Expression,Param);
    }

    VBool GrammarValidate::Visit(LexicalDecl* Obj , GrammarValidateParam Param)
    {
        if(Obj->Name==L"program")
        {
            GrammarValidateError Error;
            Error.Index=Param.Index;
            Error.RuleError=false;
            Error.TokenError=true;
            Error.Message=L"\"program\" may only be used to define the start symbol of the grammar.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
        else if(Param.Structure->Tokens.Exists(Obj->Name))
        {
            GrammarValidateError Error;
            Error.Index=Param.Index;
            Error.RuleError=false;
            Error.TokenError=true;
            Error.Message=L"Token definition \""+Obj->Name+L"\" already exists.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
        else
        {
            {
                // Strip the surrounding quotes from the regular expression
                // literal; a doubled quote ('') stands for one quote character.
                PWChar Buffer=new VWChar[Obj->RegularExpression.Length()];
                VInt Len=0;
                PWChar Read=Obj->RegularExpression.Buffer();
                Read++;
                while(true)
                {
                    if(Read[0]==L'\'')
                    {
                        if(Read[1]==L'\'')
                        {
                            Buffer[Len++]=L'\'';
                            // Skip the second quote of the escaped pair so that
                            // it is not mistaken for the closing quote.
                            Read++;
                        }
                        else
                        {
                            Buffer[Len]=L'\0';
                            break;
                        }
                    }
                    else
                    {
                        Buffer[Len++]=Read[0];
                    }
                    Read++;
                }
                Obj->ProcessedRegex=Buffer;
                delete[] Buffer;
            }
            VL_RegExpResult Result=RegularExpressionAnalysis(Obj->ProcessedRegex,false);
            if(Result.Error)
            {
                GrammarValidateError Error;
                Error.Index=Param.Index;
                Error.RuleError=false;
                Error.TokenError=true;
                Error.Message=L"Position: "+VUnicodeString(Result.ErrorPosition)+L", regular expression error: "+Result.ErrorMessage;
                Param.Structure->Errors.Add(Error);
                return false;
            }
            else
            {
                Param.Structure->Tokens.Add(Obj->Name,Param.Index);
                return true;
            }
        }
    }

    VBool GrammarValidate::Visit(GrammarDescription* Obj , GrammarValidateParam Param)
    {
        VBool Result=true;
        // Tokens first, so that rules can refer to them.
        for(VInt i=0;i<Obj->Tokens.GetCount();i++)
        {
            Param.Index=i;
            Result=Apply(Obj->Tokens[i],Param) && Result;
        }
        // Register every rule name before validating rule bodies,
        // so that rules may refer to rules defined later.
        for(VInt i=0;i<Obj->Rules.GetCount();i++)
        {
            VUnicodeString Name=Obj->Rules[i]->Name;
            if(Param.Structure->Tokens.Exists(Name))
            {
                GrammarValidateError Error;
                Error.Index=i;
                Error.RuleError=true;
                Error.TokenError=false;
                Error.Message=L"Identifier \""+Name+L"\" is already defined as a token and cannot be redefined as a nonterminal.";
                Param.Structure->Errors.Add(Error);
                return false;
            }
            if(!Param.Structure->Rules.Exists(Name))
            {
                Param.Structure->Rules.Add(Name,new VL_List<VInt , true>);
            }
        }
        for(VInt i=0;i<Obj->Rules.GetCount();i++)
        {
            Param.Index=i;
            Result=Apply(Obj->Rules[i],Param) && Result;
        }
        if(!Param.Structure->Rules.Exists(L"program"))
        {
            GrammarValidateError Error;
            Error.Index=-1;
            Error.RuleError=false;
            Error.TokenError=false;
            Error.Message=L"Nonterminal \"program\" not found; the grammar must use \"program\" as its start symbol.";
            Param.Structure->Errors.Add(Error);
            return false;
        }
        return Result;
    }

341 /*********************************************************************************************************
342 GrammarSimulate
343 *********************************************************************************************************/
344 
345     VUnicodeString GrammarSimulatorNode::ToString(VUnicodeString Prefix)
346     {
347         if(SubExpressions.GetCount())
348         {
349             VUnicodeString Result;
350             Result=Prefix+TerminatorName+L" = {\r\n";
351             for(VInt i=0;i<SubExpressions.GetCount();i++)
352             {
353                 Result+=SubExpressions[i]->ToString(Prefix+L"  ");
354             }
355             Result+=Prefix+L"}\r\n";
356             return Result;
357         }
358         else
359         {
360             return Prefix+TerminatorName+L" = "+Value+L"\r\n";
361         }
362     }
363 
364     class GrammarSimulatorLexerHandler : public VL_LexerHandler
365     {
366     public:
367         GrammarSimulator* Simulator;
368 
369         GrammarSimulatorLexerHandler(GrammarSimulator* aSimulator):VL_LexerHandler(L"",false)
370         {
371             Simulator=aSimulator;
372         }
373 
374         void Handle(VL_FreeLexer<VLS_LexerTokenType>::InternalTokenData& TokenData)
375         {
376             Simulator->ParseErrors.Add(inner::Convert(TokenData,L"遇到无法分析的记号:\""+TokenData.Token+L"\""));
377         }
378     };
379 
    // Semantic handler attached to one rule: builds a generic tree node
    // from whatever that rule matched.
    class GrammarSimulatorHandler : public GrammarSimulator::RuleHandler
    {
    public:
        GrammarSimulator* Simulator;
        // Lexer result of the input being parsed; set and reset by GrammarSimulator::Parse.
        VL_LexerFactoryPtr Factory;
        // Name of the rule this handler belongs to.
        VUnicodeString Name;

        GrammarSimulatorHandler(GrammarSimulator* aSimulator , VUnicodeString aName , VL_SynRuleItem* RuleItem):GrammarSimulator::RuleHandler(RuleItem)
        {
            Simulator=aSimulator;
            Name=aName;
        }

        GrammarSimulatorNode::Ptr Handle(GrammarSimulator::RuleTransformer::IndexedData& Data)
        {
            GrammarSimulatorNode::Ptr Node=new GrammarSimulatorNode;
            Node->TerminatorName=Name;
            for(VInt i=0;i<Data.KeyCount();i++)
            {
                GrammarSimulator::RuleTransformer::StoreType& Type=Data.ValueOfIndex(i);
                if(Type.Token)
                {
                    // A matched token becomes a leaf carrying the token name and its text.
                    GrammarSimulatorNode::Ptr Token=new GrammarSimulatorNode;
                    Token->TerminatorName=Simulator->TokenIDs[Type.Token->TokenID()];
                    Token->Value=Factory->GetDataOfStream(Type.Token).Token;
                    Node->SubExpressions.Add(Token);
                }
                else
                {
                    // A matched sub-rule contributes the node built by its own handler.
                    Node->SubExpressions.Add(Type.Data);
                }
            }
            return Node;
        }
    };

    GrammarSimulator::GrammarSimulator():Lexer(true)
    {
        // Register the handler that records lexical errors.
        LexerErrorHandler=new GrammarSimulatorLexerHandler(this);
        Lexer.AddHandler(LexerErrorHandler.Object());
    }

    // Lexes and parses Input. Stores the lexer result in LexerResult, appends one
    // tree per entry of the parser's result list to SynerResult, and returns
    // the number of trees in SynerResult.
    VInt GrammarSimulator::Parse(VUnicodeString Input , VL_LexerFactoryPtr& LexerResult , GrammarSimulatorNode::List& SynerResult)
    {
        VL_SynMacInsListList Results;
        VL_SynTokenErrorList Errors;
        ParseErrors.Clear();

        LexerResult=Lexer.Parse(Input);
        // Give every rule handler access to the token data for the duration of the parse.
        for(VInt i=0;i<Handlers.GetCount();i++)
        {
            dynamic_cast<GrammarSimulatorHandler*>(Handlers[i].Object())->Factory=LexerResult;
        }
        if(LexerResult)
        {
            Syner.Parse(LexerResult->GetSynTokenFactory(),Results,Errors);
            // Convert syntax errors into GrammarError records.
            for(VInt i=0;i<Errors.GetCount();i++)
            {
                VL_FreeLexer<VLS_LexerTokenType>::InternalTokenData& TokenData=LexerResult->GetDataOfPosition(Errors[i].Position);
                // NOTE: testing the address of a reference only works if GetDataOfPosition
                // binds the reference to a null pointer when no token matches. Standard C++
                // treats that as undefined behavior, so a compiler may drop this check.
                if(&TokenData)
                {
                    ParseErrors.Add(inner::Convert(TokenData,Errors[i].Message));
                }
                else
                {
                    // No token at this position: report the error without location information.
                    GrammarError Error;
                    Error.LineInFile=-1;
                    Error.PosInFile=-1;
                    Error.PosInLine=-1;
                    Error.Message=Errors[i].Message;
                    ParseErrors.Add(Error);
                }
            }
            // Run the rule handlers over each parse result to build the trees.
            for(VInt i=0;i<Results.GetCount();i++)
            {
                SynerResult.Add(Transformer.Run(Results[i].Object(),LexerResult->GetSynTokenFactory()));
            }
        }
        // Detach the handlers from the lexer result again.
        for(VInt i=0;i<Handlers.GetCount();i++)
        {
            dynamic_cast<GrammarSimulatorHandler*>(Handlers[i].Object())->Factory=0;
        }
        return SynerResult.GetCount();
    }

    // a | b | c : combine the alternatives with operator|.
    VSynTerm GrammarSimulate::Visit(GrammarBranch* Obj , GrammarSimulateParam* Param)
    {
        VSynTerm Term=Apply(Obj->Expressions[0],Param);
        for(VInt i=1;i<Obj->Expressions.GetCount();i++)
        {
            Term=Term | Apply(Obj->Expressions[i],Param);
        }
        return Term;
    }

    // a b c : concatenate the items with operator+.
    VSynTerm GrammarSimulate::Visit(GrammarSequence* Obj , GrammarSimulateParam* Param)
    {
        VSynTerm Term=Apply(Obj->Expressions[0],Param);
        for(VInt i=1;i<Obj->Expressions.GetCount();i++)
        {
            Term=Term + Apply(Obj->Expressions[i],Param);
        }
        return Term;
    }

    // [a] : wrap the inner expression with Opt.
    VSynTerm GrammarSimulate::Visit(GrammarOptional* Obj , GrammarSimulateParam* Param)
    {
        return Opt(Apply(Obj->Expression,Param));
    }

    // A token or rule name: look up its term and tag this occurrence
    // with the next storage index of the current rule.
    VSynTerm GrammarSimulate::Visit(GrammarUnit* Obj , GrammarSimulateParam* Param)
    {
        return Param->Terms[Obj->Name][Param->UsedStorageIndex++];
    }

    // name = expression ; : bind the rule's term to its expression and attach
    // a handler that builds the tree node for this rule.
    VSynTerm GrammarSimulate::Visit(GrammarRule* Obj , GrammarSimulateParam* Param)
    {
        // Storage indices restart at 0 for every rule.
        Param->UsedStorageIndex=0;
        VL_SynRuleItem* RuleItem=Param->Simulator->Syner.Infer(Param->Terms[Obj->Name])=Apply(Obj->Expression,Param);
        GrammarSimulatorHandler* Handler=new GrammarSimulatorHandler(Param->Simulator.Object(),Obj->Name,RuleItem);
        Param->Simulator->Handlers.Add(Handler);
        Param->Simulator->Transformer.AddHandler(Handler);
        return VSynTerm();
    }

    // name='regex' : register the token with the lexer and return its term.
    // The boolean flag is true only for the token named "discard"
    // (whitespace in the sample grammar).
    VSynTerm GrammarSimulate::Visit(LexicalDecl* Obj , GrammarSimulateParam* Param)
    {
        VInt ID=Param->Simulator->Lexer.AddHandler(Obj->ProcessedRegex,(Obj->Name==L"discard"),Obj->Name);
        Param->Simulator->TokenIDs.Add(ID,Obj->Name);
        return Param->Simulator->Syner.Token(ID,Obj->Name);
    }

    // Builds the whole analyzer from a grammar description. Grammar errors
    // reported by Syngram are collected in Param->ErrorMessages.
    VSynTerm GrammarSimulate::Visit(GrammarDescription* Obj , GrammarSimulateParam* Param)
    {
        try
        {
            // Pass 1: create a term for every token.
            for(VInt i=0;i<Obj->Tokens.GetCount();i++)
            {
                Param->Terms.Add(Obj->Tokens[i]->Name,Apply(Obj->Tokens[i],Param));
            }
            // Pass 2: create a term for every distinct rule name, so that rules
            // can refer to each other regardless of declaration order.
            for(VInt i=0;i<Obj->Rules.GetCount();i++)
            {
                VUnicodeString Name=Obj->Rules[i]->Name;
                if(!Param->Terms.Exists(Name))
                {
                    Param->Terms.Add(Name,Param->Simulator->Syner.Rule(i,Name));
                }
            }
            // Pass 3: bind every rule to its expression.
            for(VInt i=0;i<Obj->Rules.GetCount();i++)
            {
                Apply(Obj->Rules[i],Param);
            }
            // "program" is the start symbol.
            Param->Simulator->Lexer.Initialize();
            Param->Simulator->Syner.Initialize(Param->Terms[L"program"]);
            Param->Simulator->Transformer.Initialize(Param->Simulator->Syner.GetRuleItems());
        }
        catch(VL_SynError& Error)
        {
            Param->ErrorMessages=Error.Messages;
        }
        return VSynTerm();
    }
}
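
The GrammarSimulate visitor in the listing folds a grammar expression tree into Syngram terms with `|`, `+` and `Opt`. The following is a minimal, self-contained sketch of that folding idea only. It does not use Syngram: `Term`, `Expr`, `MakeUnit`, `MakeNode` and `Build` are hypothetical stand-ins invented for illustration, and `Term` merely records a textual form so that the result can be inspected. The storage indices used by the real GrammarUnit visitor are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for VSynTerm: it only stores a printable form.
struct Term
{
    std::string Text;
};

// Alternation, concatenation and optionality, mirroring the operators used in the listing.
Term operator|(const Term& a, const Term& b) { return Term{"(" + a.Text + " | " + b.Text + ")"}; }
Term operator+(const Term& a, const Term& b) { return Term{a.Text + " " + b.Text}; }
Term Opt(const Term& a) { return Term{"[" + a.Text + "]"}; }

// Hypothetical grammar AST; the node kinds follow GrammarUnit, GrammarSequence,
// GrammarBranch and GrammarOptional.
struct Expr
{
    enum Kind { Unit, Sequence, Branch, Optional };
    Kind kind;
    std::string Name;                       // used by Unit
    std::vector<std::shared_ptr<Expr>> Sub; // used by the other kinds
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr MakeUnit(const std::string& name)
{
    return std::make_shared<Expr>(Expr{Expr::Unit, name, {}});
}

ExprPtr MakeNode(Expr::Kind kind, std::vector<ExprPtr> sub)
{
    return std::make_shared<Expr>(Expr{kind, "", std::move(sub)});
}

// Folds an expression tree into a Term, one case per node kind.
Term Build(const ExprPtr& e)
{
    switch (e->kind)
    {
    case Expr::Unit:
        return Term{e->Name};
    case Expr::Optional:
        assert(e->Sub.size() == 1);
        return Opt(Build(e->Sub[0]));
    default:
    {
        // Sequence and Branch both fold left over their children.
        assert(!e->Sub.empty());
        Term t = Build(e->Sub[0]);
        for (std::size_t i = 1; i < e->Sub.size(); i++)
        {
            Term next = Build(e->Sub[i]);
            t = (e->kind == Expr::Branch) ? (t | next) : (t + next);
        }
        return t;
    }
    }
}
```

For example, building the tree for the right-hand side of `term=term (mul|div) factor` from the sample grammar with `MakeNode(Expr::Sequence, {MakeUnit("term"), MakeNode(Expr::Branch, {MakeUnit("mul"), MakeUnit("div")}), MakeUnit("factor")})` and passing it to `Build` gives a `Term` whose text is `term (mul | div) factor`. The asserts make explicit a precondition that the listing leaves implicit: a sequence or branch must have at least one child before `Expressions[0]` is read.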
Posted on 2008-09-06 02:45 by 陈梓瀚 (vczh) | Category: Scripting Technology

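Comments:
# re: Project Experiment 2: Dynamically Generating a Compiler Front End 2008-09-06 06:37 | Lnn
Awesome!

# re: Project Experiment 2: Dynamically Generating a Compiler Front End 2008-09-06 15:31 | 沈臻豪(foxtail)
Is this the demo you have been working on lately?

# re: Project Experiment 2: Dynamically Generating a Compiler Front End 2012-12-30 06:07 | ArthasLee
The super code bunny is posting code to scare us kids again ~_~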
