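The lexer is done, though I was too lazy to write the extended regular expression part and would rather start on the parser first.
The regular expression engine exposes a set of interfaces to the lexer, so I gave it a quick test.
The test rules are listed below. I pasted one source file a dozen or so times to grow it to about 3 MB and ran the lexer over it. It came out at roughly 440,000 tokens per second.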
    AddMoreType( scanner, L"ID",                    L"[_a-zA-Z][a-zA-Z0-9_]*"               );
    AddMoreType( scanner, L"ID.IF",                 L"if"                                   );
    AddMoreType( scanner, L"ID.BOOL",               L"true|false"                           );
    AddMoreType( scanner, L"ID.ELSE",               L"else"                                 );
    AddMoreType( scanner, L"ID.WHILE",              L"while"                                );
    AddMoreType( scanner, L"ID.DO",                 L"do"                                   );
    AddMoreType( scanner, L"ID.BREAK",              L"break"                                );
    AddMoreType( scanner, L"ID.CONTINUE",           L"continue"                             );
    AddMoreType( scanner, L"ID.FOR",                L"for"                                  );
    AddMoreType( scanner, L"OPERATOR",              L"\\+|\\-|\\*|/|%|<|>|=|<=|>=|==|!=|!|&&|\\|\\||\\^|;|\\(|\\)|\\{|\\}|,|\\[|\\]" );
    AddMoreType( scanner, L"OPERATOR.NOT",          L"!"                                    );
    AddMoreType( scanner, L"OPERATOR.ADDMINUS",     L"\\+|\\-"                              );
    AddMoreType( scanner, L"OPERATOR.MULDIVMOD",    L"\\*|/|%"                              );
    AddMoreType( scanner, L"OPERATOR.COMPARE",      L"<|>|<=|>=|==|!="                      );
    AddMoreType( scanner, L"OPERATOR.ASSIGN",       L"="                                    );
    AddMoreType( scanner, L"OPERATOR.AND",          L"&&"                                   );
    AddMoreType( scanner, L"OPERATOR.OR",           L"\\|\\|"                               );
    AddMoreType( scanner, L"OPERATOR.XOR",          L"\\^"                                  );
    AddMoreType( scanner, L"OPERATOR.LEFT",         L"\\("                                  );
    AddMoreType( scanner, L"OPERATOR.RIGHT",        L"\\)"                                  );
    AddMoreType( scanner, L"OPERATOR.BEGIN",        L"\\{"                                  );
    AddMoreType( scanner, L"OPERATOR.END",          L"\\}"                                  );
    AddMoreType( scanner, L"OPERATOR.SPLITER",      L","                                    );
    AddMoreType( scanner, L"OPERATOR.ARRBEGIN",     L"\\["                                  );
    AddMoreType( scanner, L"OPERATOR.ARREND",       L"\\]"                                  );
    AddMoreType( scanner, L"OPERATOR.FINISH",       L";"                                    );
    AddMoreType( scanner, L"NUM",                   L"[0-9]+"                               );
    AddMoreType( scanner, L"REAL",                  L"([0-9]+\\.[0-9]*)|([0-9]*\\.[0-9]+)"  );
    AddMoreType( scanner, L"STRING",                L"\"([^\\\\\"]|\\\\.)*\""               );
    AddMoreType( scanner, L"COMMENT.discard",       L"#[^\\n]*"                             );  
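
The post does not show how `AddMoreType` resolves overlaps such as `ID` versus `ID.IF`, so here is only a rough sketch of one plausible reading: the longest match wins, a tie goes to the rule registered later (the more specific dotted name), and a name ending in `.discard` means the token is dropped. `Rule`, `Token`, `IsDiscard` and `Tokenize` are made-up names for illustration, not the scanner's real interface, and the sketch uses narrow strings and `std::regex` instead of the engine described above. Since `std::regex` tries alternatives in order, a rule like the comparison operators would need its longer alternatives first (`<=|>=|==|!=|<|>`).

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-ins for the scanner in the post; not the author's API.
struct Rule {
    std::string name;     // e.g. "ID" or "ID.IF"
    std::regex  pattern;  // pattern for this token type
};

struct Token {
    std::string type;
    std::string text;
};

// Assumed convention: a rule whose name ends in ".discard" produces no token.
static bool IsDiscard(const std::string& name) {
    const std::string suffix = ".discard";
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Assumed policy: longest match wins; on a tie the later-registered rule wins,
// so "ID.IF" beats "ID" on the input "if", while "iffy" stays a plain "ID".
std::vector<Token> Tokenize(const std::vector<Rule>& rules, const std::string& input) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        std::size_t bestLen = 0;
        const Rule* best = nullptr;
        for (const Rule& rule : rules) {
            std::smatch m;
            // match_continuous anchors the match at the current position.
            if (std::regex_search(input.begin() + pos, input.end(), m, rule.pattern,
                                  std::regex_constants::match_continuous)) {
                const std::size_t len = static_cast<std::size_t>(m.length(0));
                if (len > 0 && len >= bestLen) {
                    bestLen = len;
                    best = &rule;
                }
            }
        }
        if (best == nullptr) {  // whitespace or an unknown character: skip it
            ++pos;
            continue;
        }
        if (!IsDiscard(best->name)) {
            tokens.push_back({best->name, input.substr(pos, bestLen)});
        }
        pos += bestLen;
    }
    return tokens;
}
```

Registering `ID`, then `ID.IF`, `NUM`, `OPERATOR.ASSIGN`, `OPERATOR.COMPARE` and `COMMENT.discard`, and feeding it `if x1 == 42 # note`, should give four tokens with the comment dropped. Trying every rule at every position is far slower than a single merged DFA, so this only illustrates the matching policy and says nothing about the 440,000 tokens per second figure.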