我正在使用Ply lex / yaac生成器为Python编写一个小编译器,用于编译器类。现在我们正在进入语义分析阶段,很快就会进行代码生成。但是我的语法还没有完成,因为我无法为函数调用编写规则,因为我到目前为止尝试的所有内容都以转换/减少冲突结束,这使规则被忽略。
有以下规则:function_call : ID function_call_args
和
function_call_args : expression
| function_call
:
program : function_definition program
| class_definition program
| function_call
| empty
当我添加规则expression : function_call
时,结果是模棱两可的并且给出了转移/减少冲突。我希望我的语法中的function_call可以是一个表达式,然后一个人就可以编写一些代码,比如a + b + f(2)* g(a / x) - h()。
我是否添加空作为function_call_args或logical_expression的替代,它也变得模棱两可。或者我是否只是告诉我返回一个function_call作为替代方法的return语句也可以减少冲突。后者并不重要,如果function_call可以成为表达式的替代,那么能够在函数的任何位置调用函数并且即使返回函数调用也足够了。
Bellow遵循词法分析器,解析器和完整语法的完整代码:
全文法
Rule 0 S' -> program
Rule 1 program -> function_definition program
Rule 2 program -> class_definition program
Rule 3 program -> function_call
Rule 4 program -> empty
Rule 5 function_call -> ID function_call_args
Rule 6 function_call_args -> expression
Rule 7 function_call_args -> function_call
Rule 8 class_definition -> DEFINE CLASS ID LPAREN RPAREN LBRACE program RBRACE
Rule 9 function_definition -> DEFINE data_type ID LPAREN function_definition_args RPAREN block
Rule 10 function_definition_args -> var_list
Rule 11 function_definition_args -> empty
Rule 12 var_list -> var_declaration COMMA var_list
Rule 13 var_list -> var_declaration
Rule 14 var_declaration -> data_type ID
Rule 15 data_type -> VOID
Rule 16 data_type -> INT
Rule 17 data_type -> FLOAT
Rule 18 data_type -> STRING
Rule 19 data_type -> CHAR
Rule 20 block -> LBRACE statements RBRACE
Rule 21 statements -> statement statements
Rule 22 statements -> empty
Rule 23 statement -> matched_statement
Rule 24 statement -> unmatched_statement
Rule 25 matched_statement -> IF LPAREN logical_expression RPAREN matched_statement ELSE matched_statement
Rule 26 matched_statement -> WHILE LPAREN logical_expression RPAREN matched_statement
Rule 27 matched_statement -> FOR LPAREN var_declaration SEMI_COLON logical_expression SEMI_COLON attribution RPAREN matched_statement
Rule 28 matched_statement -> return_statement
Rule 29 matched_statement -> compound_statement
Rule 30 matched_statement -> simple_statement
Rule 31 matched_statement -> block
Rule 32 unmatched_statement -> IF LPAREN logical_expression RPAREN matched_statement
Rule 33 unmatched_statement -> IF LPAREN logical_expression RPAREN matched_statement ELSE unmatched_statement
Rule 34 logical_expression -> logical_term OR logical_term
Rule 35 logical_expression -> logical_term
Rule 36 logical_term -> logical_factor AND logical_factor
Rule 37 logical_term -> logical_factor
Rule 38 logical_factor -> boolean_statement
Rule 39 logical_factor -> NOT logical_factor
Rule 40 logical_factor -> LPAREN logical_expression RPAREN
Rule 41 boolean_statement -> value comparison_op value
Rule 42 boolean_statement -> value
Rule 43 value -> ID
Rule 44 value -> number
Rule 45 number -> int
Rule 46 number -> float
Rule 47 int -> INTCONST
Rule 48 float -> FLOATCONST
Rule 49 comparison_op -> LESS_THAN
Rule 50 comparison_op -> LESS_EQUAL
Rule 51 comparison_op -> GREATER_THAN
Rule 52 comparison_op -> GREATER_EQUAL
Rule 53 comparison_op -> EQUAL
Rule 54 comparison_op -> NOT_EQUAL
Rule 55 compound_statement -> var_list
Rule 56 compound_statement -> attribution
Rule 57 simple_statement -> expression
Rule 58 attribution -> var_declaration EQUALS expression
Rule 59 attribution -> var_declaration EQUALS string
Rule 60 attribution -> var_declaration EQUALS char
Rule 61 attribution -> ID EQUALS expression
Rule 62 attribution -> ID EQUALS STRINGCONST
Rule 63 attribution -> ID EQUALS CHARCONST
Rule 64 string -> STRINGCONST
Rule 65 char -> CHARCONST
Rule 66 expression -> expression PLUS term
Rule 67 expression -> expression MINUS term
Rule 68 expression -> term
Rule 69 term -> term TIMES factor
Rule 70 term -> term DIVIDE factor
Rule 71 term -> term MOD factor
Rule 72 term -> factor
Rule 73 factor -> number
Rule 74 factor -> ID
Rule 75 factor -> factor_expr
Rule 76 factor_expr -> LPAREN expression RPAREN
Rule 77 uminus_expression -> MINUS LPAREN expression RPAREN
Rule 78 empty -> <empty>
Rule 79 return_statement -> RETURN expression
Rule 80 function_definition -> DEFINE data_type ID LPAREN function_definition_args RPAREN error
词法:
import ply.lex as lex
import ply.yacc as yacc
# Reserved words
reserved = {
'if' : 'IF',
'else' : 'ELSE',
'while' : 'WHILE',
'for' : 'FOR',
'switch' : 'SWITCH',
'case' : 'CASE',
'class' : 'CLASS',
'define' : 'DEFINE',
'int' : 'INT',
'float' : 'FLOAT',
'char' : 'CHAR',
'string' : 'STRING',
'void' : 'VOID',
'equal' : 'EQUAL',
'and' : 'AND',
'or' : 'OR',
'not' : 'NOT',
'do' : 'DO',
'return' : 'RETURN',
}
# The tokens declaration is made here.
tokens = [
# Literals (identifier, integer constant, float constant, string constant,
# char const)
# TODO add constants' Types
'ID',
#'NUMBER',
# Operators +,-,*,/,%
'PLUS',
'MINUS',
'TIMES',
'DIVIDE',
'MOD',
# Logical Operators
'LESS_THAN',
'LESS_EQUAL',
'GREATER_THAN',
'GREATER_EQUAL',
'NOT_EQUAL',
# Delimeters such as (),{},[],:
'LPAREN',
'RPAREN',
'LBRACE',
'RBRACE',
'COLON',
'COMMA',
'SEMI_COLON',
#Assignment Operators
'EQUALS',
# Integer literal
'INTCONST',
# Floating literal
'FLOATCONST',
# String literal
'STRINGCONST',
# Character constant 'c' or L'c'
'CHARCONST'
]
tokens += list(reserved.values())
# Regular expression rules for simple tokens
#t_NUMBER = r'\d+'
# Operators
t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_DIVIDE = r'/'
t_MOD = r'\%'
# Logical Operators
t_LESS_THAN = r'<'
t_LESS_EQUAL = r'<='
t_GREATER_THAN = r'>'
t_GREATER_EQUAL = r'>='
t_NOT_EQUAL = r'!='
# Delimiters
t_LPAREN = r'\('
t_RPAREN = r'\)'
t_COLON = r'\:'
t_LBRACE = r'\{'
t_RBRACE = r'\}'
t_COMMA = r','
t_SEMI_COLON = R';'
# Assignment
t_EQUALS = r'='
# Integer literal
t_INTCONST = r'\d+'
# Floating literal
t_FLOATCONST = r'\d+[.]\d+'
# String literal
t_STRINGCONST = r'\"([^\\\n]|(\\.))*?\"'
# Character constant 'c' or L'c'
t_CHARCONST = r'\'([^\\\n]|(\\.))*?\''
# Rule for identificator
def t_ID(t):
r'[a-zA-Z_][a-zA-Z0-9_]*'
t.type = reserved.get(t.value,'ID') # Check for reserved words
return t
# Define a rule so we can track line numbers
def t_newline(t):
r'\n+'
t.lexer.lineno += len(t.value)
# Compute column.
# input is the input text string
# token is a token instance
def find_column(input, token):
line_start = input.rfind('\n', 0, token.lexpos) + 1
return (token.lexpos - line_start) + 1
# A string containing ignored characters (spaces and tabs)
t_ignore = ' \t'
# Error handling rule
def t_error(t):
print("Illegal character '%s'" % t.value[0])
t.lexer.skip(1)
# One line comments Python alike
def t_comment(t):
r"[ ]*\043[^\n]*" # \043 is '#'
pass
# Build the lexer
#lexer = lex.lex()
lex.lex()
if __name__ == '__main__':
lex.runmain()
分析器:
import sys
sys.path.append("..")
import ply.yacc as yacc
from lexical_analyzer import tokens
from ast import *
from semantical_analyser.semantical_analyzer import SymbolTable, Var, Scope
from semantical_analyser.stack import Stack
symbolTable = SymbolTable()
stack = Stack()
stack.push(Scope())
precedence = (
('nonassoc', 'LESS_THAN', 'GREATER_THAN'), # Nonassociative operators
('nonassoc', 'LESS_EQUAL', 'GREATER_EQUAL'), # Nonassociative operators
('left', 'PLUS', 'MINUS'),
('left', 'TIMES', 'DIVIDE', 'MOD'),
('right', 'UMINUS'), # Unary minus operator
)
# Initial Rule
def p_program(p):
'''program : function_definition program
| class_definition program
| function_call
| empty'''
if len(p) == 3:
p[0] = Program(p[1], p[2])
else:
p[0] = p[1]
stack.push(Scope())
pass
#def p_function_call_expr(p):
# '''expression : function_call'''
def p_function_call(p):
'''function_call : ID LPAREN function_call_args RPAREN'''
#p[0] = Node("FUNCTION_CALL", [p[1], p[2], p[3]])
p[0] = FunctionCall(p[1], p[2])
pass
def p_function_call_args(p):
'''function_call_args : expression
| function_call'''
#p[0] = Node("FUNCTION_CALL_ARGS", [p[1]])
p[0] = p[1]
pass
def p_class_definition(p):
''' class_definition : DEFINE CLASS ID LPAREN RPAREN LBRACE program RBRACE'''
#p[0] = ClassDefinition("CLASS_DEFINITION", [p[1], p[2], p[3], p[4], p[5], p[6]], p[7], p[8]])
p[0] = ClassDefinition(p[3], p[7])
pass
# Function definition
def p_function_definition(p):
'''function_definition : DEFINE data_type ID LPAREN function_definition_args RPAREN block'''
#p[0] = Node("FUNCTION_DEFINITION", [p[1], p[2], p[3], p[4], p[5], p[6], p[7]])
p[0] = FunctionDefinition(p[2], p[3], p[5], p[7])
pass
# Arguments that goes inside a function definition
def p_function_definition_args(p):
'''function_definition_args : var_list
| empty'''
#p[0] = Node("FUNCTION_DEFINITION_ARGS", [p[1]])
p[0] = p[1]
pass
# list of declaration of variables or a single variable
def p_var_list(p):
'''var_list : var_declaration COMMA var_list
| var_declaration'''
if len(p) >= 3:
p[0] = VarList(p[1], p[3])
else:
p[0] = p[1]
pass
# declaration of var
def p_var_declaration(p):
'var_declaration : data_type ID'
p[0] = VarDeclaration(p[1], p[2])
var = Var("ID", p[2], p[1])
symbolTable.addVar(var)
topOfStack = stack.peek()
topOfStack.addObj(var)
pass
# data types for variables
def p_data_type(p):
'''data_type : VOID
| INT
| FLOAT
| STRING
| CHAR'''
p[0] = DataType(p[1])
pass
# block - a block is enclosed by braces {stmt}
def p_block(p):
'block : LBRACE statements RBRACE'
#p[0] = Node("BLOCK", [p[1], p[2], p[3]])
p[0] = Block(p[2])
#stack.push(Scope())
pass
def p_statements(p):
'''statements : statement statements
| empty'''
if len(p) >= 3:
p[0] = Statements(p[1], p[2])
else:
p[0] = Node("EMPTY", [])
pass
# statement is either a compound statement or a simple statement
def p_statement(p):
'''statement : matched_statement
| unmatched_statement'''
#p[0] = Node("STATEMENT", [p[1]])
p[0] = p[1]
pass
# Matched Statement - This is used to deal with If ambiguity
def p_matched_statement(p):
''' matched_statement : IF LPAREN logical_expression RPAREN matched_statement ELSE matched_statement
| WHILE LPAREN logical_expression RPAREN matched_statement
| FOR LPAREN var_declaration SEMI_COLON logical_expression SEMI_COLON attribution RPAREN matched_statement
| return_statement
| compound_statement
| simple_statement
| block'''
if len(p) == 9:
#p[0] = Node("FOR-CLAUSE", [p[1], p[2], p[3], p[4], p[5], p[6], p[7], p[8], p[9]])
p[0] = ForStatement(p[3], p[5], p[7], p[9])
elif len(p) == 8:
#p[0] = Node("IF-ELSE-CLAUSE", [p[1], p[2], p[3], p[4], p[5], p[6], p[7]])
p[0] = IfElseStatement(p[3], p[5], p[7])
elif len(p) == 6:
#p[0] = Node("WHILE-CLAUSE", [p[1], p[2], p[3], p[4], p[5]])
p[0] = WhileStatement(p[3], p[5])
else:
#p[0] = Node("MATCHED_STATEMENT", [p[1]])
p[0] = p[1]
pass
# Unmateched statement - This is used to deal with If ambiguity
def p_unmatched_statement(p):
''' unmatched_statement : IF LPAREN logical_expression RPAREN matched_statement
| IF LPAREN logical_expression RPAREN matched_statement ELSE unmatched_statement '''
if len(p) == 7:
#p[0] = Node("IF-ELSE-CLAUSE", [p[1], p[2], p[3], p[4], p[5], p[6]])
p[0] = IfElseStatement(p[3], p[5], p[6])
else:
# p[0] = Node("IF-CLAUSE", [p[1], p[2], p[3], p[4], p[5]])
p[0] = IfStatement(p[3], p[5])
#stack.push(Scope())
#currentScope = stack.peek()
#newScope = Scope()
#currentScope.addObj(newScope)
pass
# Logical expression OR
def p_logical_expression_or(p):
'logical_expression : logical_term OR logical_term'
p[0] = Node("LOGICAL_EXPRESSION", [p[1], p[2]], p[3])
pass
# Logical expression term
def p_logical_expression_term(p):
'logical_expression : logical_term'
p[0] = Node("LOGICAL_EXPRESSION_TERM", [p[1]])
pass
# Logical expression AND
def p_logical_term_and(p):
'logical_term : logical_factor AND logical_factor'
#p[0] = Node("LOGICAL_EXPRESSION_AND", [p[1]])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
# Logical expression factor
def p_logical_term_factor(p):
'logical_term : logical_factor'
#p[0] = Node("LOGICAL_EXPRESSION_FACTOR", [p[1]])
p[0] = p[1]
pass
# Logical expression NOT, boolean or another logical expression enclosed by parens
def p_logical_factor(p):
'''logical_factor : boolean_statement
| NOT logical_factor
| LPAREN logical_expression RPAREN'''
if len(p) == 4:
#p[0] = Node("LOGICAL_EXPRESSION_FACTOR", [p[1], p[2], p[3]])
p[0] = BinaryOp(p[1], p[2], p[3])
elif len(p) == 3:
#p[0] = Node("LOGICAL_EXPRESSION_FACTOR", [p[1], p[2]])
p[0] = NotExpression(p[2])
else:
#p[0] = Node("LOGICAL_EXPRESSION_FACTOR", [p[1]])
p[0] = p[1]
pass
# boolean_statement
def p_boolean_statement(p):
'''boolean_statement : value comparison_op value
| value'''
if len(p) == 4:
#p[0] = Node("BOOLEAN_STATEMENT", [p[1], p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
else:
#p[0] = Node("BOOLEAN_STATEMENT", [p[1]])
p[0] = p[1]
pass
def p_value(p):
''' value : ID
| number'''
#p[0] = Node("ID_OR_NUMBER", [p[1]])
p[0] = p[1]
pass
def p_number(p):
'''number : int
| float'''
p[0] = Number(p[1])
pass
def p_int(p):
'int : INTCONST'
p[0] = Const('int', p[1])
pass
def p_float(p):
'float : FLOATCONST'
p[0] = Const('float', p[1])
pass
def p_comparison_op(p):
'''comparison_op : LESS_THAN
| LESS_EQUAL
| GREATER_THAN
| GREATER_EQUAL
| EQUAL
| NOT_EQUAL'''
#p[0] = Node("COMPARISON_OP", [p[1]])
p[0] = p[1]
pass
# Compound statement is a while or for or do while or switch case
def p_compound_statment(p):
'''compound_statement : var_list
| attribution'''
#p[0] = Node("COMPOUND_STATEMENT", [p[1]])
p[0] = p[1]
pass
# A simple statement is a declaration, or
def p_simple_statement(p):
'''simple_statement : expression'''
#p[0] = Node("SIMPLE_STATEMENT", [p[1]])
p[0] = p[1]
pass
# Production for atrribution
def p_atrribution(p):
'''attribution : var_declaration EQUALS expression
| var_declaration EQUALS string
| var_declaration EQUALS char
| ID EQUALS expression
| ID EQUALS STRINGCONST
| ID EQUALS CHARCONST'''
#p[0] = Node("ATTRIBUTION", [p[1], p[3]], p[2])
#p[0] = BinaryOp(p[1], p[2], p[3])
p[0] = Attribution(p[1], p[3])
pass
def p_string(p):
'string : STRINGCONST'
p[0] = Const('string', p[1])
pass
def p_char(p):
'char : CHARCONST'
p[0] = Const('char', p[1])
pass
# Gramatical rules for expressions
def p_expression_plus(p):
'expression : expression PLUS term'
#p[0] = Node("binop_plus", [p[1],p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
def p_expression_minus(p):
'expression : expression MINUS term'
#p[0] = Node("BINOP_MINUS", [p[1], p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
def p_expression_term(p):
'expression : term'
#p[0] = Node("BINOP_TERM", [p[1]])
p[0] = p[1]
pass
def p_term_times(p):
'term : term TIMES factor'
#p[0] = Node("BINOP_TERM_TIMES", [p[1], p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
def p_term_div(p):
'term : term DIVIDE factor'
#p[0] = Node("BINOP_DIVIDE", [p[1], p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
def p_term_mod(p):
'term : term MOD factor'
#p[0] = Node("BINOP_MOD", [p[1], p[3]], p[2])
p[0] = BinaryOp(p[1], p[2], p[3])
pass
def p_term_factor(p):
'term : factor'
#p[0] = Node("BINOP_FACTOR", [p[1]])
p[0] = p[1]
pass
def p_factor_num(p):
'''factor : number
| ID
| factor_expr'''
#p[0] = Node("BINOP_FACTOR", [p[1]])
p[0] = p[1]
pass
def p_factor_expr(p):
'factor_expr : LPAREN expression RPAREN'
#p[0] = Node("BINOP_EXPR", [p[1], p[3]], p[2])
p[0] = p[1]
pass
def p_uminus_expr(p):
'uminus_expression : MINUS LPAREN expression RPAREN %prec UMINUS'
pass
# The empty production
def p_empty(p):
'empty :'
p[0] = None #Node("EMPTY", [])
pass
# Return statement1
def p_return_statement(p):
'return_statement : RETURN expression'
#p[0] = Node("RETURN_STATEMENT", [p[1], p[2]])
p[0] = ReturnStatement(p[2])
pass
# Error rule for syntax errors
def p_error(p):
if p:
print("Syntax error at token", p.type)
print("Syntax error at '%s'" % p.value)
print("line : '%s'" % p.lineno)
print("column: '%s'" % p.lexpos)
#print("Syntax error in input!")
#parser.errok()
else:
print("Syntax error at EOF")
def p_function_definition_error(p):
'''function_definition : DEFINE data_type ID LPAREN function_definition_args RPAREN error'''
print("Syntax error in function statement at '%s'" % p)
#print('Line' + str(p.linespan(1)))
parser = yacc.yacc()
from read_file_into_buffer import readFileIntoBuffer
data = readFileIntoBuffer('test.fpl')
ast = parser.parse(data, tracking=True)
print(ast)
答案 0 :(得分:0)
假设您的语言是C语言,f 2
不是有效的函数调用;必须写成f(2)
。但是yiur语法说:
Rule 5 function_call -> ID function_call_args
Rule 6 function_call_args -> expression
当然允许f 2
(ID
后跟expression
)。另一方面,该语法不允许f()
也不允许f(1,2)
。
(第三个作品function_call_args -> function_call
是多余的,因为你有function_call_args -> expression
和expression -> function_call
。冗余制作会产生解析冲突,所以我把它遗漏了。)
您需要的function_call
与function_definition
非常相似,具有明确的括号和表达式的可选列表。