我想在类似C的源代码(GLSL代码)中使用 pyparsing 解析声明,以便获得(类型,名称,值)的列表。
例如:
int a[3];
int b=1, c=2.0;
float d = f(z[2], 2) + 3*g(4,a), e;
Point f = {1,2};
我想获得类似的东西:
[ ('int', 'a[3]', ''),
('int', 'b', '1'),
('int', 'c', '2.0'),
('float', 'd', 'f(z[2], 2) + 3*g(4,a)'),
('float', 'e', ''),
('Point', 'f', '{1,2}') ]
我使用Forward()
和operatorPrecedence()
来尝试解析rhs
表达式,但我怀疑在我的情况下没有必要。
到目前为止,我有:
IDENTIFIER = Regex('[a-zA-Z_][a-zA-Z_0-9]*')
INTEGER = Regex('([+-]?(([1-9][0-9]*)|0+))')
EQUAL = Literal("=").suppress()
SEMI = Literal(";").suppress()
SIZE = INTEGER | IDENTIFIER
VARNAME = IDENTIFIER
TYPENAME = IDENTIFIER
VARIABLE = Group(VARNAME.setResultsName("name")
+ Optional(EQUAL + Regex("[^,;]*").setResultsName("value")))
VARIABLES = delimitedList(VARIABLE.setResultsName("variable",listAllMatches=True))
DECLARATION = (TYPENAME.setResultsName("type")
+ VARIABLES.setResultsName("variables", listAllMatches=True) + SEMI)
code = """
float a=1, b=3+f(2), c;
float d=1.0, e;
float f = z(3,4);
"""
for (token, start, end) in DECLARATION.scanString(code):
for variable in token.variable:
print token.type, variable.name, variable.value
但由于f=z(3,4)
而未解析最后一个表达式(,
)。
答案 0 :(得分:0)
C struct parser上有一个pyparsing wiki可能会给你一个良好的开端。
答案 1 :(得分:0)
这似乎有效。
IDENTIFIER = Word(alphas+"_", alphas+nums+"_" )
INT_DECIMAL = Regex('([+-]?(([1-9][0-9]*)|0+))')
INT_OCTAL = Regex('(0[0-7]*)')
INT_HEXADECIMAL = Regex('(0[xX][0-9a-fA-F]*)')
INTEGER = INT_HEXADECIMAL | INT_OCTAL | INT_DECIMAL
FLOAT = Regex('[+-]?(((\d+\.\d*)|(\d*\.\d+))([eE][-+]?\d+)?)|(\d*[eE][+-]?\d+)')
LPAREN, RPAREN = Literal("(").suppress(), Literal(")").suppress()
LBRACK, RBRACK = Literal("[").suppress(), Literal("]").suppress()
LBRACE, RBRACE = Literal("{").suppress(), Literal("}").suppress()
SEMICOLON, COMMA = Literal(";").suppress(), Literal(",").suppress()
EQUAL = Literal("=").suppress()
SIZE = INTEGER | IDENTIFIER
VARNAME = IDENTIFIER
TYPENAME = IDENTIFIER
OPERATOR = oneOf("+ - * / [ ] . & ^ ! { }")
PART = nestedExpr() | nestedExpr('{','}') | IDENTIFIER | INTEGER | FLOAT | OPERATOR
EXPR = delimitedList(PART, delim=Empty()).setParseAction(keepOriginalText)
VARIABLE = (VARNAME("name") + Optional(LBRACK + SIZE + RBRACK)("size")
+ Optional(EQUAL + EXPR)("value"))
VARIABLES = delimitedList(VARIABLE.setResultsName("variables",listAllMatches=True))
DECLARATION = (TYPENAME("type") + VARIABLES + SEMICOLON)
code = """
int a[3];
int b=1, c=2.0;
float d = f(z[2], 2) + 3*g(4,a), e;
Point f = {1,2};
"""
for (token, start, end) in DECLARATION.scanString(code):
vtype = token.type
for variable in token.variables:
name = variable.name
size = variable.size
value = variable.value
s = "%s / %s" % (vtype,name)
if size: s += ' [%s]' % size[0]
if value: s += ' / %s' % value[0]
s += ";"
print s