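Pyparsing: using a nested parser to parse the contents of PHP function comment blocks

Time: 2012-02-22 17:27:22

Tags: php python nested pyparsing post-processing

AKA "Adding child nodes constructed from the result of a Parser.parseAction to the parent parse tree"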

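I am trying to use PyParsing (which rules, IMHO) to parse PHP files in which the function definitions have been annotated with JavaDoc-style comments. The reason is that I want to store type information in a form that can be used to generate client-side stub code.

For example: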

/*
*  @vo{$user=UserAccount}
*/
public function blah($user){ ......

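Writing a parser for this with PyParsing was very easy. However, PyParsing ships with a built-in javaStyleComment token that I would like to reuse. So I parse the code and then try to attach a parseAction that strips out the gunk, runs a sub-parser (sorry, I'm not sure of the terminology), and appends the result to the parent parse tree.

I can't figure out how to do it. The code is attached below. By the way, I could easily write my own javaStyleComment, but I'd like to know whether it is possible in general to chain parse results.

Again, sorry if my question isn't concise; I'm just a newbie.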

#@PydevCodeAnalysisIgnore
from pyparsing import delimitedList,Literal,Keyword,Regex,ZeroOrMore,Suppress,Optional,QuotedString,Word,hexnums,alphas,\
    dblQuotedString,FollowedBy, sglQuotedString,oneOf,Group
import pyparsing

digits = "0123456789"
colon = Literal(':')
semi = Literal(';')
period = Literal('.')
comma = Literal(',')
lparen = Literal('{')
rparen = Literal('}')
lbracket = Literal('(')
rbracket = Literal(')')
number = Word(digits)
hexint = Word(hexnums,exact=2)
text = Word(alphas)

php = Literal("<?php") + Literal("echo") + Literal("?>")
print php.parseString("""<?php echo ?>""")

funcPerm = oneOf("public private protected")

print funcPerm.parseString("""public""")
print funcPerm.parseString("""private""")
print funcPerm.parseString("""protected""")

stdParam = Regex(r"\$[a-z][a-zA-Z0-9]*")
print stdParam.parseString("""$dog""")

dblQuotedString.setParseAction(lambda t:t[0][1:-1])
sglQuotedString.setParseAction(lambda t:t[0][1:-1])
defaultParam = Group(stdParam + Literal("=") + ( dblQuotedString | sglQuotedString | number))  
print defaultParam.parseString(""" $dave = 'dog' """)

param = ( defaultParam | stdParam )
print param.parseString("""$dave""")

#print param.parseString("""dave""")
print param.parseString(""" $dave = 'dog' """)
print param.parseString(""" $dave = "dog" """)

csl = Optional(param  + ZeroOrMore( Suppress( "," ) + param))
print csl.parseString("""$dog,$cat,$moose     """)
print csl.parseString("""$dog,$cat,$moose = "denny"     """)
print csl.parseString("""""")
#
funcIdent = Regex(r"[a-z][_a-zA-Z0-9]*")
funcIdent.parseString("farb_asdfdfsDDDDDDD")
#
funcStart = Group(funcPerm + Literal("function") + funcIdent)
print funcStart.parseString("private function dave")
#
#
litWordlit = Literal("(") +  csl + Literal(")")
print litWordlit.parseString("""( )""")

funcDef = funcStart + Literal("(") + Group(csl)  + Literal(")")
#funcDef.Name = "FUNCTION"
#funcDef.ParseAction = lambda t: (("found %s") % t)
print funcDef.parseString("""private function doggy($bow,$sddfs)""")

funcDefPopulated = funcStart + Literal("(") + Group(csl)  + Literal(")") + Group(Literal("{")  +  ZeroOrMore(pyparsing.CharsNotIn("}"))  +Literal("}")) 
#funcDef.Name = "FUNCTION"
#funcDef.ParseAction = lambda t: (("found %s") % t)
print funcDefPopulated.parseString("""private function doggy($bow,$sddfs){ $dog="dave" }""")

#" @vo{$bow=BowVo}"
docAnnotations = ZeroOrMore( Group( Literal("@") + text + Suppress(lparen) + param + Literal("=") + text  + Suppress(rparen ) ))
print docAnnotations.parseString(""" @vo{$bow=BowVo}""")

def extractDoco(s,l,t):
    """ Helper parse action for parsing the content of a comment block
    """
    ret = t[0]
    ret = ret.replace('/**','')
    ret = ret.replace('*\n','')
    ret = ret.replace('*\n','\n')
    ret = ret.replace('*/','')
    t = docAnnotations.parseString(ret)
    return  t

phpCustomComment = pyparsing.javaStyleComment

#Can't figure out what to do here. Help !!!!!
phpCustomComment.addParseAction(extractDoco)

commentedFuncDef  =  phpCustomComment + funcDefPopulated
print commentedFuncDef.parseString(
                                   """
                                   /**
                                   * @vo{$bow=BowVo}
                                   * @vo{$sddfs=UserAccount}
                                   */
                                   private function doggy($bow,$sddfs){ $dog="dave" }"""
                                   )


#example = open("./example.php","r")
#funcDef.parseFile(example)
#f4.parseString("""private function dave ( $bow )""")
#funcDef = funcPerm + Keyword("function") + funcName + Literal("(")  +  csl  + Literal(")")  
#print funcDef.parseString(""" private function doggy($bow)""")

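=== Update

I found that ParseResults has, for example, an insert method that lets you augment the parse tree, but I still can't figure out how to do this dynamically, while parsing.

For example: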

title = oneOf("Mr Miss Sir Dr Madame")
aname = title + Group(Word(alphas) + Word(alphas))
res=aname.parseString("Mr Dave Young")
res
(['Mr', (['Dave', 'Young'], {})], {})

res.insert(3,3)

res
(['Mr', (['Dave', 'Young'], {}), 3], {})
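
For comparison, here is a minimal sketch (not part of the original post) of making the same kind of change while parsing rather than afterwards. It relies on the fact that a parse action receives the matched ParseResults and may modify it in place; the add_marker name is hypothetical, and the camelCase method names follow the rest of this thread.

```python
from pyparsing import Group, Word, alphas, oneOf

title = oneOf("Mr Miss Sir Dr Madame")
aname = title + Group(Word(alphas) + Word(alphas))

def add_marker(tokens):
    # Hypothetical parse action: runs while parsing and appends to the
    # matched tokens in place. Returning None keeps the modified tokens.
    tokens.append(3)

aname.addParseAction(add_marker)

res = aname.parseString("Mr Dave Young")
print(res.asList())
```

A parse action can also return a new list or ParseResults, in which case the returned value replaces the matched tokens.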

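1 answer:

Answer 0 (score: 2)

First of all, I'm in love. PyParsing has to be one of the best libraries I have ever used. Second, the solution is very, very simple.

Here is how I fixed it: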

docAnnotations = ZeroOrMore( Group( ZeroOrMore(Suppress("*")) +   Suppress(Literal("@")) + Suppress(Literal("vo")) + Suppress(lparen) + param + Literal("=") + text  + Suppress(rparen ) ))
print docAnnotations.parseString(""" @vo{$bow=BowVo}""")

def extractDoco(t):
    """ Helper parse action for parsing the content of a comment block
    """
    ret = t[0]
    ret = ret.replace('/**','')
    ret = ret.replace('*\n','')
    ret = ret.replace('*\n','\n')
    ret = ret.replace('*/','')
    print ret
    return docAnnotations.parseString(ret)  

phpCustomComment = pyparsing.javaStyleComment

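The final section (phpCustomComment.addParseAction(extractDoco) and commentedFuncDef stay as defined in the question):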

print commentedFuncDef.parseString(
                                   """
                                   /**
                                   * @vo{$bow=BowVo}
                                   * @vo{$sddfs=UserAccount}
                                   */
                                   private function doggyWithCustomComment($bow,$sddfs){ $dog="dave" }"""
                                   )

Result:

[['$bow', '=', 'BowVo'], ['$sddfs', '=', 'UserAccount'], ['private', 'function', 'doggyWithCustomComment'], '(', ['$bow', '$sddfs'], ')', ['{', ' $dog="dave" ', '}']]
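
For readers who want to try the idea in isolation, the following is a condensed Python 3 sketch of the same technique, not the poster's exact code. A parse action on the comment token runs a second grammar over the comment text and returns its result, which then takes the comment's place in the outer parse tree. The inner grammar has to tolerate the leading "*" on each comment line, which appears to be the change that made the answer work. The names annotation, annotations, parse_comment_body, doc_comment, func_def and commented_func are illustrative, and the function grammar is simplified (no body, no default parameter values).

```python
from pyparsing import (
    Group, Keyword, Literal, Optional, Regex, Suppress, Word, ZeroOrMore,
    alphas, javaStyleComment,
)

param = Regex(r"\$[a-z][a-zA-Z0-9]*")

# Inner grammar: "@vo{$name=Type}", optionally preceded by the "*" that
# decorates each line of a doc comment.
annotation = Group(
    ZeroOrMore(Suppress("*"))
    + Suppress("@vo") + Suppress("{")
    + param + Literal("=") + Word(alphas)
    + Suppress("}")
)
annotations = ZeroOrMore(annotation)

def parse_comment_body(tokens):
    # tokens[0] is the whole comment as a single string.
    text = tokens[0]
    # Drop the delimiters: "/*" ... "*/" for block comments, "//" otherwise.
    text = text[2:-2] if text.startswith("/*") else text[2:]
    # The returned ParseResults replaces the comment token in the outer tree.
    return annotations.parseString(text)

# copy() so that the module-level javaStyleComment is left untouched.
doc_comment = javaStyleComment.copy().addParseAction(parse_comment_body)

func_perm = Keyword("public") | Keyword("private") | Keyword("protected")
func_name = Regex(r"[a-z][_a-zA-Z0-9]*")
param_list = Optional(param + ZeroOrMore(Suppress(",") + param))
func_def = (
    Group(func_perm + Keyword("function") + func_name)
    + Suppress("(") + Group(param_list) + Suppress(")")
)

commented_func = doc_comment + func_def

source = """
/**
 * @vo{$bow=BowVo}
 * @vo{$sddfs=UserAccount}
 */
private function doggy($bow,$sddfs)
"""
result = commented_func.parseString(source).asList()
print(result)
```

One design note: setParseAction and addParseAction modify the expression they are called on, so attaching an action directly to pyparsing.javaStyleComment (as the code in this thread does) changes that shared object for every other user of it in the same process. Working on a copy() avoids that side effect.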