How can I parse the two kinds of strings below with two separate parsers, one for each pattern?
from pyparsing import *
dd = """
wire c_f_g;
wire cl_3_f_g4;
x_y abc_d
(.c_l (cl_dclk_001l),
.c_h (cl_m1dh_ff),
.ck (b_f_1g));
"""
I am able to parse each of them independently with the parsers below:
# For the lines containing wire
printables_less_semicolon = printables.replace(';','')
wireDef = Literal("wire") + Word( printables)
# For the nested pattern
instanceStart = Word( printables ) + Word( printables_less_semicolon )
u = nestedExpr(opener="(", closer=")", ignoreExpr=dblSlashComment)
t = OneOrMore(instanceStart + u + Word( ";" ) + LineEnd())
print(instanceStart.parseString(dd))
If I run the code above, the instanceStart parser also matches the wire lines. How can I reliably tell the two apart?
Answer 0 (score: 0)
I have a working solution (definitely not the best one).
printables_less_semicolon = printables.replace(';','')
bracketStuff = Group(QuotedString("(", escChar=None, multiline=True, endQuoteChar=");"))
ifDef = Group(QuotedString("`ifdef", endQuoteChar="`endif", multiline=True))
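theEnd = Keyword("endmodule")  # Word("endmodule") would match any run of the letters d, e, l, m, n, o, u
nestedConns = Group(nestedExpr(opener="(", closer=")", ignoreExpr=dblSlashComment))
instance = Regex(r'[\s?|\r\n?].*\(')  # raw string for the regex; not used in the final grammar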
othersWithSc = Group(Word (printables) + Word (printables_less_semicolon) + Literal(";"))
othersWithoutSc = Word (printables) + Word (printables_less_semicolon) + NotAny(Literal(";"))
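The last two expressions are what separates declarations from instances: both start with two words, but only a declaration has a semicolon right after the second word. A minimal sketch of that lookahead on single lines follows. The helper `matches` and the short test strings are made up for illustration, and it uses the same camelCase pyparsing names as the rest of this post.

```python
from pyparsing import Group, Literal, NotAny, ParseException, Word, printables

printables_less_semicolon = printables.replace(';', '')

# The same two expressions as above.
othersWithSc = Group(Word(printables) + Word(printables_less_semicolon) + Literal(";"))
othersWithoutSc = Word(printables) + Word(printables_less_semicolon) + NotAny(Literal(";"))


def matches(expr, text):
    """Hypothetical helper: True if expr matches at the start of text."""
    try:
        expr.parseString(text)
        return True
    except ParseException:
        return False


# A declaration: two words followed directly by ';'.
decl_ok = matches(othersWithSc, "wire hello;")
# The negative lookahead rejects the same line for the instance header.
decl_rejected = not matches(othersWithoutSc, "wire hello;")
# An instance header: two words followed by something other than ';'.
inst_ok = matches(othersWithoutSc, "the quick (.brown (fox));")
inst_not_decl = not matches(othersWithSc, "the quick (.brown (fox));")

print(decl_ok, decl_rejected, inst_ok, inst_not_decl)
```

NotAny consumes no input, so whatever follows the two words (here the opening parenthesis) is still available to the next expression in the grammar.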
Combining the parsers above lets me parse files in the format I am working with. Sample input:
ts2 = """
module storyOfFox ( andDog,
JLT);
input andDog;
output JLT;
`ifdef quickFox
`include "gatorade"
`include "chicken"
`endif
wire hello;
wire and;
wire welcome;
the quick
(.brown (fox),
.jumps (over),
.the (lazy),
.dog (and),
.the (dog),
.didNot (likeIt));
theDog thenWent
(// Waiver unused
.on (),
// Waiver unused
.to (),
.sueThe (foxFor),
.jumping (andBeingTooQuick),
.TheDog (wasHailedAsAHero),
.endOf (Story));
endmodule
"""
The parser used to parse the input above:
try:
    # Earlier attempt, superseded by the definition of tp further down:
    # tp = othersWithoutSc + Optional(bracketStuff) + Optional(ZeroOrMore(othersWithSc)) + Optional( Group( ZeroOrMore( othersWithoutSc + nestedConns ) ) ) + theEnd
    # Instances: two words, a nested connection list, then ';'
    tpI = Group( ZeroOrMore( othersWithoutSc + nestedConns + Word( ";" ) ) )
    # Declarations and the `ifdef block, in either order
    tpO = Each( [Optional(ZeroOrMore(othersWithSc)), Optional(ifDef)] )
    tp = othersWithoutSc + Optional(bracketStuff) + tpO + Group(tpI) + theEnd
    #print(othersWithoutSc.parseString("input xyz;"))
    print(tp.parseString(ts2))
except ParseException as x:
    print("Line {e.lineno}, column {e.col}:\n'{e.line}'".format(e=x))
The output obtained:
module
storyOfFox
[' andDog, \n JLT']
['input', 'andDog', ';']
['output', 'JLT', ';']
[' quickFox\n `include "gatorade"\n `include "chicken" \n']
['wire', 'hello', ';']
['wire', 'and', ';']
['wire', 'welcome', ';']
[['the', 'quick', [['.brown', ['fox'], ',', '.jumps', ['over'], ',', '.the', ['lazy'], ',', '.dog', ['and'], ',', '.the', ['dog'], ',', '.didNot', ['likeIt']]], ';', 'theDog', 'thenWent', [['// Waiver unused', '.on', [], ',', '// Waiver unused', '.to', [], ',', '.sueThe', ['foxFor'], ',', '.jumping', ['andBeingTooQuick'], ',', '.TheDog', ['wasHailedAsAHero'], ',', '.endOf', ['Story']]], ';']]
endmodule
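I am reluctant to accept this answer, because I may not have solved the real problem I ran into earlier. I just found a way around it that gives me the output I need.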
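For the original question, one possible way to keep the two patterns apart is to make the first word decide: match `wire` as a Keyword, and forbid that keyword at the start of an instance. The following is only a sketch under assumptions not stated in the question: names are plain identifiers (letters, digits, underscores), and the names `identifier`, `SEMI`, `instanceDef`, `connections` and `statement` are invented for this example.

```python
from pyparsing import (Group, Keyword, OneOrMore, Suppress, Word,
                       alphanums, alphas, dblSlashComment, nestedExpr)

dd = """
wire c_f_g;
wire cl_3_f_g4;
x_y abc_d
(.c_l (cl_dclk_001l),
.c_h (cl_m1dh_ff),
.ck (b_f_1g));
"""

# Assumption: every name is a plain identifier.
identifier = Word(alphas + "_", alphanums + "_")
SEMI = Suppress(";")

# Keyword matches "wire" only as a whole word, never as a prefix of "wire_x".
wireDef = Group(Keyword("wire") + identifier + SEMI)

# An instance is two identifiers followed by a parenthesised connection list.
# ~Keyword("wire") stops a wire declaration from being read as an instance.
connections = nestedExpr(opener="(", closer=")", ignoreExpr=dblSlashComment)
instanceDef = Group(~Keyword("wire") + identifier + identifier + connections + SEMI)

statement = wireDef | instanceDef
result = OneOrMore(statement).parseString(dd, parseAll=True).asList()

# Classify each parsed statement by its first token.
kinds = ["wire" if item[0] == "wire" else "instance" for item in result]
print(kinds)
```

Each expression can still be used on its own as a separate parser. The alternation `wireDef | instanceDef` is only there to walk through a mixed input in one pass.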