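I need to use Python to parse text files like the one below, build a hierarchical object structure from the data, and then process it. This is very similar to what xml.etree.ElementTree and other XML parsers let us do.
However, the syntax of these files is not XML, and I would like to know the best way to implement such a parser: subclass an XML parser (which one?) and customize its token recognition, write a custom parser, or something else.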
{NETLIST topblock
{VERSION 2 0 0}
{CELL topblock
{PORT gearshift_h vpsf vphreg pwron_h vinp vref_out vcntrl_out gd meas_vref
vb vout meas_vcntrl reset_h vinm }
{INST XI21/Mdummy1=pch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/pch_18_mac" Length=0.152 NFIN=8 }
{PIN vpsf=SRC gs_h=DRN vpsf=GATE vpsf=BULK }}
{INST XI21/Mdummy2=nch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/nch_18_mac" Length=0.152 NFIN=5 }
{PIN gs_h=SRC gd=DRN gd=GATE gd=BULK }}
{INST XI20/Mdummy1=pch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/pch_18_mac" Length=0.152 NFIN=8 }
{PIN vpsf=SRC gs_hn=DRN vpsf=GATE vpsf=BULK }}
{INST XI20/Mdummy2=nch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/nch_18_mac" Length=0.152 NFIN=5 }
{PIN gs_hn=SRC gd=DRN gd=GATE gd=BULK }}
{INST XI19/Mdummy1=pch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/pch_18_mac" Length=0.152 NFIN=8 }
{PIN vpsf=SRC net514=DRN vpsf=GATE vpsf=BULK }}
{INST XI19/Mdummy2=nch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/nch_18_mac" Length=0.152 NFIN=5 }
{PIN net514=SRC gd=DRN gd=GATE gd=BULK }}
{INST XI21/MN0=nch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/nch_18_mac" Length=0.152 NFIN=5 }
{PIN gd=SRC gs_h=DRN gs_hn=GATE gd=BULK }}
{INST XI21/MP0=pch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/pch_18_mac" Length=0.152 NFIN=8 }
{PIN vpsf=SRC gs_h=DRN gs_hn=GATE vpsf=BULK }}
{INST XI20/MN0=nch_18_mac {TYPE MOS} {PROP n="sctg_inv1x/nch_18_mac" Length=0.152 NFIN=5 }
...
}
}
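Answer 0 (score: 4)
First, you should check whether a parser for your file format already exists. Apparently one does: Python-based Verilog Parser (currently Netlist only)
If you cannot find anything suitable, you can build a parser with one of the many available parsing libraries, such as pyparsing. Subclassing an XML parser does not seem like a good idea.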
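For comparison, here is a minimal hand-written sketch that uses only the standard library instead of a parsing library. It assumes the grammar is just what the sample shows (brace-delimited blocks, bare words, `name=value` pairs, double-quoted strings); the names `tokenize`, `parse_block`, and `parse` are hypothetical helpers for illustration, not part of any existing package.

```python
import re

# One token is a brace, an '=', a double-quoted string, or a bare word.
TOKEN_RE = re.compile(r'\s*(?:([{}=])|("[^"]*")|([^\s{}="]+))')


def tokenize(text):
    """Yield tokens from text; quoted strings keep their quotes."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = match.end()
        yield match.group(match.lastindex)


def parse_block(tokens, i):
    """Parse one '{HEAD item ...}' block at tokens[i].

    Returns ((head, body), next_index). Each body item is a bare word,
    a (name, value) pair, or a nested (head, body) block.
    """
    if i + 1 >= len(tokens) or tokens[i] != '{':
        raise ValueError(f"expected '{{' followed by a keyword at token {i}")
    head = tokens[i + 1]
    i += 2
    body = []
    while i < len(tokens):
        tok = tokens[i]
        if tok == '}':
            return (head, body), i + 1
        if tok == '{':
            child, i = parse_block(tokens, i)
            body.append(child)
        elif i + 2 < len(tokens) and tokens[i + 1] == '=':
            body.append((tok, tokens[i + 2]))  # name=value pair
            i += 3
        else:
            body.append(tok)  # bare word, e.g. a port name
            i += 1
    raise ValueError(f"block '{head}' is missing its closing brace")


def parse(text):
    """Parse all top-level blocks in text into a list of (head, body) tuples."""
    tokens = list(tokenize(text))
    blocks = []
    i = 0
    while i < len(tokens):
        block, i = parse_block(tokens, i)
        blocks.append(block)
    return blocks
```

Calling `parse(open(path).read())` on a complete file would return nested `(head, body)` tuples. This sketch does no validation beyond brace balance, so a library-based grammar is still the sturdier choice if the real format has rules the sample does not show.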
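Answer 1 (score: 3)
As others said in the comments: use an existing parser. If none exists, roll your own, but use a parser library. Here is an example with Parcon: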
from pprint import pprint
from parcon import (Forward, SignificantLiteral, Word, alphanum_chars, Exact,
                    ZeroOrMore, CharNotIn, concat, OneOrMore)

block = Forward()  # forward declaration, because blocks nest recursively
quote = SignificantLiteral('"')
word = Word(alphanum_chars + '/_.)')
# A value is a bare word or a double-quoted string (the quotes are kept)
value = word | Exact(quote + ZeroOrMore(CharNotIn('"')) + quote)[concat]
pair = word + '=' + value  # e.g. NFIN=8
flag = word                # e.g. a port name
attribute = pair | flag | block
head = word                # block keyword, e.g. NETLIST, CELL, INST
body = ZeroOrMore(attribute)
block << '{' + head + body + '}'
blocks = OneOrMore(block)

with open('<your file name>.txt') as infile:
    pprint(blocks.parse_string(infile.read()))
Result:
[('NETLIST',
  ['topblock',
   ('VERSION', ['2', '0', '0']),
   ('CELL',
    ['topblock',
     ('PORT',
      ['gearshift_h',
       'vpsf',
       'vphreg',
       'pwron_h',
       'vinp',
       'vref_out',
       'vcntrl_out',
       'gd',
       'meas_vref',
       'vb',
       'vout',
       'meas_vcntrl',
       'reset_h',
       'vinm']),
     ('INST',
      [('XI21/Mdummy1', 'pch_18_mac'),
       ('TYPE', ['MOS']),
       ('PROP',
        [('n', '"sctg_inv1x/pch_18_mac"'),
         ('Length', '0.152'),
         ('NFIN', '8')]),
       ('PIN',
        [('vpsf', 'SRC'),
         ('gs_h', 'DRN'),
         ('vpsf', 'GATE'),
         ('vpsf', 'BULK')])]),
     ('INST',
...
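Since the question asks for something that behaves like xml.etree.ElementTree, the nested `(head, body)` tuples shown in the result can be wrapped in real `Element` objects. The following is only a sketch under assumptions: `to_element` is a hypothetical helper, and the `pair` tag with `name`/`value` attributes is an invented convention, chosen because keys such as `vpsf` repeat inside a single PIN block and would collide as XML attributes.

```python
import xml.etree.ElementTree as ET


def to_element(node):
    """Convert one (head, body) tuple into an ElementTree Element.

    Nested blocks become child elements, name=value pairs become
    <pair name=... value=...> children, and bare words are joined
    into the element's text.
    """
    head, body = node
    elem = ET.Element(head)
    words = []
    for item in body:
        if isinstance(item, tuple) and isinstance(item[1], list):
            elem.append(to_element(item))  # nested block
        elif isinstance(item, tuple):
            name, value = item
            # Strip the surrounding quotes the parser keeps on strings.
            ET.SubElement(elem, 'pair', name=name, value=value.strip('"'))
        else:
            words.append(item)  # bare word
    elem.text = ' '.join(words)
    return elem
```

With this in place the usual ElementTree queries apply, for example `root.iter('INST')` to walk every instance or `inst.find('PIN').findall('pair')` to list the pin connections in order.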