I am trying to parse large text files with Python.
The files have the following syntax:
<option1> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
<option2> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
...
...
<optionN> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
I want to get the value of <optionK>[<variableT>].
Is there a best way to do this with a file parser?
Answer 0 (score: 1)
Given your example above, you could (as an ugly solution) use HTMLParser, http://docs.python.org/2/library/htmlparser.html, like this:
test = """
<option1> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
<option2> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
...
...
<optionN> {
<variable1>=<value1>; //<comment>
<variable2>=<value2>;
..
<variableN>=<valueN>; //<comment>
}
"""
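# Python 2 code (in Python 3 the module is html.parser and print is a function)
from HTMLParser import HTMLParser

# create a subclass and override the handler methods
class MyHTMLParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.option = ""
        self.key = ""
        self.value = ""
        self.currentData = ""  # text seen since the previous tag
        self.r = {}            # result: {option: {variable: value}}

    def handle_starttag(self, tag, attrs):
        # note: HTMLParser lowercases tag names, so <optionN> is stored as "optionn"
        self.currentTag = tag
        print "Encountered a start tag:", tag
        if "option" in tag:
            # a new <optionK> block starts
            self.option = tag
            self.r[self.option] = {}
        elif "{" in self.currentData or ("=" not in self.currentData and "//" not in self.currentData):
            # first tag after "{" or after a plain line break: a variable name
            self.key = tag
            self.r[self.option][self.key] = ""
        elif "=" in self.currentData:
            # tag right after "=": the value of the current variable
            self.value = tag
            self.r[self.option][self.key] = self.value

    def handle_endtag(self, tag):
        # reset to an empty string (not None) so the "in" tests above cannot fail
        self.currentData = ""
        print "Encountered an end tag :", tag

    def handle_data(self, data):
        self.currentData = data
        print "Encountered some data :", data
        # find a condition to yield result here "}" ?

# instantiate the parser and feed it the text
parser = MyHTMLParser()
parser.feed(test)
print parser.r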
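
If the angle brackets in the question are only placeholders and the real files contain plain names, the HTMLParser trick will not apply. As an alternative (not part of the answer above), here is a minimal Python 3 sketch that reads the file line by line with the standard re module, which also keeps memory use low for large files. The function name parse_options and the two regular expressions are my own assumptions about the format: one block header per line, one assignment per line, and no ";" inside a value.

```python
import re

# Assumed line shapes, based on the sample in the question:
#   name {                 opens a block
#   key=value; //comment   assignment, trailing comment optional
#   }                      closes the block
BLOCK_OPEN = re.compile(r"^\s*(?P<name>[^{}=;]+?)\s*\{\s*$")
ASSIGNMENT = re.compile(r"^\s*(?P<key>[^=;{}]+?)\s*=\s*(?P<value>[^;]*?)\s*;")


def parse_options(lines):
    """Return {option: {variable: value}} from an iterable of lines.

    Hypothetical helper: values are kept as strings, and anything after
    the first ';' on a line (such as a // comment) is ignored.
    """
    result = {}
    current = None  # dict of the block being read, or None outside a block
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue  # skip blank lines and comment-only lines
        if current is None:
            match = BLOCK_OPEN.match(stripped)
            if match:
                current = result.setdefault(match.group("name"), {})
        elif stripped.startswith("}"):
            current = None
        else:
            match = ASSIGNMENT.match(stripped)
            if match:
                current[match.group("key")] = match.group("value")
    return result


sample = """\
option1 {
a=1; //first
b=hello world;
..
}
option2 {
a=2;
}
"""
options = parse_options(sample.splitlines())
print(options["option1"]["b"])
```

For a real file you would pass the open file object instead of a list, e.g. `with open(path) as f: options = parse_options(f)` (path being whatever file you have), so the lines are read lazily rather than loaded all at once. The lookup you asked for is then just `options[optionK][variableT]`.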