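The following code gives the error 'no such attribute _ParseResults__tokdict' when run on input with more than one line.
With a single-line file there is no error. If I comment out either the second or the third line shown here, I no longer get the error, no matter how long the file is.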
for line in input:
    final = delimitedList(expr).parseString(line)
    notid = delimitedList(notid).parseString(line)
    dash_tags = ', '.join(format_tree(notid))
    print final.lineId + ": " + dash_tags
Does anyone know what is going on here?
Edit: As suggested, I have added the complete code so that others can reproduce the error.
from pyparsing import *
#first are the basic elements of the expression
#number at the beginning of the line, unique for each line
#top-level category for a sentiment
#semicolon should eventually become a line break
lineId = Word(nums)
topicString = Word(alphanums+'-'+' '+"'")
semicolon = Literal(';')
#call variable early to allow for recursion
#recursive function allowing for a line id at first, then the topic,
#then any subtopics, and so on. Finally, optional semicolon and repeat.
#set results name lineId.lineId here
expr = Forward()
expr << Optional(lineId.setResultsName("lineId")) + topicString.setResultsName("topicString") + \
    Optional(nestedExpr(content=delimitedList(expr))).setResultsName("parenthetical") + \
    Optional(Suppress(semicolon).setResultsName("semicolon") + expr.setResultsName("subsequentlines"))
notid = Suppress(lineId) + topicString + \
    Optional(nestedExpr(content=delimitedList(expr))) + \
    Optional(Suppress(semicolon) + expr)
#naming the parenthetical portion for independent reference later
parenthetical = nestedExpr(content=delimitedList(expr))
#open files for read and write
input = open('parserinput.txt')
output = open('parseroutput.txt', 'w')
#defining functions
#takes nested list output of parser grammar and translates it into
#strings suited for the final output
def format_tree(tree):
    prefix = ''
    for node in tree:
        if isinstance(node, basestring):
            prefix = node
            yield node
        else:
            for elt in format_tree(node):
                yield prefix + '_' + elt
#function for passing tokens from setResultsName
def id_number(tokens):
    #print tokens.dump()
    lineId = tokens
    lineId["lineId"] = lineId.lineId
def topic_string(tokens):
    topicString = tokens
    topicString["topicString"] = topicString.topicString
def parenthetical_fun(tokens):
    parenthetical = tokens
    parenthetical["parenthetical"] = parenthetical.parenthetical
#function for splitting line at semicolon and appending numberId
#not currently in use
def split_and_prepend(tokens):
    return '\n' + final.lineId
#setting parse actions
lineId.setParseAction(id_number)
topicString.setParseAction(topic_string)
parenthetical.setParseAction(parenthetical)
#reads each line in the input file
#calls the grammar expressed in 'expr' and uses it to read the line and assign names to the tokens for later use
#calls the 'notid' variant to easily return the other elements in the line aside from the lineId
#applies the format tree function and joins the tokens in a comma-separated string
#prints the lineId + the tokens from that line
for line in input:
    final = delimitedList(expr).parseString(line)
    notid = delimitedList(notid).parseString(line)
    dash_tags = ', '.join(format_tree(notid))
    print final.lineId + ": " + dash_tags
The input file is a txt document containing the following two lines:
1768 dummy; data
1768 dummy data; price
Answer 0 (score: 2)
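Reassigning notid while it is also being passed to delimitedList breaks the second iteration. The third line of your snippet overwrites the notid expression defined earlier in the code with the ParseResults returned by parseString, so the loop only works on the first iteration. Use a different name for the value you assign in the loop.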
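To make the mechanism concrete, here is a minimal sketch that does not use pyparsing at all. Grammar and Result are hypothetical stand-ins (not part of any library) for a pyparsing expression and for the ParseResults it returns, and the exact error message will differ from the one in the question. It only shows why rebinding the grammar's name inside the loop fails on the second line, and how a separate result name avoids that.

```python
class Grammar:
    """Hypothetical stand-in for a pyparsing expression such as notid."""

    def parse(self, text):
        # Split on whitespace only; real parsing is not the point here.
        return Result(text.split())


class Result:
    """Hypothetical stand-in for ParseResults: holds tokens, cannot parse."""

    def __init__(self, tokens):
        self.tokens = tokens


lines = ["1768 dummy; data", "1768 dummy data; price"]


def buggy(lines):
    notid = Grammar()
    parsed = []
    try:
        for line in lines:
            # Rebinding: after this line, notid is a Result, not a Grammar,
            # so the next iteration calls .parse on the wrong kind of object.
            notid = notid.parse(line)
            parsed.append(notid.tokens)
    except AttributeError as exc:
        return parsed, exc
    return parsed, None


def fixed(lines):
    notid = Grammar()
    parsed = []
    for line in lines:
        # A separate name keeps the grammar intact for every iteration.
        notid_result = notid.parse(line)
        parsed.append(notid_result.tokens)
    return parsed


buggy_parsed, buggy_error = buggy(lines)
fixed_parsed = fixed(lines)
```

Applied to the loop in the question, the same idea would mean writing something like notid_result = delimitedList(notid).parseString(line) and then passing notid_result to format_tree; the name notid_result is only a suggestion.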