I am using parsimonious to parse some CSV.
My problem is that the generated output does not come out in the expected order. For example, if the input string is
Load,File,Sample
then I expect to get:
import database from Sample
What I get instead is:
from Sample import database
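This happens consistently for every input I have tried: the first token ends up as the last item in the entry OrderedDict, but I cannot figure out why.
Here is my code: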
from parsimonious.grammar import Grammar
from parsimonious.nodes import NodeVisitor
from collections import OrderedDict


class EntryParser(NodeVisitor):
    def __init__(self, grammar, text):
        # Each visit_* method below stores its output fragment here.
        self.entry = OrderedDict()
        ast = Grammar(grammar).parse(text)
        self.visit(ast)

    # n is the matched node, vc is the list of already-visited children.
    def visit_alt(self, n, vc):
        self.entry['alt'] = "alter "

    def visit_load(self, n, vc):
        self.entry['load'] = "import database "

    def visit_app(self, n, vc):
        self.entry['app'] = "application "

    def visit_db(self, n, vc):
        self.entry['db'] = "database "

    def visit_filter(self, n, vc):
        self.entry['filter'] = "filter "

    def visit_group(self, n, vc):
        self.entry['group'] = "group "

    def visit_obj(self, n, vc):
        self.entry['obj'] = "object "

    def visit_trigger(self, n, vc):
        self.entry['trigger'] = "trigger "

    def visit_user(self, n, vc):
        self.entry['user'] = "user "

    def visit_sql(self, n, vc):
        self.entry['sql'] = "connect as "

    def visit_file(self, n, vc):
        self.entry['file'] = "from "

    def visit_dbname(self, n, vc):
        self.entry['dbname'] = n.text + " "

    def generic_visit(self, n, vc):
        # Rules without a dedicated visit_* method contribute nothing.
        pass


grammar = """\
ts0 = alt / load
sep = ","
alt = "Alt" sep altdomain
altdomain = app / db / filter / group / obj / trigger / user
load = "Load" sep loaddomain
loaddomain = (sql / file) sep dbname
sql = "SQL"
file = "File"
app = "App"
db = "DB"
filter = "Filter"
group = "Group"
obj = "Object"
trigger = "Trigger"
user = "User"
dbname = ~"[A-z]+"
"""

text = """\
Alt,Filter
Alt,App
Alt,DB
Alt,Group
Alt,Object
Alt,Trigger
Alt,User
Load,SQL,Sample
Load,File,Sample
"""

# Parse each line separately and print the fragments in dict order.
for line in text.splitlines():
    for v in EntryParser(grammar, line).entry.values():
        print(v, end="")
    print('\n')
Answer 0 (score: 0)
I think the OrderedDict is causing the problem. Just use a regular dictionary, {}.
Basically, change self.entry = OrderedDict() to self.entry = {}.
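One way to check what this swap does is to fill both container types in the order the question's visitor sets its keys for Load,File,Sample. This is only a sketch, and it assumes Python 3.7 or later, where plain dicts are guaranteed to keep insertion order; the key list is taken from the output shown in the question.

```python
from collections import OrderedDict

# Order in which the question's visitor stores keys for "Load,File,Sample"
# (inferred from the reported output "from Sample import database").
keys = ["file", "dbname", "load"]

ordered = OrderedDict()
plain = {}
for key in keys:
    ordered[key] = key
    plain[key] = key

# Assumption: Python 3.7+, where a plain dict preserves insertion order.
print(list(ordered))
print(list(plain))
```

Under that assumption both containers list the keys in the same order, so the swap by itself would not be expected to change the output; the order comes from when each key is inserted.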
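Also, because of the way grammar-based parsing works, the innermost matched elements are handled first, and processing then moves outward to the enclosing rules (as far as return values are concerned).
So, to get the right order, you have to use a stack (or use a regular list and reverse it).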
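The inner-before-outer order can be illustrated without the library. The sketch below uses a hand-built tree and a small `visit` function as hypothetical stand-ins for the parsimonious parse tree and NodeVisitor (they are not the library's API). It first records the order in which handlers fire, then shows one possible alternative: have each handler return a string and let the parent combine its children's results, which arrive in source order.

```python
from collections import namedtuple

# Hypothetical stand-in for a parse-tree node: rule name, matched text, children.
Node = namedtuple("Node", "name text children")


def leaf(name, text):
    return Node(name, text, [])


# Hand-built tree for "Load,File,Sample", shaped like the question's rules:
#   load = "Load" sep loaddomain
#   loaddomain = (sql / file) sep dbname
tree = Node("load", "Load,File,Sample", [
    leaf("", "Load"),
    leaf("sep", ","),
    Node("loaddomain", "File,Sample", [
        leaf("file", "File"),
        leaf("sep", ","),
        leaf("dbname", "Sample"),
    ]),
])


def visit(node, handlers, default):
    """Post-order walk: visit every child first, then the node itself."""
    visited_children = [visit(child, handlers, default) for child in node.children]
    return handlers.get(node.name, default)(node, visited_children)


# 1) Side-effect style, as in the question: note each rule when its handler runs.
seen = []


def record(name):
    def handler(node, visited_children):
        seen.append(name)
    return handler


visit(tree, {name: record(name) for name in ("load", "file", "dbname")},
      lambda node, visited_children: None)
print(seen)  # ['file', 'dbname', 'load']

# 2) Return-value style: the parent assembles its children's strings.
handlers = {
    "load": lambda node, vc: "import database " + vc[2],
    "loaddomain": lambda node, vc: vc[0] + vc[2],
    "file": lambda node, vc: "from ",
    "dbname": lambda node, vc: node.text + " ",
}
result = visit(tree, handlers, lambda node, vc: "")
print(result)  # import database from Sample
```

Note that reversing the recorded list gives load, dbname, file, which is still not the source order, so a plain reversal may not be enough for this grammar. In parsimonious itself the same idea would mean returning strings from the visit_* methods and combining the visited children in visit_load and a visit_loaddomain method, since NodeVisitor passes each method the list of its children's return values.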