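Preface: To help explain why I'm doing this, I'll describe the end goal. Basically, I have a list of accounts defined in a very specific syntax. Here are some examples: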
Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved
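As shown above, an account can have any number of parents and children. The end goal is to parse the accounts above into a tree structure in Python, which will be used to provide account auto-completion in the Sublime Text editor (i.e. if I type Assets: and then request auto-completion, I would be presented with a list: Bank, Savings, Reserved).
Result: Using the account list from the preface, the desired result in Python would look like this: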
[
    {
        "Assets": [
            {
                "Bank": [
                    "Car",
                    "House"
                ]
            },
            {
                "Savings": [
                    "Emergency",
                    {
                        "Goals": [
                            "Roof"
                        ]
                    }
                ]
            },
            "Reserved"
        ]
    }
]
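For reference, here is a minimal sketch of how a structure shaped like the one above could be queried for completions. The function name `completions` and its signature are hypothetical, introduced only for illustration. It assumes every container is a single-key dict whose value is a list and every leaf is a plain string.

```python
def completions(tree, prefix):
    """Return the child names available under a prefix such as "Assets:".

    `tree` is assumed to be a list whose items are either leaf strings or
    single-key dicts mapping a name to a nested list of the same shape.
    """
    current = tree
    # Walk down one level for every non-empty part of the prefix.
    for part in (p for p in prefix.split(":") if p):
        for item in current:
            if isinstance(item, dict) and part in item:
                current = item[part]
                break
        else:
            # No container with this name at this level: nothing to offer.
            return []
    # Containers contribute their key, leaves contribute themselves.
    return [next(iter(item)) if isinstance(item, dict) else item
            for item in current]


tree = [{"Assets": [{"Bank": ["Car", "House"]},
                    {"Savings": ["Emergency", {"Goals": ["Roof"]}]},
                    "Reserved"]}]
print(completions(tree, "Assets:"))  # prints ['Bank', 'Savings', 'Reserved']
```

A prefix that names a leaf (for example "Assets:Reserved:") or an unknown account yields an empty list in this sketch.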
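Half-solution: I was able to add two basic accounts together using recursion. This works for adding these two: Assets:Bank:Car and Assets:Bank:House. However, once the accounts start to differ, it begins to break down and the recursion gets messy, so I'm not sure it's the best approach.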
import re


def parse_account(account_str):
    subs = account_str.split(":")

    def separate(subs):
        if len(subs) == 1:
            return subs
        elif len(subs):
            return [{subs[0]: separate(subs[1:])}]

    return separate(subs)


def merge_dicts(a, b):
    # a will be a list with dictionaries and text values and then nested lists/dictionaries/text values
    # b will always be a list with ONE dictionary or text value
    key = next(iter(b[0]))  # this is the dictionary key of the only dictionary in the b list
    for item in a:  # item is a dictionary or a text value
        if isinstance(item, dict):  # if item is a dictionary
            if key in item:
                # Is the value a list with a dict or a list with a text value
                if isinstance(b[0][key][0], str):
                    # Extend the current list with the new value
                    item[key].extend(b[0][key])
                else:
                    # Recurse to the next child
                    merge_dicts(item[key], b[0][key])
    # The loop never breaks, so return the merged list once it finishes
    return a


# Accounts have an "open [name]" syntax for defining them
text = "open Assets:Bank:Car\nopen Assets:Bank:House\nopen Assets:Savings:Emergency\nopen Assets:Savings:Goals:Roof\nopen Assets:Reserved"
EXP = re.compile("open (.*)")
accounts = EXP.findall(text)  # This grabs all accounts

# Create a list of all the parsed accounts
dicts = []
for account in accounts:
    dicts.append(parse_account(account))

# Attempt to merge two accounts together
final = merge_dicts(dicts[0], dicts[1])
print(final)
# In the future we would call: functools.reduce(merge_dicts, dicts) to merge all accounts
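I may be going about this completely the wrong way, and I'd be interested in other opinions. Otherwise, does anyone have insight into how to make this work with the remaining accounts in the example string?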
Answer 0 (score: 3)
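That took me a lot of time to sort out in my head. The dictionaries are simple: each has one key, which always has a list as its value. In effect, they are used to give a list a name.
Inside each list there will be either a string or another dictionary (with a key that maps to a list).
That means we can split 'Assets:Bank:Car' and look in the root list for a dictionary matching {"Assets":[<whatever>]}, or add one, and then jump two levels deeper into the [<whatever>] list. On the next loop iteration, look for a dictionary matching {"Bank":[<whatever>]}, or add one, and jump into the deeper [<whatever>] list. Keep doing that until we reach the last node, Car. At that point we must be on a list, because we always either jumped to an existing list or created a new one, so we put Car in the current list.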
NB. This approach would break if you had:
Assets:Reserved
Assets:Reserved:Painting
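But that would be nonsensical, conflicting input, asking "Reserved" to be both a leaf node and a container; in that situation you would only have: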
Assets:Reserved:Painting
Right?
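One way to check whether an input contains this kind of conflict is sketched below. `find_conflicts` is a hypothetical helper introduced here for illustration, not part of the answer's code; it assumes accounts are plain colon-separated strings and simply reports every account that is also used as the parent of another account.

```python
def find_conflicts(accounts):
    """Return accounts that are also used as a parent of another account."""
    names = set(accounts)
    conflicts = set()
    for account in accounts:
        parts = account.split(":")
        # Every proper prefix of an account acts as a container name.
        for end in range(1, len(parts)):
            parent = ":".join(parts[:end])
            if parent in names:
                conflicts.add(parent)
    return sorted(conflicts)


print(find_conflicts(["Assets:Reserved", "Assets:Reserved:Painting"]))
# prints ['Assets:Reserved']
```

Running a check like this before building the tree would let conflicting lines be reported or dropped up front, under the assumption that a name should be either a leaf or a container, never both.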
data = """
Assets:Bank:Car
Assets:Bank:House
Assets:Savings:Emergency
Assets:Savings:Goals:Roof
Assets:Reserved
"""

J = []

for line in data.split('\n'):
    if not line:
        continue

    # split the line into parts, start at the root list
    # is there a dict here for this part?
    #   yes? cool, dive into it for the next loop iteration
    #   no? add one, with a list, ready for the next loop iteration
    #       (unless we're at the final part, then stick it in the list
    #        we made/found in the previous loop iteration)
    parts = line.split(':')
    parent_list, current_list = J, J

    for index, part in enumerate(parts):
        for item in current_list:
            # only dicts are containers; a bare `part in item` would also
            # do a substring match against leaf strings
            if isinstance(item, dict) and part in item:
                parent_list, current_list = current_list, item[part]
                break
        else:
            if index == len(parts) - 1:
                # leaf node, add part as string
                current_list.append(part)
            else:
                new_list = []
                current_list.append({part: new_list})
                parent_list, current_list = current_list, new_list

print(J)
->
[{'Assets': [{'Bank': ['Car', 'House']}, {'Savings': ['Emergency', {'Goals': ['Roof']}]}, 'Reserved']}]
Try it online: https://repl.it/Ci5L