Title: Recursing a dictionary of lists of dictionaries, etc. (Python)
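I'm working with a nested dictionary structure four levels deep. I'm trying to iterate over the whole thing and give each individual dictionary an identification number (as a precursor to building a tree of the items, so that I can tell which item node is the parent, which children a node has, and so on).
I have this function: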
def r(y):
    cnt = 1
    def recurse(y, count):
        for i in y.iteritems():
            count+=1
            i['id'] = count
        for k,v in y.iteritems():
            if isinstance(v, list):
                [recurse(i, count) for i in v]
            else:
                pass
    recurse(y, cnt)
    return y
I feed it my nested dictionary of lists of dictionaries and get a mess back, i.e. it doesn't do what I thought it would:
{'sections': [{'id': 11, 'info': 'This is section ONE', 'tag': 's1'},
              {'fields': [{'id': 15,
                           'info': 'This is field ONE',
                           'tag': 'f1'},
                          {'elements': [{'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e1',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e2',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e3',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e4',
                                         'type_of': 'text_field'}],
                           'id': 16,
                           'info': 'This is field TWO',
                           'tag': 'f2'},
                          {'elements': [{'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e5',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e6',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e7',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element ONE',
                                         'tag': 'e8',
                                         'type_of': 'text_field'}],
                           'id': 16,
                           'info': 'This is field THREE',
                           'tag': 'f3'}],
               'id': 12,
               'info': 'This is section TWO',
               'tag': 's2'},
              {'fields': [{'id': 15,
                           'info': 'This is field FOUR',
                           'tag': 'f4'},
                          {'id': 15,
                           'info': 'This is field FIVE',
                           'tag': 'f5'},
                          {'id': 15,
                           'info': 'This is field SIX',
                           'tag': 'f6'}],
               'id': 12,
               'info': 'This is section THREE',
               'tag': 's3'}],
 'tag': 'test'}
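What I want to happen is for all items on the first level to be numbered, then all items on the second level, then the third level, then the fourth. In this case the main item should get an id of 1, the sections should then be identified as 2, 3, 4, the fields as 5 onward, then the elements, and so on. Looking back at it after sleeping on it, I can see it as a start, but it's quite wrong.
EDIT: What I really need to do is create a tree of parent/child nodes from the nested dictionary structure, so that I can iterate over, insert into, get from, and otherwise work with the items in this tree as needed. Is there a quick way to do that? I seem to be doing more work than I expected.
EDIT2: I found a solution to the original question. I decided to use the built-in id() function instead of the extra step of adding an id, and was able to create the minimal tree I needed, but this is still a useful exercise.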
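For readers wondering what the id()-based approach from EDIT2 might look like, here is a minimal sketch. It assumes every node is a dict whose children sit in list-valued keys; the names build_index, nodes, parent_of and children_of are hypothetical and do not come from the original post. Keep in mind that id() values are only guaranteed unique while the objects are alive, so the index is valid only as long as the tree stays in memory.

```python
def build_index(root):
    """Map id(node) -> node, parent id, and child ids for a nested dict tree."""
    nodes, parent_of, children_of = {}, {}, {}
    stack = [(root, None)]  # (node, id of its parent)
    while stack:
        node, parent_id = stack.pop()
        nid = id(node)
        nodes[nid] = node
        parent_of[nid] = parent_id
        children_of[nid] = []
        for value in node.values():
            if isinstance(value, list):
                for child in value:
                    if isinstance(child, dict):
                        children_of[nid].append(id(child))
                        stack.append((child, nid))
    return nodes, parent_of, children_of


# Small hypothetical tree in the same shape as the question's data.
tree = {"tag": "test",
        "sections": [{"tag": "s1"},
                     {"tag": "s2", "fields": [{"tag": "f1"}]}]}
nodes, parent_of, children_of = build_index(tree)
s2 = tree["sections"][1]
print([nodes[c]["tag"] for c in children_of[id(s2)]])  # tags of s2's children
print(nodes[parent_of[id(s2)]]["tag"])                 # tag of s2's parent
```

Nothing is written into the dictionaries themselves, which is the main appeal of this variant over adding an 'id' key.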
Answer 0 (score: 0)
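You get duplicate IDs because your count variable is local: it is an integer, so any change made to it inside one recurse call is lost as soon as that call returns. You could work around this by declaring a global variable, but since you aren't using the return value of recurse, you can use that to pass the updated count back instead: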
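def r(y):
    def recurse(y, count):
        # Number this dict, then pass the running count to each child and
        # take back the updated value so that no ID is reused.
        y['id'] = count
        count += 1
        for k, v in y.items():  # Python 3; the original used y.iteritems() (Python 2)
            if isinstance(v, list):
                for i in v:
                    count = recurse(i, count)
        return count
    recurse(y, 1)
    return y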
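Edit: I just realized that you're looking for a breadth-first assignment of IDs. This won't achieve that (it numbers depth-first), but I'll leave the answer up since it may still help you.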
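A breadth-first variant could be sketched along the following lines. This is not part of the original answer: the function name number_breadth_first is hypothetical, and it assumes that children are always dicts held in list-valued keys. A queue (collections.deque) replaces the recursion, so every node on one level is numbered before any node on the next.

```python
from collections import deque


def number_breadth_first(root, start=1):
    """Assign increasing 'id' values level by level (breadth-first)."""
    next_id = start
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Gather the children first, then add the 'id' key to this node.
        children = [child
                    for value in node.values() if isinstance(value, list)
                    for child in value if isinstance(child, dict)]
        node["id"] = next_id
        next_id += 1
        queue.extend(children)
    return root


# Small hypothetical tree in the same shape as the question's data.
tree = {"tag": "test",
        "sections": [{"tag": "s1"},
                     {"tag": "s2", "fields": [{"tag": "f1"}, {"tag": "f2"}]},
                     {"tag": "s3", "fields": [{"tag": "f4"}]}]}
number_breadth_first(tree)
print(tree["id"], [s["id"] for s in tree["sections"]])  # 1 [2, 3, 4]
```

With this input the root gets 1, the sections get 2 to 4, and the fields get 5 to 7, which matches the ordering the question asks for.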
Answer 1 (score: 0)
Well, I have a solution that uses depth and parent to set the IDs:
>>> def decorate_tree(tree, parent=None, index=None):
        if isinstance(tree, dict):
            if parent is None:
                # The root has no parent and gets the ID '1'.
                parent = '1'
                tree['id'] = parent
            else:
                # Every other node appends its position to its parent's ID.
                tree['id'] = '{0}.{1}'.format(parent, index)
            if 'info' in tree:
                print(tree['info'], '=>', tree['id'])
            child_index = 1
            for key in tree:
                if isinstance(tree[key], list):
                    for item in tree[key]:
                        decorate_tree(item, tree['id'], child_index)
                        child_index += 1
>>> decorate_tree(d)
This is section ONE => 1.1
This is section TWO => 1.2
This is field ONE => 1.2.1
This is field TWO => 1.2.2
This is element => 1.2.2.1
This is element => 1.2.2.2
This is element => 1.2.2.3
This is element => 1.2.2.4
This is field THREE => 1.2.3
This is element => 1.2.3.1
This is element => 1.2.3.2
This is element => 1.2.3.3
This is element ONE => 1.2.3.4
This is section THREE => 1.3
This is field FOUR => 1.3.1
This is field FIVE => 1.3.2
This is field SIX => 1.3.3
>>> from pprint import pprint
>>> pprint(d)
{'id': '1',
 'sections': [{'id': '1.1', 'info': 'This is section ONE', 'tag': 's1'},
              {'fields': [{'id': '1.2.1',
                           'info': 'This is field ONE',
                           'tag': 'f1'},
                          {'elements': [{'id': '1.2.2.1',
                                         'info': 'This is element',
                                         'tag': 'e1',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.2',
                                         'info': 'This is element',
                                         'tag': 'e2',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.3',
                                         'info': 'This is element',
                                         'tag': 'e3',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.4',
                                         'info': 'This is element',
                                         'tag': 'e4',
                                         'type_of': 'text_field'}],
                           'id': '1.2.2',
                           'info': 'This is field TWO',
                           'tag': 'f2'},
                          {'elements': [{'id': '1.2.3.1',
                                         'info': 'This is element',
                                         'tag': 'e5',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.2',
                                         'info': 'This is element',
                                         'tag': 'e6',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.3',
                                         'info': 'This is element',
                                         'tag': 'e7',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.4',
                                         'info': 'This is element ONE',
                                         'tag': 'e8',
                                         'type_of': 'text_field'}],
                           'id': '1.2.3',
                           'info': 'This is field THREE',
                           'tag': 'f3'}],
               'id': '1.2',
               'info': 'This is section TWO',
               'tag': 's2'},
              {'fields': [{'id': '1.3.1',
                           'info': 'This is field FOUR',
                           'tag': 'f4'},
                          {'id': '1.3.2',
                           'info': 'This is field FIVE',
                           'tag': 'f5'},
                          {'id': '1.3.3',
                           'info': 'This is field SIX',
                           'tag': 'f6'}],
               'id': '1.3',
               'info': 'This is section THREE',
               'tag': 's3'}],
 'tag': 'test',
 'type_of': 'custom'}
>>>
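So the parent of ID 1.3.4 is ID 1.3, its siblings are the IDs 1.3.x, and its children are 1.3.4.x. That way, retrieving and inserting shouldn't be too hard (just shift the indexes).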
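Because the hierarchy is encoded in the ID string itself, parent and ancestry lookups reduce to string operations. The helpers below are a small sketch of that idea; the names parent_id, is_descendant and depth are hypothetical and not part of the original answer.

```python
def parent_id(node_id):
    """Return the parent's dotted ID, or None for the root ('1')."""
    head, sep, _ = node_id.rpartition(".")
    return head if sep else None


def is_descendant(node_id, ancestor_id):
    """True if node_id lies anywhere below ancestor_id."""
    # The trailing dot keeps '1.30' from being mistaken for a child of '1.3'.
    return node_id.startswith(ancestor_id + ".")


def depth(node_id):
    """Number of levels below the root (the root has depth 0)."""
    return node_id.count(".")


print(parent_id("1.3.4"))               # 1.3
print(is_descendant("1.3.4.2", "1.3"))  # True
print(depth("1.2.3"))                   # 2
```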
Answer 2 (score: 0)
Here is a solution that replaces the count variable with an itertools.count iterator:
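from itertools import count

def r(y):
    counter = count(1)  # start at 1 as the question asks; count() alone starts at 0
    def recurse(y, counter):
        # Set the ID before looping: adding a key while iterating over the
        # dict raises a RuntimeError in Python 3.
        y['id'] = next(counter)
        for k, v in y.items():  # y.iteritems() in Python 2
            if isinstance(v, list):
                for i in v:
                    recurse(i, counter)
    recurse(y, counter)
    return y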
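itertools.count() creates an iterator that returns the next integer each time next() is called on it. You can pass it into the recursive function and be sure that no duplicate IDs are created, because every call draws from the same iterator.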
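The key property is that the iterator carries its own state, so it keeps counting no matter which function calls next() on it. A tiny illustration, where the helper take_two is hypothetical and exists only for this demonstration:

```python
from itertools import count

counter = count(1)  # start at 1 rather than the default 0


def take_two(c):
    """Draw two values from whatever counter is passed in."""
    return next(c), next(c)


first = take_two(counter)
second = take_two(counter)
print(first, second)  # (1, 2) (3, 4)
```

Contrast this with passing a plain integer: each call would then increment its own local copy, which is exactly what produced the duplicate IDs in the question.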
Answer 3 (score: 0)
An alternative to consider is a doubly linked list. For example:
Index  Tag   Parent  Children       Info
0      test  -1      [s1,s2,s3]     ""
1      s1    0       []             "This is section ONE"
2      s2    0       [f1,f2,f3]     "This is section TWO"
3      f1    2       []             "This is field ONE"
4      f2    2       [e1,e2,e3,e4]  "This is field TWO"
5      e1    4       []             "This is element"
6      e2    4       []             "This is element"
.
.
.
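This is a conceptual representation. A real implementation would use numeric row indexes in the Children column rather than tags, because your input data might be dirty, with duplicate or missing tags, and you don't want to build a structure that depends on tags being unique. Other columns can easily be added.
You would build the table by walking the tree recursively, but it may be easier to work with the items in the tree by referring to them by their rows in a flat table (a 2D list of lists).
Edit: This is an extension of your solution to the original question (an undecorated list of nodes) that adds structural information (tag, parent, children, etc.) to each node. It may be useful if you need to navigate up and down the tree.
Edit: This code: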
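def recurse(y, n=None, p=-1):
    # Avoid a mutable default argument: a shared default list would keep
    # accumulating rows across separate top-level calls.
    if n is None:
        n = []
    node = ["", p, [], "", ""]  # tag, parent, children, type, info
    vv = []
    for k, v in y.items():
        if k == "tag":
            node[0] = v
        elif k == "info":
            node[4] = v
        elif isinstance(v, list):
            node[3] = k
            vv = v
    n.append(node)
    p = len(n) - 1  # row index of the node just added
    for i in vv:
        n[p][2].append(len(n))  # the child will occupy the next free row
        n = recurse(i, n, p)
    return n

nodes = recurse(a)  # a is the nested input dictionary
for i, node in enumerate(nodes):
    print(i, node)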
produces (manually spaced into columns for readability):
0  ['test', -1, [1, 2, 14],       'sections', '']
1  [  's1',  0, [],               '',         'This is section ONE']
2  [  's2',  0, [3, 4, 9],        'fields',   'This is section TWO']
3  [  'f1',  2, [],               '',         'This is field ONE']
4  [  'f2',  2, [5, 6, 7, 8],     'elements', 'This is field TWO']
5  [  'e1',  4, [],               '',         'This is element']
6  [  'e2',  4, [],               '',         'This is element']
7  [  'e3',  4, [],               '',         'This is element']
8  [  'e4',  4, [],               '',         'This is element']
9  [  'f3',  2, [10, 11, 12, 13], 'elements', 'This is field THREE']
10 [  'e5',  9, [],               '',         'This is element']
11 [  'e6',  9, [],               '',         'This is element']
12 [  'e7',  9, [],               '',         'This is element']
13 [  'e8',  9, [],               '',         'This is element ONE']
14 [  's3',  0, [15, 16, 17],     'fields',   'This is section THREE']
15 [  'f4', 14, [],               '',         'This is field FOUR']
16 [  'f5', 14, [],               '',         'This is field FIVE']
17 [  'f6', 14, [],               '',         'This is field SIX']
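Once the tree is in this flat form, moving up and down it needs only row indexes. The sketch below uses a shortened, hand-written table in the same [tag, parent, children, type, info] layout; the helper names ancestors and descendants are hypothetical and not part of the code above.

```python
# Rows follow the layout [tag, parent, children, type, info].
nodes = [
    ["test", -1, [1, 2], "sections", ""],
    ["s1", 0, [], "", "This is section ONE"],
    ["s2", 0, [3, 4], "fields", "This is section TWO"],
    ["f1", 2, [], "", "This is field ONE"],
    ["f2", 2, [], "", "This is field TWO"],
]


def ancestors(nodes, index):
    """Row indexes from the parent of `index` up to the root."""
    path = []
    parent = nodes[index][1]
    while parent != -1:  # -1 marks the root's missing parent
        path.append(parent)
        parent = nodes[parent][1]
    return path


def descendants(nodes, index):
    """Row indexes of every node below `index`, in depth-first order."""
    found = []
    for child in nodes[index][2]:
        found.append(child)
        found.extend(descendants(nodes, child))
    return found


print(ancestors(nodes, 4))    # [2, 0]
print(descendants(nodes, 0))  # [1, 2, 3, 4]
```

Both helpers look rows up by index rather than by tag, so they keep working even when tags are duplicated or missing.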