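I have a tree of nested lists and dictionaries that I need to traverse recursively, removing entire dictionaries that match certain criteria. For example, I need to remove every dictionary whose 'type' is 'Folder' and that has no children (or an empty children list).
I am still a beginner, so please forgive the brute-force approach.
Here is a sample dictionary, formatted for easy copying and pasting.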
{'children': [{'children': [{'key': 'group-1',
'name': 'PRD',
'parent': 'dc-1',
'type': 'Folder'},
{'children': [{'key': 'group-11',
'name': 'App1',
'parent': 'group-2',
'type': 'Folder'}],
'key': 'group-2',
'name': 'QA',
'parent': 'dc-1',
'type': 'Folder'},
{'key': 'group-3',
'name': 'Keep',
'parent': 'dc-1',
'type': 'Host'}],
'key': 'dc-1',
'name': 'ABC',
'parent': 'root',
'type': 'Datacenter'}],
'key': 'root',
'name': 'Datacenters',
'parent': None,
'type': 'Folder'}
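In this dictionary, the only tree that should remain is /root/dc-1/group-3. The group-11 folder should be removed first, then its parent (since it no longer has any children), and so on.
I have tried many different recursive approaches but cannot seem to get any of them to work. Any help would be greatly appreciated.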
def cleanup(tree):
    def inner(tree):
        if isinstance(tree, dict):
            if 'type' in tree and tree['type'] == 'Folder':
                if 'children' not in tree or not tree['children']:
                    print 'Deleting tree: ' + str(tree['name'])
                    if str(tree['key']) not in del_nodes:
                        del_nodes.append(str(tree['key']))
                else:
                    for item in tree.values():
                        inner(item)
                    # Delete empty folders here
                    if del_nodes:
                        print 'Perform delete here'
                        if 'children' in tree and isinstance(tree['children'], (list, tuple)):
                            getvals = operator.itemgetter('key')
                            tree['children'].sort(key=getvals)
                            result = []
                            # groupby is the wrong method. I need a list of tree['children'] that doesn't contain keys in del_nodes
                            for k, g in itertools.groupby(tree['children'], getvals):
                                result.append(g.next())
                            tree['children'][:] = result
                        del_nodes = []
            else:
                for item in tree.values():
                    inner(item)
        elif isinstance(tree, (list, tuple)):
            for item in tree:
                inner(item)
                if isinstance(item, dict):
                    if 'type' in item and item['type'] == 'Folder':
                        if 'children' not in item or not item['children']:
                            print 'Delete ' + str(item['name'])
                            if str(item['key']) not in del_nodes:
                                del_nodes.append(str(item['key']))
                elif isinstance(item, (list, tuple)):
                    if not item:
                        print 'Delete ' + str(item['name'])
                        if str(item['key']) not in del_nodes:
                            del_nodes.append(str(item['key']))
    inner(tree)
Answer 0 (score: 5)
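I suggest writing a function that walks your data structure and calls a function on each node.
Updated to avoid the "removing items from a sequence while iterating over it" error.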
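The pitfall behind that update can be reproduced in isolation. The following is a minimal sketch with made-up list contents; the function names `remove_while_iterating` and `remove_over_copy` are illustrative and not part of the question or the answer. Removing from a list while iterating over it makes the loop skip the element that follows each removal, whereas iterating over a copy visits every element.

```python
def remove_while_iterating(items, unwanted):
    """Deliberately buggy: mutates the list it is iterating over."""
    visited = []
    for x in items:
        visited.append(x)
        if x in unwanted:
            items.remove(x)  # shifts later elements left, so the next one is skipped
    return visited


def remove_over_copy(items, unwanted):
    """Iterates over a snapshot, so removals cannot disturb the loop."""
    visited = []
    for x in list(items):
        visited.append(x)
        if x in unwanted:
            items.remove(x)
    return visited


buggy = ['a', 'b', 'c', 'd']
print(remove_while_iterating(buggy, {'a', 'b'}))  # ['a', 'c', 'd'] ('b' is never visited)
print(buggy)                                      # ['b', 'c', 'd'] ('b' survives)

safe = ['a', 'b', 'c', 'd']
print(remove_over_copy(safe, {'a', 'b'}))         # ['a', 'b', 'c', 'd']
print(safe)                                       # ['c', 'd']
```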
For example, the walker and the callback could look like this:
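def walk(node, parent=None, func=None):
    # Iterate over a copy of the children so func may remove entries safely.
    for child in list(node.get('children', [])):
        walk(child, parent=node, func=func)
    # Post-order: call func only after all descendants have been visited.
    if func is not None:
        func(node, parent=parent)

def removeEmptyFolders(node, parent):
    # The root has no parent to be removed from, so leave it alone.
    if parent is None:
        return
    if node.get('type') == 'Folder' and not node.get('children'):
        parent['children'].remove(node)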
d = {'children': [{'children': [{'key': 'group-1',
'name': 'PRD',
'parent': 'dc-1',
'type': 'Folder'},
{'children': [{'key': 'group-11',
'name': 'App1',
'parent': 'group-2',
'type': 'Folder'}],
'key': 'group-2',
'name': 'QA',
'parent': 'dc-1',
'type': 'Folder'},
{'key': 'group-3',
'name': 'Keep',
'parent': 'dc-1',
'type': 'Host'}],
'key': 'dc-1',
'name': 'ABC',
'parent': 'root',
'type': 'Datacenter'}],
'key': 'root',
'name': 'Datacenters',
'parent': None,
'type': 'Folder'}
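Note: in the walk function, the line
for child in list(node.get('children',[]))
iterates over a copy of the children list, so that
parent['children'].remove(node)
can delete entries from the parent's 'children' list without any child being skipped.
Then: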
>>> walk(d,func=removeEmptyFolders)
>>> from pprint import pprint
>>> pprint(d)
{'children': [{'children': [{'key': 'group-3',
'name': 'Keep',
'parent': 'dc-1',
'type': 'Host'}],
'key': 'dc-1',
'name': 'ABC',
'parent': 'root',
'type': 'Datacenter'}],
'key': 'root',
'name': 'Datacenters',
'parent': None,
'type': 'Folder'}
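If mutating the original tree is undesirable, the same post-order idea can be written so that it returns a pruned copy instead. This is only a sketch under assumptions: the function name `prune_empty_folders` is hypothetical, and unlike the walk-based version above it returns None when the root itself ends up as an empty Folder.

```python
def prune_empty_folders(node):
    """Return a pruned copy of node, or None if node is an empty Folder.

    The input tree is left untouched.
    """
    # Prune the children first so emptiness propagates upward (post-order).
    kept = []
    for child in node.get('children', []):
        pruned = prune_empty_folders(child)
        if pruned is not None:
            kept.append(pruned)

    # A Folder with no surviving children is dropped entirely.
    if node.get('type') == 'Folder' and not kept:
        return None

    # Copy every field except 'children', then attach the pruned list
    # only if the original node had a 'children' key.
    result = {k: v for k, v in node.items() if k != 'children'}
    if 'children' in node:
        result['children'] = kept
    return result
```

Called as `pruned = prune_empty_folders(d)`, this leaves `d` unchanged; for the sample data it should yield the same tree as the pprint output above.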