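Giving each recursed item an id in a nested dictionary of lists of dictionaries

Asked: 2012-09-26 12:21:39

Tags: python dictionary tree loops

Follow-up to: recursing a dictionary of lists of dictionaries, etc et al (python)

I am working with a nested dictionary structure that is 4 levels deep. I am trying to iterate over the whole nested dictionary and give each individual dictionary an identification number (as a precursor to building a tree of the items, so that I can tell which node is an item's parent, which children a node has, and so on).

I have this function: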

def r(y):
    cnt = 1
    def recurse(y, count):
        for i in y.iteritems():
            count+=1
            i['id'] = count
            for k,v in y.iteritems():
                if isinstance(v, list):
                    [recurse(i, count) for i in v]
                else:
                    pass
    recurse(y, cnt)
    return y

I feed it my nested dictionary of lists of dictionaries.

What I get back is a mess, i.e. it does not do what I thought it would:

{'sections': [{'id': 11, 'info': 'This is section ONE', 'tag': 's1'},
              {'fields': [{'id': 15,
                           'info': 'This is field ONE',
                           'tag': 'f1'},
                          {'elements': [{'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e1',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e2',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e3',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e4',
                                         'type_of': 'text_field'}],
                           'id': 16,
                           'info': 'This is field TWO',
                           'tag': 'f2'},
                          {'elements': [{'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e5',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e6',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element',
                                         'tag': 'e7',
                                         'type_of': 'text_field'},
                                        {'id': 20,
                                         'info': 'This is element ONE',
                                         'tag': 'e8',
                                         'type_of': 'text_field'}],
                           'id': 16,
                           'info': 'This is field THREE',
                           'tag': 'f3'}],
               'id': 12,
               'info': 'This is section TWO',
               'tag': 's2'},
              {'fields': [{'id': 15,
                           'info': 'This is field FOUR',
                           'tag': 'f4'},
                          {'id': 15,
                           'info': 'This is field FIVE',
                           'tag': 'f5'},
                          {'id': 15,
                           'info': 'This is field SIX',
                           'tag': 'f6'}],
               'id': 12,
               'info': 'This is section THREE',
               'tag': 's3'}],
 'tag': 'test'}

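What I want to happen is that all items on the first level are numbered, then all items on the second level, then the third level, then the fourth. In this case the main item should get id 1, then the sections should get ids 2, 3, 4, then the fields should be numbered from 5 onward, then the elements, and so on. Looking back at it after sleeping on it, I can see my function as a start, but it is quite wrong.

EDIT: What I really need to do is create a tree of parent/child nodes from the nested dictionary structure, so that I can iterate over, insert into, get, and otherwise work with the items in this tree as needed. Is there a quick way to do that? I seem to be doing more work than I expected.

EDIT2: I found a solution to the original question. I simply decided to use the built-in id() function instead of the extra step of adding an id, and was able to create the minimal tree I needed, but this is still a useful exercise.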
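The id()-based approach from EDIT2 is not shown in the question, so the following is only a minimal sketch of what it might look like. The function name build_parent_map and the layout of the returned table are hypothetical, and the code assumes children are always dictionaries stored in list values. Note that id() values are only guaranteed unique among objects that are alive at the same time, so the table is only meaningful while the original tree is kept around.

```python
def build_parent_map(root):
    """Map id(node) -> (node, parent_key); parent_key is id(parent) or None."""
    table = {}
    stack = [(root, None)]
    while stack:
        node, parent_key = stack.pop()
        table[id(node)] = (node, parent_key)
        for value in node.values():
            if isinstance(value, list):
                # Every dict inside a list value is treated as a child node.
                stack.extend((child, id(node))
                             for child in value if isinstance(child, dict))
    return table

# Hypothetical miniature of the structure in the question.
tree = {'tag': 'test', 'sections': [{'tag': 's1'}, {'tag': 's2'}]}
table = build_parent_map(tree)
s1 = tree['sections'][0]
parent_of_s1 = table[table[id(s1)][1]][0]   # look the parent up by its key
```

Nothing is written into the dictionaries themselves; the parent link lives entirely in the side table.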

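4 Answers:

Answer 0 (score: 0)

You get duplicate ids because your count variable is local to each call of recurse: every call increments its own copy, and any change made to it is lost as soon as that call returns. You could get around this by declaring a global variable, but since you are not using the return value of recurse, you can use that to pass the updated count back instead: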

def r(y):
    def recurse(y, count):
        y['id'] = count
        count += 1
        for k,v in y.iteritems():
            if isinstance(v, list):
                for i in v:
                    count = recurse(i, count)
        return count
    recurse(y, 1)
    return y

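Edit: I just realized you are looking for a breadth-first id assignment... this won't achieve that, but I'll leave the answer up because it may still be helpful to you.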
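For the breadth-first numbering the question asks for, one possible sketch uses a queue instead of recursion. This is not part of the original answer; the function name assign_ids_breadth_first and the small sample tree are made up for illustration, and the code assumes that every dict found inside a list value is a child node.

```python
from collections import deque

def assign_ids_breadth_first(root, start=1):
    """Number every dict in the tree level by level, starting at `start`."""
    next_id = start
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Collect the children before adding the 'id' key, so the dict is
        # not modified while it is being iterated.
        children = [child
                    for value in node.values() if isinstance(value, list)
                    for child in value if isinstance(child, dict)]
        node['id'] = next_id
        next_id += 1
        queue.extend(children)
    return root

# Hypothetical miniature of the structure in the question.
tree = {'tag': 'test', 'sections': [
    {'tag': 's1'},
    {'tag': 's2', 'fields': [{'tag': 'f1'}, {'tag': 'f2'}]},
    {'tag': 's3', 'fields': [{'tag': 'f4'}]},
]}
assign_ids_breadth_first(tree)
```

On this sample the root gets id 1, the three sections get 2, 3 and 4, and the fields are numbered from 5 onward, which matches the ordering described in the question.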

Answer 1 (score: 0)

Well, I have a solution that uses depth and parent to set the IDs:

>>> def decorate_tree(tree, parent=None, index=None):
    global ID
    if type(tree) == type({}):
        if parent is None:
            parent = '1'
            tree['id'] = parent
        else:
            tree['id'] = '{0}.{1}'.format(parent, index)
        if 'info' in tree:
            print tree['info'], '=>', tree['id']
        child_index = 1
        for key in tree:
            if type(tree[key]) == type([]):
                for item in tree[key]:
                    decorate_tree(item, tree['id'], child_index)
                    child_index += 1


>>> decorate_tree(d)
This is section ONE => 1.1
This is section TWO => 1.2
This is field ONE => 1.2.1
This is field TWO => 1.2.2
This is element => 1.2.2.1
This is element => 1.2.2.2
This is element => 1.2.2.3
This is element => 1.2.2.4
This is field THREE => 1.2.3
This is element => 1.2.3.1
This is element => 1.2.3.2
This is element => 1.2.3.3
This is element ONE => 1.2.3.4
This is section THREE => 1.3
This is field FOUR => 1.3.1
This is field FIVE => 1.3.2
This is field SIX => 1.3.3
>>> from pprint import pprint
>>> pprint(d)
{'id': '1',
 'sections': [{'id': '1.1', 'info': 'This is section ONE', 'tag': 's1'},
              {'fields': [{'id': '1.2.1',
                           'info': 'This is field ONE',
                           'tag': 'f1'},
                          {'elements': [{'id': '1.2.2.1',
                                         'info': 'This is element',
                                         'tag': 'e1',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.2',
                                         'info': 'This is element',
                                         'tag': 'e2',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.3',
                                         'info': 'This is element',
                                         'tag': 'e3',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.2.4',
                                         'info': 'This is element',
                                         'tag': 'e4',
                                         'type_of': 'text_field'}],
                           'id': '1.2.2',
                           'info': 'This is field TWO',
                           'tag': 'f2'},
                          {'elements': [{'id': '1.2.3.1',
                                         'info': 'This is element',
                                         'tag': 'e5',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.2',
                                         'info': 'This is element',
                                         'tag': 'e6',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.3',
                                         'info': 'This is element',
                                         'tag': 'e7',
                                         'type_of': 'text_field'},
                                        {'id': '1.2.3.4',
                                         'info': 'This is element ONE',
                                         'tag': 'e8',
                                         'type_of': 'text_field'}],
                           'id': '1.2.3',
                           'info': 'This is field THREE',
                           'tag': 'f3'}],
               'id': '1.2',
               'info': 'This is section TWO',
               'tag': 's2'},
              {'fields': [{'id': '1.3.1',
                           'info': 'This is field FOUR',
                           'tag': 'f4'},
                          {'id': '1.3.2',
                           'info': 'This is field FIVE',
                           'tag': 'f5'},
                          {'id': '1.3.3',
                           'info': 'This is field SIX',
                           'tag': 'f6'}],
               'id': '1.3',
               'info': 'This is section THREE',
               'tag': 's3'}],
 'tag': 'test',
 'type_of': 'custom'}
>>> 

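So the parent of ID 1.3.4 is ID 1.3, its siblings are the IDs 1.3.x, and its children are 1.3.4.x ... That way retrieval and insertion shouldn't be too hard (shift the indices).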
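As a rough illustration of how those relationships can be read off the dotted IDs, here is a small sketch. The helper names parent_id, is_sibling and is_descendant are hypothetical and not part of the answer's code; they only assume the string format produced by decorate_tree above.

```python
def parent_id(node_id):
    """Return the dotted ID of the parent, or None for the root."""
    head, sep, _ = node_id.rpartition('.')
    return head if sep else None

def is_sibling(a, b):
    """Two different IDs are siblings when they share the same parent."""
    return a != b and parent_id(a) == parent_id(b)

def is_descendant(node_id, ancestor_id):
    """True if node_id lies anywhere below ancestor_id."""
    # The trailing '.' keeps '1.30' from being taken for a child of '1.3'.
    return node_id.startswith(ancestor_id + '.')
```

For example, parent_id('1.3.4') gives '1.3', and is_descendant('1.3.4.2', '1.3') is true.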

Answer 2 (score: 0)

Here is a solution that replaces the count variable with an itertools.count iterator:

from itertools import count
def r(y):
    counter = count()   # starts at 0; use count(1) to start at 1
    def recurse(y, counter):
        # Tag the dictionary itself, then descend into any list values.
        y['id'] = next(counter)
        for k, v in y.items():
            if isinstance(v, list):
                for i in v:
                    recurse(i, counter)
    recurse(y, counter)
    return y

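itertools.count() creates an iterator that returns the next integer each time next() is called on it. You can pass it into the recursive function and be sure that no duplicate IDs are created.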
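To see why this avoids duplicates, here is a tiny standalone sketch (the function take_two is made up for the demonstration). Unlike an integer argument, the counter object keeps its position between calls:

```python
from itertools import count

counter = count(1)   # count() with no argument would start at 0

def take_two(c):
    # Each next() call advances the shared iterator.
    return next(c), next(c)

first = take_two(counter)    # (1, 2)
second = take_two(counter)   # (3, 4): continues where the first call stopped
```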

Answer 3 (score: 0)

An alternative to consider is a doubly linked list. For example:

Index  Tag     Parent  Children        Info
0      test    -1      [s1,s2,s3]      ""
1      s1      0       []              "This is section ONE"
2      s2      0       [f1,f2,f3]      "This is section TWO"
3      f1      2       []              "This is field ONE"
4      f2      2       [e1,e2,e3,e4]   "This is field TWO"
5      e1      4       []              "This is element"
6      e2      4       []              "This is element"
       .
       .
       .

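This is a conceptual representation. A real implementation would use numeric row indices rather than tags in the Children column, because your input data may be dirty, with duplicate or missing tags, and you don't want to build a structure that depends on the tags being unique. Additional columns can easily be added.

You would build the table by walking the tree recursively, but the items in the tree may be easier to work with if you refer to them by their rows in a flat table (a 2D list of lists).

Edit: This is an extension of your solution to the original question (an undecorated list of nodes) that adds structural information (tag, parent, children, etc.) to each node. It may be useful if you need to navigate up and down the tree.

Edit: This code: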

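def recurse(y, n=None, p=-1):
    # Avoid a mutable default argument: a shared [] would keep growing
    # across separate top-level calls.
    if n is None:
        n = []
    node = ["", p, [], "", ""]   # tag, parent, children, type, info
    vv = []
    for k, v in y.items():
        if k == "tag":
            node[0] = v
        elif k == "info":
            node[4] = v
        elif isinstance(v, list):
            node[3] = k
            vv = v
    n.append(node)
    p = len(n) - 1   # row index of the node just added
    for i in vv:
        n[p][2].append(len(n))   # the child will occupy the next free row
        n = recurse(i, n, p)
    return n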

nodes = recurse(a)
for i in range(len(nodes)):
    print(i, nodes[i])

produces (manually spaced into columns for readability):

 0 ['test', -1, [1, 2, 14],     'sections',   '']
 1 [  's1',  0, [],             '',           'This is section ONE']
 2 [  's2',  0, [3, 4, 9],      'fields',     'This is section TWO']
 3 [  'f1',  2, [],             '',           'This is field ONE']
 4 [  'f2',  2, [5, 6, 7, 8],   'elements',   'This is field TWO']
 5 [  'e1',  4, [],             '',           'This is element']
 6 [  'e2',  4, [],             '',           'This is element']
 7 [  'e3',  4, [],             '',           'This is element']
 8 [  'e4',  4, [],             '',           'This is element']
 9 [  'f3',  2, [10, 11, 12, 13], 'elements', 'This is field THREE']
10 [  'e5',  9, [],             '',           'This is element']
11 [  'e6',  9, [],             '',           'This is element']
12 [  'e7',  9, [],             '',           'This is element']
13 [  'e8',  9, [],             '',           'This is element ONE']
14 [  's3',  0, [15, 16, 17],   'fields',     'This is section THREE']
15 [  'f4', 14, [],             '',           'This is field FOUR']
16 [  'f5', 14, [],             '',           'This is field FIVE']
17 [  'f6', 14, [],             '',           'This is field SIX']
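Once the table exists, moving up and down the tree is a matter of following row indices. The sketch below is only an illustration: the helpers path_to_root and descendants are hypothetical, and the four-row table is a hand-written subset in the same [tag, parent, children, type, info] layout rather than actual output from recurse.

```python
# Rows use the layout [tag, parent, children, type, info].
nodes = [
    ['test', -1, [1, 2], 'sections', ''],
    ['s1',    0, [],     '',         'This is section ONE'],
    ['s2',    0, [3],    'fields',   'This is section TWO'],
    ['f1',    2, [],     '',         'This is field ONE'],
]

def path_to_root(nodes, index):
    """Return the tags from the root down to the row at `index`."""
    path = []
    while index != -1:            # -1 marks "no parent"
        path.append(nodes[index][0])
        index = nodes[index][1]   # step up to the parent row
    return path[::-1]

def descendants(nodes, index):
    """Return the row indices of everything below `index`, depth-first."""
    result = []
    stack = list(reversed(nodes[index][2]))
    while stack:
        i = stack.pop()
        result.append(i)
        stack.extend(reversed(nodes[i][2]))
    return result
```

With this table, path_to_root(nodes, 3) walks f1 -> s2 -> test and returns the tags in root-first order.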