Creating different lists of dicts from a master list of dicts

Time: 2014-08-22 06:51:33

Tags: python list dictionary

I have a list of dictionaries as follows:

listDict =[{'name':'A',
        'fun':'funA', 
        'childs':[{'name':'B',
                   'fun':'funB', 
                   'childs':[{ 'name':'D',
                               'fun':'funD'}]}, 
                  {'name':'C',
                   'fun':'funC', 
                   'childs':[{ 'name':'E',
                               'fun':'funE'},
                             { 'name':'F',
                               'fun':'funF'},
                             { 'name':'G',
                               'fun':'funG',
                               'childs' :[{ 'name':'H',
                                            'fun':'funH'}]}]}]}, 
       {'name':'Z',
        'fun':'funZ'}]

I want to create three lists of dicts from it:

1. Nodes with no children and no parent:

lod1 = [{'name':'Z',
         'fun':'funZ'}]

2. Nodes with no children but with a parent, keyed by the parent's name:

lod2 = [{'B':[{ 'name':'D',
                'fun':'funD'}]},
        {'C':[{'name':'E',
               'fun':'funE'},
              {'name':'F',
               'fun':'funF'}]},
        {'G':[{ 'name':'H',
                'fun':'funH'}]
        }]

3. Only nodes that are both a child and a parent, as a flat list keyed by the parent's name:

lod3 = [{'A': [{ 'name':'B',
                 'fun':'funB'},
               {'name':'C',
                'fun':'funC'}]},
        {'C': [{'name':'G',
                'fun':'funG'}]
        }]

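Is there any possible way to do this, with or without recursion? The purpose of this split is that I am trying to create a flat class structure in which all nodes in category 1 (no children and no parent) are added as functions of the final class. All nodes that have no children but do have a parent (category 2) are added as functions of the corresponding parent class. The remaining nodes that are both parent and child (category 3) are created as classes that hold instances of their children.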

1 answer:

Answer 0: (score: 0)

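This is a task that the Visitor pattern suits well. You have a tree-like structure that you want to traverse while accumulating three different sets of information.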

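To implement this well, you should separate traversing the structure from collecting the data. That way you only need to define the different forms of data collection rather than reimplementing the traversal each time. So let's begin.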

The visitor takes a dictionary, inspects it, and then visits every dictionary in its childs list (you may want to rename that key to children).

from abc import ABCMeta, abstractmethod

class DictionaryVisitor(object):
    # Python 2 syntax; on Python 3 declare the class with metaclass=ABCMeta instead
    __metaclass__ = ABCMeta

    @abstractmethod
    def visit(self, node, parents, result):
        """ This inspects the current node and
            accumulates any data into result """
        pass  # Implement this in the subclasses

    def accept(self, node, parents, result):
        """ This tests if the node should be traversed. This is an efficiency
            improvement to prevent traversing lots of nodes that you have no
            interest in """
        return True

    def traverse(self, node, parents, result):
        """ This traverses the dictionary, visiting each node in turn """
        if not self.accept(node, parents, result):
            return

        self.visit(node, parents, result)
        if 'childs' in node:
            for child in node['childs']:
                self.traverse(child, parents + [node], result)

    def start(self, dict_list):  # bad method name
        """ This just handles the parents and result argument of traverse """
        # Assuming that result is always a list is not normally appropriate
        result = []
        for node in dict_list:
            self.traverse(node, [], result)
        return result
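
The question also asks whether this can be done without recursion. As a minimal sketch (not part of the visitor class above), the same pre-order walk can be driven by an explicit stack. The names `iter_nodes` and `sample` are hypothetical, and this version does not implement the `accept` pruning step.

```python
def iter_nodes(dict_list):
    """Yield (node, parents) pairs in the same pre-order as a recursive walk."""
    # Each stack entry is (node, list of ancestors). The input is reversed so
    # that the first top-level node is popped first.
    stack = [(node, []) for node in reversed(dict_list)]
    while stack:
        node, parents = stack.pop()
        yield node, parents
        # Push children in reverse so they are popped in their original order.
        for child in reversed(node.get('childs', [])):
            stack.append((child, parents + [node]))


# Small sample in the same shape as listDict from the question.
sample = [{'name': 'A', 'fun': 'funA',
           'childs': [{'name': 'B', 'fun': 'funB',
                       'childs': [{'name': 'D', 'fun': 'funD'}]},
                      {'name': 'C', 'fun': 'funC'}]},
          {'name': 'Z', 'fun': 'funZ'}]

visit_order = [(node['name'], [p['name'] for p in parents])
               for node, parents in iter_nodes(sample)]
print(visit_order)
```

Each yielded pair carries the same `node` and `parents` information that `visit` receives, so the collection logic could be reused unchanged.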

You can then implement each required output as a subclass of the DictionaryVisitor abstract base class:

class ParentlessChildlessVisitor(DictionaryVisitor):
    def visit(self, node, parents, result):
        """ Collect the nodes that have no parents or children """
        # parent filtering is performed in accept
        if 'childs' not in node:
            result.append(node)

    def accept(self, node, parents, result):
        """ Reject all nodes with parents """
        return not parents

You can then call it like this:

visitor = ParentlessChildlessVisitor()
results = visitor.start(listDict)  # listDict is the list from the question
print(results)
# prints [{'fun': 'funZ', 'name': 'Z'}] (key order may differ by Python version)

Next:

from collections import defaultdict

class ChildlessChildVisitor(DictionaryVisitor):
    def visit(self, node, parents, result):
        """ Collect the nodes that have parents but no children """
        if parents and 'childs' not in node:
            # slightly odd data structure here, a list of dicts where the only
            # dict key is unique. It would be better to be a plain dict, which
            # is what is done here:
            result[parents[-1]['name']].append(node)

    def start(self, dict_list):
        """ This just handles the parents and result argument of traverse """
        # Here it is much better to have a dict as the result.
        # This is an example of why wrapping all this logic in the start method
        # is not normally appropriate.
        result = defaultdict(list)
        for node in dict_list:
            self.traverse(node, [], result)
        return result

visitor = ChildlessChildVisitor()
results = visitor.start(listDict)
print(dict(results))
# prints {'C': [{'fun': 'funE', 'name': 'E'}, {'fun': 'funF', 'name': 'F'}], 'B': [{'fun': 'funD', 'name': 'D'}], 'G': [{'fun': 'funH', 'name': 'H'}]}
# (key order may differ by Python version)
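
If the list-of-single-key-dicts shape from the question (lod2) is really needed, the plain dict can be converted afterwards. The snippet below is a small sketch that uses a hand-built stand-in for the visitor's result so that it runs on its own; sorting by parent name is an assumption made only to get a stable order.

```python
from collections import defaultdict

# Stand-in for the defaultdict returned by ChildlessChildVisitor.start().
results = defaultdict(list)
results['B'].append({'name': 'D', 'fun': 'funD'})
results['C'].append({'name': 'E', 'fun': 'funE'})
results['C'].append({'name': 'F', 'fun': 'funF'})
results['G'].append({'name': 'H', 'fun': 'funH'})

# One single-key dict per parent, sorted by parent name for a stable order.
lod2 = [{parent: children} for parent, children in sorted(results.items())]
print(lod2)
```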

I'm not entirely clear what you want to collect for the third list (lod3), so you'll have to handle that one yourself.
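
For what it's worth, one possible reading of the third list is "nodes that have both a parent and children, grouped by parent name, with the childs key stripped". The sketch below follows that assumption only. `ParentWithChildrenVisitor` is a hypothetical name, and the base class is a condensed copy (without the ABC machinery) so the block runs on its own.

```python
from collections import defaultdict


class DictionaryVisitor(object):
    """Condensed copy of the base class above, so this sketch runs alone."""

    def visit(self, node, parents, result):
        raise NotImplementedError

    def accept(self, node, parents, result):
        return True

    def traverse(self, node, parents, result):
        if not self.accept(node, parents, result):
            return
        self.visit(node, parents, result)
        for child in node.get('childs', []):
            self.traverse(child, parents + [node], result)


class ParentWithChildrenVisitor(DictionaryVisitor):
    """Hypothetical visitor: collects nodes that have a parent AND children."""

    def visit(self, node, parents, result):
        if parents and 'childs' in node:
            # Copy the node without its 'childs' key to keep the output flat.
            flat = {k: v for k, v in node.items() if k != 'childs'}
            result[parents[-1]['name']].append(flat)

    def start(self, dict_list):
        result = defaultdict(list)
        for node in dict_list:
            self.traverse(node, [], result)
        return result


# The data from the question.
listDict = [
    {'name': 'A', 'fun': 'funA', 'childs': [
        {'name': 'B', 'fun': 'funB',
         'childs': [{'name': 'D', 'fun': 'funD'}]},
        {'name': 'C', 'fun': 'funC', 'childs': [
            {'name': 'E', 'fun': 'funE'},
            {'name': 'F', 'fun': 'funF'},
            {'name': 'G', 'fun': 'funG',
             'childs': [{'name': 'H', 'fun': 'funH'}]}]}]},
    {'name': 'Z', 'fun': 'funZ'},
]

lod3 = dict(ParentWithChildrenVisitor().start(listDict))
print(lod3)
```

Under this reading, B and C are grouped under A and G under C, which matches the contents of lod3 in the question, although as a plain dict rather than a list of single-key dicts.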