Creating a nested dictionary from a flat list with Python

Date: 2014-04-15 16:04:06

Tags: algorithm python-3.x recursion

I have a list of files in this form:

base/images/graphs/one.png
base/images/tikz/two.png
base/refs/images/three.png
base/one.txt
base/chapters/two.txt

I want to convert them into a nested dictionary like this:

{ "name": "base" , "contents": 
  [{"name": "images" , "contents":
    [{"name": "graphs", "contents":[{"name":"one.png"}] },
     {"name":"tikz",     "contents":[{"name":"two.png"}]}
    ]
   }, 
   {"name": "refs", "contents":
    [{"name":"images", "contents": [{"name":"three.png"}]}]
   },
   {"name":"one.txt"},
   {"name": "chapters", "contents":[{"name":"two.txt"}]}
  ]
 }
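The trouble is that with my attempted solution, given input such as "images/datasetone/grapha.png" and "images/datasetone/graphb.png", each file ends up in its own separate dictionary named "datasetone". I want both files under the same parent dictionary, since they are in the same directory. How can I build this nested structure without duplicating the parent dictionaries when several files share a common path?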

Here is what I came up with, which fails:

def path_to_tree(params):
    start = {}
    for item in params:
        parts = item.split('/')
        depth = len(parts)
        if depth > 1: 
            if "contents" in start.keys():
                start["contents"].append(create_base_dir(parts[0],parts[1:]))
            else:
                start ["contents"] = [create_base_dir(parts[0],parts[1:]) ]
        else:
            if "contents" in start.keys():
                start["contents"].append(create_leaf(parts[0]))
            else:
                start["contents"] =[ create_leaf(parts[0]) ]
    return start


def create_base_dir(base, parts):
    l={}
    if len(parts) >=1:
        l["name"] = base 
        l["contents"] = [  create_base_dir(parts[0],parts[1:]) ]
    elif len(parts)==0:
        l = create_leaf(base)
    return l 


def create_leaf(base): 
    l={}
    l["name"] = base
    return l 

b=["base/images/graphs/one.png","base/images/graphs/oneb.png","base/images/tikz/two.png","base/refs/images/three.png","base/one.txt","base/chapters/two.txt"]
d =path_to_tree(b)
from pprint import pprint
pprint(d)

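In this example you can see that we end up with many dictionaries named "base", one for each file in the list. Only one is needed, and the subdirectories should be listed in its "contents" array.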

2 Answers:

Answer 0: (score: 1)

This does not assume that all paths start with the same component, so the top level has to be a list:

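from pprint import pprint

def addBits2Tree(bits, tree):
    # bits: remaining path components; tree: list of sibling nodes
    if len(bits) == 1:
        # The last component is the file itself: add a leaf
        tree.append({'name': bits[0]})
    else:
        # Reuse an existing directory node with this name, if there is one
        for t in tree:
            if t['name'] == bits[0]:
                addBits2Tree(bits[1:], t['contents'])
                return
        # Otherwise create the directory node and fill it recursively
        newTree = []
        addBits2Tree(bits[1:], newTree)
        t = {'name': bits[0], 'contents': newTree}
        tree.append(t)

def addPath2Tree(path, tree):
    bits = path.split("/")
    addBits2Tree(bits, tree)

# b is the list of paths defined in the question
tree = []
for p in b:
    print(p)
    addPath2Tree(p, tree)
pprint(tree)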

For your example list of paths, this produces the following:

[{'contents': [{'contents': [{'contents': [{'name': 'one.png'},
                                           {'name': 'oneb.png'}],
                              'name': 'graphs'},
                             {'contents': [{'name': 'two.png'}],
                              'name': 'tikz'}],
                'name': 'images'},
               {'contents': [{'contents': [{'name': 'three.png'}],
                              'name': 'images'}],
                'name': 'refs'},
               {'name': 'one.txt'},
               {'contents': [{'name': 'two.txt'}], 'name': 'chapters'}],
  'name': 'base'}]
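
When every path is known to share a single root, as in the question, the same lookup-before-create idea can also be written iteratively, and the one root dictionary can be taken from the resulting list. The following is only a minimal sketch under that assumption, not the answerer's code; the function name `paths_to_tree` is hypothetical and introduced here for illustration.

```python
def paths_to_tree(paths):
    """Build a list of {"name", "contents"} nodes from slash-separated paths.

    Hypothetical helper for illustration; assumes "/" is the separator.
    """
    roots = []
    for path in paths:
        parts = path.split("/")
        level = roots  # list of sibling nodes at the current depth
        for part in parts[:-1]:
            # Reuse the directory node if it already exists at this level.
            # Checking for "contents" avoids matching a file of the same name.
            node = next(
                (n for n in level if n["name"] == part and "contents" in n),
                None,
            )
            if node is None:
                node = {"name": part, "contents": []}
                level.append(node)
            level = node["contents"]
        # The last component is the file itself, stored as a leaf.
        level.append({"name": parts[-1]})
    return roots


paths = [
    "base/images/graphs/one.png",
    "base/images/graphs/oneb.png",
    "base/images/tikz/two.png",
    "base/refs/images/three.png",
    "base/one.txt",
    "base/chapters/two.txt",
]
tree = paths_to_tree(paths)
root = tree[0]  # valid only because all paths start with "base"
```

Here `root` is a single dictionary named "base", which is the shape asked for in the question. If the paths could start with different components, `tree` would hold several roots and indexing `tree[0]` would silently drop the others.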

Answer 1: (score: 0)

If you leave out the redundant "name" keys, you could do it like this:

import json

result = {}

records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
        "base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

# Split each path into its components
recordsSplit = [x.split("/") for x in records]

for record in recordsSplit:
    here = result
    # Walk down one dict per directory, creating them as needed
    for item in record[:-1]:
        if item not in here:
            here[item] = {}
        here = here[item]
    # Files are collected in a list under a special key
    if "###content###" not in here:
        here["###content###"] = []
    here["###content###"].append(record[-1])

print(json.dumps(result, indent=4))

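The # characters in the "###content###" key are there for uniqueness (the hierarchy could contain a folder that is itself named content). Just run it and look at the result.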

Edit: fixed some typos, added the output.
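
The dict-of-dicts layout from this answer differs from the "name"/"contents" layout requested in the question. As a rough sketch, one way to convert between them might look like the following; `to_named_tree` and `CONTENT_KEY` are hypothetical names introduced here, and the `nested` literal is written out by hand to match what the loop above builds for the five example records.

```python
import json

CONTENT_KEY = "###content###"  # the special key used for file lists


def to_named_tree(name, node):
    """Convert one directory dict into a {"name", "contents"} node.

    Hypothetical helper for illustration only.
    """
    # Subdirectories first: every key except the special file-list key.
    contents = [
        to_named_tree(child, sub)
        for child, sub in node.items()
        if child != CONTENT_KEY
    ]
    # Then the files of this directory, as leaves without "contents".
    contents += [{"name": f} for f in node.get(CONTENT_KEY, [])]
    return {"name": name, "contents": contents}


# Hand-written copy of the structure built for the example records.
nested = {
    "base": {
        "images": {
            "graphs": {CONTENT_KEY: ["one.png"]},
            "tikz": {CONTENT_KEY: ["two.png"]},
        },
        "refs": {"images": {CONTENT_KEY: ["three.png"]}},
        CONTENT_KEY: ["one.txt"],
        "chapters": {CONTENT_KEY: ["two.txt"]},
    }
}

converted = [to_named_tree(name, sub) for name, sub in nested.items()]
print(json.dumps(converted[0], indent=4))
```

Note that this sketch lists subdirectories before files within each directory, so "one.txt" comes after "chapters" rather than before it as in the question's expected output.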