I have a table:
A1, B1, C1, (value)
A1, B1, C1, (value)
A1, B1, C2, (value)
A1, B2, C1, (value)
A1, B2, C1, (value)
A1, B2, C2, (value)
A1, B2, C2, (value)
A2, B1, C1, (value)
A2, B1, C1, (value)
A2, B1, C2, (value)
A2, B1, C2, (value)
A2, B2, C1, (value)
A2, B2, C1, (value)
A2, B2, C2, (value)
A2, B2, C2, (value)
I would like to use it in Python as a dictionary of the form:
H = {
    'A1': {
        'B1': {'C1': [], 'C2': [], 'C3': []},
        'B2': {'C1': [], 'C2': [], 'C3': []},
        'B3': {'C1': [], 'C2': [], 'C3': []},
    },
    'A2': {
        'B1': {'C1': [], 'C2': [], 'C3': []},
        'B2': {'C1': [], 'C2': [], 'C3': []},
        'B3': {'C1': [], 'C2': [], 'C3': []},
    },
}
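so that H[A][B][C] yields a particular, unique list of values. For a small dictionary I might just define the structure in advance as above, but I am looking for an efficient way to iterate over the table and build the dictionary without specifying the keys ahead of time.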
Answer 0 (score: 9)
input = [('A1', 'B1', 'C1', 'Value'), (...)]

from collections import defaultdict

# Each level creates the next level on first access; the innermost level is a list.
tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
# Alternatively, you could use partial() (from functools) rather than lambda:
# tree = defaultdict(partial(defaultdict, partial(defaultdict, list)))

for x, y, z, value in input:
    tree[x][y][z].append(value)
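One thing worth knowing about this approach: simply reading a missing key on a defaultdict creates it, so a stray lookup such as tree['A9'] silently adds an empty branch. The following is a minimal sketch, not part of the original answer, that builds the tree from a small made-up `rows` list and then converts it to ordinary dicts with a hypothetical helper named `to_plain_dict`:

```python
from collections import defaultdict

# Hypothetical sample rows (A, B, C, value); not data from the question.
rows = [
    ("A1", "B1", "C1", 10),
    ("A1", "B1", "C1", 11),
    ("A1", "B2", "C2", 20),
    ("A2", "B1", "C1", 30),
]

tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
for a, b, c, value in rows:
    tree[a][b][c].append(value)


def to_plain_dict(node):
    """Recursively convert nested defaultdicts into plain dicts (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: to_plain_dict(child) for key, child in node.items()}
    return node  # leaves (the value lists) are returned unchanged


plain = to_plain_dict(tree)
print(plain["A1"]["B1"]["C1"])  # [10, 11]
```

After the conversion, looking up a key that was never loaded raises KeyError instead of quietly inserting it.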
Answer 1 (score: 4)
If you only ever access H[A][B][C] (that is, never H[A] or H[A][B] on their own), I would suggest what is, IMO, a cleaner solution: use tuples as the defaultdict keys:
from collections import defaultdict

h = defaultdict(list)
for a, b, c, value in input:
    h[a, b, c].append(value)
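To make the trade-off concrete, here is a small sketch using made-up rows (the `rows` list and the `under_a1` name are illustrative assumptions, not from the answer). A full-key lookup is direct, but there is no h['A1'], so gathering everything under one top-level key means scanning all keys:

```python
from collections import defaultdict

# Hypothetical sample rows (A, B, C, value); not data from the question.
rows = [
    ("A1", "B1", "C1", 10),
    ("A1", "B1", "C1", 11),
    ("A1", "B2", "C2", 20),
    ("A2", "B1", "C1", 30),
]

h = defaultdict(list)
for a, b, c, value in rows:
    h[a, b, c].append(value)

# Full-key lookup: h[a, b, c] is shorthand for h[(a, b, c)].
print(h["A1", "B1", "C1"])  # [10, 11]

# Partial lookup is not available directly; filter the keys instead.
under_a1 = {key: values for key, values in h.items() if key[0] == "A1"}
print(sorted(under_a1))  # [('A1', 'B1', 'C1'), ('A1', 'B2', 'C2')]
```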
Answer 2 (score: 2)
d = {}
for (a, b, c, value) in your_table_of_tuples:
    d.setdefault(a, {}).setdefault(b, {}).setdefault(c, []).append(value)
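Answer 3 (score: 0)
But what if you don't have three levels, but possibly ten? You could certainly do this with a while loop, but this is what I came up with (warning: heavy reliance on mutability and references):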
def build_tree(data, categories):
    """Build a dependency tree based on a Pandas DataFrame and an
    ordered list of levels.

    Parameters
    ----------
    data : pandas.core.frame.DataFrame
        A DataFrame containing the table to derive tree from
    categories : array_like
        An ordered, sliceable list of column names to include in tree

    Returns
    -------
    hierarchy : dict
        A standard Python dictionary
    """
    hierarchy = {}

    def expand(data, categories, current_level):
        if len(categories) == 2:
            # Base case: map each key to the values in the last column.
            for value in data[categories[0]].unique():
                current_level[value] = data.loc[data[categories[0]] == value, categories[1]]
        else:
            for value in data[categories[0]].unique():
                # Placeholder children; the recursive call fills them in.
                current_level[value] = {e: None for e in data.loc[data[categories[0]] == value, categories[1]].unique()}
                expand(data.loc[data[categories[0]] == value, :], categories[1:], current_level[value])

    expand(data, categories, hierarchy)
    return hierarchy
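I feel that some of the for loops could be improved, but isn't that always the case?
This also relies on a single key/value pair being found at the bottom of the hierarchy.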
Answer 4 (score: 0)
This will work for any number of keys:
def parse_table_to_tree(table):
    tree = dict()
    for row in table:
        # Every column but the last is a key; the last column is the leaf value.
        tree = attach_leaf(tree, row[:-1], row[-1])
    return tree


def attach_leaf(tree: dict, keys: list, value):
    d = tree
    for i, key in enumerate(keys):
        if i < len(keys) - 1:
            # Walk down, creating intermediate dicts as needed.
            d = d.setdefault(key, {})
        else:
            # Last key: store the value (overwrites any existing leaf).
            d[key] = value
    return tree


tree = parse_table_to_tree(
    [
        ['a', 'b', 'c', []],
        ['a', 'b', 'd', []],
        ['a', 'f', 'a', []]
    ]
)
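Note that the code above overwrites the leaf when two rows share the same keys, whereas the question wants the values under each key path collected in a list. A possible variation, sketched here under the assumption that every row has at least one key column followed by one value (the name `build_list_tree` and the sample rows are made up for illustration):

```python
def build_list_tree(table):
    """Nest every column but the last; collect last-column values in lists.

    Hypothetical variant: assumes each row has at least two columns.
    """
    tree = {}
    for row in table:
        *keys, value = row
        node = tree
        # Descend through all keys except the last, creating dicts on the way.
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        # The last key holds a list that accumulates values across rows.
        node.setdefault(keys[-1], []).append(value)
    return tree


list_tree = build_list_tree(
    [
        ("A1", "B1", "C1", 1),
        ("A1", "B1", "C1", 2),
        ("A1", "B2", "C1", 3),
    ]
)
print(list_tree)  # {'A1': {'B1': {'C1': [1, 2]}, 'B2': {'C1': [3]}}}
```

If duplicate values should be dropped, one option is to collect into a set instead of a list, at the cost of losing row order.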