Summing nested dictionary entries

Time: 2014-11-30 18:59:57

Tags: python json dictionary nested itertools

I have a JSON file that I am reading in as a dictionary. What I have is:

        "20101021": {
            "4x4": {
                "Central Spectrum": 5, 
                "Full Frame": 5, 
                "Custom": 1
            }, 
            "4x2": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "1x1": {
                "Central Spectrum": 5, 
                "Full Frame": 4
            }, 
        }, 
        "20101004": {
            "4x4": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "4x2": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }, 
            "1x1": {
                "Central Spectrum": 5, 
                "Full Frame": 5
            }

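and so on. I am trying to compute the sums (over all dates) for every combination of 1x1, 4x2 (etc.) with Central Spectrum and Full Frame; in this example, I want to add up the 5s.

What I have so far (using itertools and Counter()):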

bins = map("x".join, itertools.product('124', repeat=2))
rois = ['Full Frame', 'Central Spectrum']
types = itertools.product(bins, rois)
c = collections.Counter(dict)
for type in types:
    print "%s : %d" % (type, c[type])

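This prints a nice list of all the combinations, but does not do any actual summing of the values. Can you help?

1 answer:

Answer 0: (score: 2)

Maybe I misunderstood the expected final result, but you might not need a Counter... A simple sum could suffice if you know you will only ever have two levels of nesting.

Let's assume you loaded your JSON dictionary of dictionaries into a variable called data.

Then you can do: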

results = {}
for key in data.keys():
    # key is '20101021', '20101004'...
    # data[key].keys() is '4x4', '4x2'... so let's make sure
    # that the result dictionary contains all those '4x4', '4x2'
    # being zero if nothing better can be calculated.
    results[key] = dict.fromkeys(data[key].keys(), 0)

    for sub_key in data[key].keys():
        # sub_key is '4x4', '4x2'...
        # Also, don't consider a 'valid value' something that is not a
        # "Central Spectrum" or a "Full Frame"
        valid_values = [
            int(v) for k, v in data[key][sub_key].items()
            if k in ["Central Spectrum", "Full Frame"]
        ]
        # Now add the 'valid_values'
        results[key][sub_key] = sum(valid_values)
print results

Which outputs:

{
  u'20101021': {u'1x1': 9, u'4x4': 10, u'4x2': 10},
  u'20101004': {u'1x1': 10, u'4x4': 10, u'4x2': 10}
}

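I used dict.keys() in most places (well, and dict.items() once) because that may make the process clearer. You also have dict.values() (and all three functions have iterator equivalents), which might shorten your code. Also, take a look at what dict.fromkeys does.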
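As a rough illustration of that last point, the loop above could be condensed with dict.items() and nested dict comprehensions. This is only a sketch: the function name sum_per_date and the inlined sample data (just the two dates from the question) are assumptions made for the example, not part of the original answer.

```python
VALID_KEYS = ("Central Spectrum", "Full Frame")


def sum_per_date(data):
    """Return {date: {binning: sum of the valid ROI counts}}."""
    return {
        date: {
            # Only count the ROIs listed in VALID_KEYS (skips 'Custom').
            binning: sum(count for roi, count in rois.items()
                         if roi in VALID_KEYS)
            for binning, rois in bins.items()
        }
        for date, bins in data.items()
    }


# Sample data copied from the question (two dates only).
data = {
    "20101021": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5, "Custom": 1},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 4},
    },
    "20101004": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 5},
    },
}

results = sum_per_date(data)
print(results["20101021"]["1x1"])  # 9 (5 + 4)
```

The result has the same shape as the output shown above; the comprehension form simply avoids the explicit dict.fromkeys initialisation.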

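Edit (as per the OP's comments on this answer)

If you want to add up the data over time (or "collect" it), then you need to change the keys of results from the date strings (as shown in the answer above) to 1x1, 4x4...: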

VALID_KEYS = ["Central Spectrum", "Full Frame"]
results = {}
for key_1 in data.keys():
    # key_1 is '20101021', '20101004'...

    for key_2 in data[key_1].keys():
        # key_2 is '4x4', '4x2'...
        if key_2 not in results:
            results[key_2] = dict.fromkeys(VALID_KEYS, 0)
        for key_3 in data[key_1][key_2].keys():
            # key_3 is 'Central Spectrum', 'Full Frame', 'Custom'...
            if key_3 in VALID_KEYS:
                results[key_2][key_3] += data[key_1][key_2][key_3]
print results
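
Since the question started from itertools.product and Counter, here is a hedged sketch of the same cross-date totals stored in a collections.Counter keyed by (binning, ROI) tuples, so that looking up each combination works the way the question intended. The helper name sum_over_dates and the inlined sample data (only the two dates shown in the question) are assumptions for illustration, not part of the original answer.

```python
import collections
import itertools

VALID_KEYS = ("Central Spectrum", "Full Frame")


def sum_over_dates(data):
    """Return a Counter mapping (binning, roi) to its total over all dates."""
    totals = collections.Counter()
    for bins in data.values():
        for binning, rois in bins.items():
            for roi, count in rois.items():
                # Ignore anything that is not a valid ROI (e.g. 'Custom').
                if roi in VALID_KEYS:
                    totals[(binning, roi)] += count
    return totals


# Sample data copied from the question (two dates only).
data = {
    "20101021": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5, "Custom": 1},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 4},
    },
    "20101004": {
        "4x4": {"Central Spectrum": 5, "Full Frame": 5},
        "4x2": {"Central Spectrum": 5, "Full Frame": 5},
        "1x1": {"Central Spectrum": 5, "Full Frame": 5},
    },
}

totals = sum_over_dates(data)

# Same combinations as in the question: '1x1', '1x2', ..., '4x4' with each ROI.
bins = ["x".join(pair) for pair in itertools.product("124", repeat=2)]
for combo in itertools.product(bins, VALID_KEYS):
    print("%s / %s : %d" % (combo[0], combo[1], totals[combo]))
```

A Counter returns 0 for combinations that never occur (such as 2x2 here) without raising KeyError, which is why the lookup loop needs no extra checks.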