Semi-flatten a dictionary

Time: 2013-12-10 14:08:54

Tags: python recursion dictionary

Say I have this dictionary:

"pools": {
        "JP": {
            "longName": "Jackpot",
            "poolTotal": 318400,
            "shortName": "Jpot",
            "sortOrder": 9
        }
    },

How can I output this so that I get something like this:

pools_JP_longName: Jackpot
pools_JP_poolTotal: 318400
etc
etc

The nesting should not be limited to 2 or 3 levels, so the solution should be generic.

Or another example:

{
    "soccer": {
        "X07": {
            "date": "2013-11-22",
            "poolType": "S10",
            "code": "INT",
            "closeTime": "20:00:00",
            "poolStatus": "OP",
            "pool": {
                "1": {
                    "startRace": 1,
                    "matchs": {
                        "1": {
                            "teamA": "Ajax Cape Town",
                            "teamB": "Moroka Swallows",
                            "matchStatus": "OP"
                        },
                        "2": {
                            "teamA": "Bidvest Wits",
                            "teamB": "MP Black Aces",
                            "matchStatus": "OP"
                        }
                    }
                }
            }
        }
    }
}

should look like this:

soccer_X07_date: "2013-11-22"
soccer_X07_poolType: "S10"
etc
soccer_X07_pool_1_matchs_1_teamA
soccer_X07_pool_1_matchs_1_teamB
etc

I started on it like this, but it is not correct:

def iterTool(json_data, key_string):
    for root_key, item in sorted(json_data.items(), key=itemgetter(0)):
        if type(json_data[root_key]) == dict:
            key_string += "_%s" % root_key
            if json_data[root_key].keys():
                for parent_key in json_data[root_key]:
                    if type(json_data[root_key][parent_key]) in [unicode]:
                        print "%s_%s" % (key_string, parent_key)
                        # print key_string.split("_")
                        # pass
            iterTool(json_data[root_key], key_string)

This outputs:

_soccer_X07_code
_soccer_X07_poolStatus
_soccer_X07_closeTime
_soccer_X07_poolType
_soccer_X07_date
_soccer_X07_pool_1_matchs_1_matchStatus
_soccer_X07_pool_1_matchs_1_teamA
_soccer_X07_pool_1_matchs_1_teamB
_soccer_X07_pool_1_matchs_1_10_matchStatus
_soccer_X07_pool_1_matchs_1_10_teamA
_soccer_X07_pool_1_matchs_1_10_teamB
_soccer_X07_pool_1_matchs_1_10_2_matchStatus
_soccer_X07_pool_1_matchs_1_10_2_teamA
_soccer_X07_pool_1_matchs_1_10_2_teamB
_soccer_X07_pool_1_matchs_1_10_2_3_matchStatus
_soccer_X07_pool_1_matchs_1_10_2_3_teamA
_soccer_X07_pool_1_matchs_1_10_2_3_teamB
_soccer_X07_pool_1_matchs_1_10_2_3_4_matchStatus
_soccer_X07_pool_1_matchs_1_10_2_3_4_teamA
_soccer_X07_pool_1_matchs_1_10_2_3_4_teamB
_soccer_X07_pool_1_matchs_1_10_2_3_4_5_matchStatus
_soccer_X07_pool_1_matchs_1_10_2_3_4_5_teamA
...
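
The runaway prefixes in this output (`..._matchs_1_10_2_3_...`) most likely come from `key_string += ...` running inside the loop: every sibling key is appended to the same string instead of replacing the previous sibling. The attempt also prints only `unicode` values, so integers such as `poolTotal` would be skipped. A minimal sketch of one way around both issues, using the hypothetical name `iter_paths` and sample data cut down from the soccer example, builds a fresh prefix for each child:

```python
def iter_paths(data, prefix=""):
    """Yield (flattened_key, value) pairs for every non-dict leaf in data."""
    for key in sorted(data):
        # Build a new string for this child. The caller's prefix is never
        # modified, so sibling keys cannot leak into each other's paths.
        path = "%s_%s" % (prefix, key) if prefix else str(key)
        value = data[key]
        if isinstance(value, dict):
            for pair in iter_paths(value, path):
                yield pair
        else:
            yield path, value


# Trimmed-down sample taken from the soccer example.
matchs = {"matchs": {"1": {"teamA": "Ajax Cape Town"},
                     "2": {"teamA": "Bidvest Wits"}}}
for path, value in iter_paths(matchs):
    print("%s: %s" % (path, value))
```

Because the prefix is passed down as an argument rather than accumulated in place, the recursion depth does not matter and each path reflects only its own ancestors.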

Now just another curveball...

Say the dict looks like this:

{
    "CAP": {
        "countryName": "ZAF",
        "displayName": "AN"
    },
    "SPA": {
        "countryName": "AUs",
        "displayName": "AG"
    }
}

Then flattening it completely doesn't make sense; something like this would be better:

GENERIC_KEY:CAP,countryName:ZAF,displayName:AN

How would you detect this case?
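
One possible heuristic for detecting this case, offered as an assumption rather than a definitive rule: treat a dict as a table of records when it is non-empty, every value is itself a dict, and none of those inner dicts contains a further dict. The helper names `is_record_table` and `format_records` are hypothetical; `GENERIC_KEY` is taken from the desired output above.

```python
def is_record_table(d):
    """Return True if d is a non-empty dict whose values are all flat dicts."""
    return (
        isinstance(d, dict)
        and len(d) > 0
        and all(isinstance(v, dict) for v in d.values())
        and all(not isinstance(x, dict) for v in d.values() for x in v.values())
    )


def format_records(d, key_name="GENERIC_KEY"):
    """Render each record as 'GENERIC_KEY:<key>,<field>:<value>,...'."""
    lines = []
    for key in sorted(d):
        fields = ["%s:%s" % (key_name, key)]
        fields += ["%s:%s" % (f, d[key][f]) for f in sorted(d[key])]
        lines.append(",".join(fields))
    return lines


countries = {
    "CAP": {"countryName": "ZAF", "displayName": "AN"},
    "SPA": {"countryName": "AUs", "displayName": "AG"},
}
if is_record_table(countries):
    for line in format_records(countries):
        print(line)
```

Note that under this rule the `matchs` dict in the soccer example would also count as a record table, so the check may need tightening (for example, applying it only at the top level) depending on which output is wanted there.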

3 answers:

Answer 0 (score: 2)

This flattens your dictionary recursively:

def flatten_dict(dct, output=None, prefix=None):
    if output is None:
        output = {}
    if prefix is None:
        prefix = []
    for key in dct:
        if isinstance(dct[key], dict):
            flatten_dict(dct[key], output, prefix + [key])
        else:
            output["_".join(prefix + [key])] = dct[key]
    return output

For your second example, I get:

{'soccer_X07_pool_1_matchs_2_teamA': 'Bidvest Wits', 
 'soccer_X07_pool_1_matchs_2_teamB': 'MP Black Aces',
 'soccer_X07_pool_1_matchs_1_matchStatus': 'OP', 
 ...}

Answer 1 (score: 1)

A simple solution might look like this:

d = { ... }

def flatten(dic, stack=None):
    if not stack: stack = []
    for key,value in dic.iteritems():
        new_stack = stack[:] + [key]
        if isinstance(value, dict): 
            for result in flatten(value, new_stack):
                yield result
        else: 
            yield new_stack, value

# just print it:           
for stack, value in flatten(d):
    print '{}: {}'.format('_'.join(stack), value)

# create a new dict:
new_d = {'_'.join(stack): value for stack, value in flatten(d)}
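
The code above targets Python 2 (`iteritems`, the `print` statement). A rough Python 3 equivalent, given as a sketch rather than the answerer's own code, could use `items()` and `yield from`. The `str(key)` conversion and the tuple-based `stack` are added assumptions so that non-string keys can be joined and the default argument stays immutable; the sample `d` is cut down from the first example in the question.

```python
def flatten(dic, stack=()):
    """Yield (path, value) pairs, where path is a tuple of keys as strings."""
    for key, value in dic.items():
        new_stack = stack + (str(key),)
        if isinstance(value, dict):
            # Delegate to the recursive generator for nested dicts.
            yield from flatten(value, new_stack)
        else:
            yield new_stack, value


d = {"pools": {"JP": {"longName": "Jackpot", "poolTotal": 318400}}}

# Print each leaf as "path: value".
for stack, value in flatten(d):
    print("{}: {}".format("_".join(stack), value))

# Or build a new flat dict.
new_d = {"_".join(stack): value for stack, value in flatten(d)}
```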

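Answer 2 (score: 0)

This is a basic recursion problem (or rather, a problem that can be solved with recursion):

Here is a contrived example that satisfies your first example (more or less):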

d = {
    "pools": {
        "JP": {
            "longName": "Jackpot",
            "poolTotal": 318400,
            "shortName": "Jpot",
            "sortOrder": 9
        }
    }
}


def flattendict(d):
    for k, v in d.items():
        if isinstance(v, dict):
            for x in flattendict(v):
                yield "{}_{}".format(k, x)
        else:
            yield "{}_{}".format(k, v)


for item in flattendict(d):
    print item

NB: I have left a few issues for you to investigate and solve yourself.