Transforming a JSON file by grouping on a field

Time: 2017-04-28 17:17:10

Tags: python json

I have a JSON file in this format:

[
    {"itemId": "1", "score": 0.2, "userId": "1", "rank": 1}, 
    {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2}, 
    {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1}, 
    {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2}, 
    ...
]

I would like to group it by userId, sorted by user:

{
    {
        "userId": "1",
        "items": [
            {"itemId": "1", "score": 0.2, "rank": 1},
            {"itemId": "3", "score": 0.1, "rank": 2},
            ...
        ]
    },

    {
        "userId": "2",
        "items": [
            {"itemId": "12", "score": 0.6, "rank": 1},
            {"itemId": "21", "score": 0.2, "rank": 2}
        ]
    },
    ...
}

I tried to do it myself in Python, but I get an error that says: "TypeError: unhashable type: 'dict'".

Do you know how to do this?

Thanks!

2 answers:

Answer 0: (score: 0)

Accumulate the items in a defaultdict(list), keyed on the user ID:

from collections import defaultdict

data = [
    {"itemId": "1", "score": 0.2, "userId": "1", "rank": 1}, 
    {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2}, 
    {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1}, 
    {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2}, 
]

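# Accumulate each user's items under an integer key so the users sort numerically
output = defaultdict(list)
for dict_ in data:
    # pop() removes 'userId' from the record, so the entries of data are modified in place
    userId = dict_.pop('userId')
    output[int(userId)].append(dict_)

# Build the final list ordered by user ID, converting the key back to a string
new_data = [{'userId': str(k), 'items': output[k]} for k in sorted(output)]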
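
Note that `dict_.pop('userId')` above changes the dictionaries in `data` in place. If the original records are still needed afterwards, a non-mutating variant might look like the sketch below. The helper name `group_by_user` and the `sample` list are illustrative assumptions, not part of the answer above, and the numeric sort assumes every `userId` is a numeric string, as in the question.

```python
import json
from collections import defaultdict


def group_by_user(records):
    """Group records by 'userId' without modifying the input (hypothetical helper)."""
    grouped = defaultdict(list)
    for record in records:
        # Copy every field except 'userId' so the caller's dicts stay untouched.
        item = {key: value for key, value in record.items() if key != "userId"}
        grouped[record["userId"]].append(item)
    # Sort numerically so that "10" comes after "2" (assumes numeric-string IDs).
    return [{"userId": user_id, "items": grouped[user_id]}
            for user_id in sorted(grouped, key=int)]


sample = [
    {"itemId": "1", "score": 0.2, "userId": "1", "rank": 1},
    {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2},
    {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},
    {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2},
]

result = group_by_user(sample)
print(json.dumps(result, indent=4))
```

To work with files instead of an in-memory list, `json.load` and `json.dump` can read the input and write `result`, using whatever file names apply in your case.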

Answer 1: (score: 0)

You can load the data with pandas, then use groupby with sort. Then write it back to JSON in the form you want. See below:

data = '[{"itemId": "1", "score": 0.2, "userId": "1", "rank": 1}, {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},  {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2},  {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2}]'


import json
from collections import OrderedDict
from io import StringIO

import pandas as pd

# read the JSON string into pandas (newer pandas versions expect a file-like object)
df = pd.read_json(StringIO(data), dtype={"itemId": object, "score": object, "userId": object, "rank": int})
# group by user id; sort=True orders the groups by key
g = df.groupby('userId', sort=True)

mylist = []
for k, group in g:
    # create a temp dict holder
    temp_dict = OrderedDict()
    # populate the temp dict
    temp_dict['userId'] = k
    # select the item columns with a list (double brackets) and convert the rows to dicts
    temp_dict['items'] = group[['itemId', 'score', 'rank']].to_dict(orient='records')
    # add the temp dict to the list
    mylist.append(temp_dict)

# print as json (Python 3 print function)
print(json.dumps(mylist, indent=4))

This results in:

[
    {
        "userId": "1", 
        "items": [
            {
                "itemId": "1", 
                "score": 0.2, 
                "rank": 1
            }, 
            {
                "itemId": "3", 
                "score": 0.1, 
                "rank": 2
            }
        ]
    }, 
    {
        "userId": "2", 
        "items": [
            {
                "itemId": "12", 
                "score": 0.6000000000000001, 
                "rank": 1
            }, 
            {
                "itemId": "21", 
                "score": 0.2, 
                "rank": 2
            }
        ]
    }
]