I have a JSON file in this format:
[
{"itemId": "1", "score": 0.2, "userId": "1", "rank": 1},
{"itemId": "3", "score": 0.1, "userId": "1", "rank": 2},
{"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},
{"itemId": "21", "score": 0.2, "userId": "2", "rank": 2},
...
]
I want to group it by userId, like this:
[
    {
        "userId": "1",
        "items": [
            {"itemId": "1", "score": 0.2, "rank": 1},
            {"itemId": "3", "score": 0.1, "rank": 2},
            ...
        ]
    },
    {
        "userId": "2",
        "items": [
            {"itemId": "12", "score": 0.6, "rank": 1},
            {"itemId": "21", "score": 0.2, "rank": 2}
        ]
    },
    ...
]
I tried to do this myself in Python, but I get an error that says: "TypeError: unhashable type: 'dict'".
Do you know how to do this?
Thanks!
Answer 0 (score: 0)
Accumulate the items in a defaultdict(list), keyed by user ID:
from collections import defaultdict
data = [
{"itemId": "1", "score": 0.2, "userId": "1", "rank": 1},
{"itemId": "3", "score": 0.1, "userId": "1", "rank": 2},
{"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},
{"itemId": "21", "score": 0.2, "userId": "2", "rank": 2},
]
output = defaultdict(list)
for dict_ in data:
    userId = dict_.pop('userId')
    output[int(userId)].append(dict_)
new_data = [{'userId': str(k), 'items': output[k]} for k in sorted(output)]
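Note that pop() in the loop above removes 'userId' from the original dicts in data. If the input has to stay untouched, the same accumulation can be written as a small function that copies each record instead. This is only a sketch: the name group_by_user is made up for illustration, and it assumes every userId is a string of digits so that the groups can be sorted numerically. Keying the defaultdict by the userId string (rather than by a whole dict, which is one common way to hit "unhashable type: 'dict'") keeps the keys hashable.

```python
import json
from collections import defaultdict


def group_by_user(records):
    """Group flat records by their 'userId' without modifying the input."""
    grouped = defaultdict(list)
    for record in records:
        # Copy every field except 'userId' so the caller's dicts stay intact.
        item = {key: value for key, value in record.items() if key != "userId"}
        grouped[record["userId"]].append(item)
    # Sort numerically, assuming every userId is a string of digits.
    return [
        {"userId": user_id, "items": grouped[user_id]}
        for user_id in sorted(grouped, key=int)
    ]


records = json.loads(
    '[{"itemId": "1", "score": 0.2, "userId": "1", "rank": 1},'
    ' {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},'
    ' {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2}]'
)
result = group_by_user(records)
print(json.dumps(result, indent=4))
```

To work with a file instead of a string, json.load() and json.dump() on open file objects would take the place of json.loads() and json.dumps().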
Answer 1 (score: 0)
You can load the data with pandas, then use groupby with sort=True. Then write it back to JSON in the shape you want. See below:
data = '[{"itemId": "1", "score": 0.2, "userId": "1", "rank": 1}, {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1}, {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2}, {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2}]'
import collections
import json
from io import StringIO

import pandas as pd

# read the JSON string into pandas, keeping the IDs as strings
df = pd.read_json(StringIO(data), dtype={"itemId": object, "score": object, "userId": object, "rank": int})
# group by user id; sort=True orders the groups by key
g = df.groupby('userId', sort=True)
mylist = []
for k, group in g:
    # create a temp dict holder
    temp_dict = collections.OrderedDict()
    # populate the temp dict
    temp_dict['userId'] = k
    temp_dict['items'] = group[['itemId', 'score', 'rank']].to_dict(orient='records')
    # add the temp dict to the list
    mylist.append(temp_dict)
# print as json
print(json.dumps(mylist, indent=4))
This results in:
[
    {
        "userId": "1",
        "items": [
            {
                "itemId": "1",
                "score": 0.2,
                "rank": 1
            },
            {
                "itemId": "3",
                "score": 0.1,
                "rank": 2
            }
        ]
    },
    {
        "userId": "2",
        "items": [
            {
                "itemId": "12",
                "score": 0.6000000000000001,
                "rank": 1
            },
            {
                "itemId": "21",
                "score": 0.2,
                "rank": 2
            }
        ]
    }
]
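If pandas is not available, the same sort-then-group idea can be sketched with itertools.groupby from the standard library. This is a different technique from the pandas groupby above, not a drop-in equivalent: itertools.groupby only merges adjacent records, so the data has to be sorted by the grouping key first. The sketch assumes every userId is a string of digits.

```python
from itertools import groupby
from operator import itemgetter

data = [
    {"itemId": "1", "score": 0.2, "userId": "1", "rank": 1},
    {"itemId": "12", "score": 0.6, "userId": "2", "rank": 1},
    {"itemId": "3", "score": 0.1, "userId": "1", "rank": 2},
    {"itemId": "21", "score": 0.2, "userId": "2", "rank": 2},
]

# Sort by userId first (numerically, assuming digit-only IDs), because
# itertools.groupby only groups records that sit next to each other.
# sorted() is stable, so each user's items keep their original order.
ordered = sorted(data, key=lambda record: int(record["userId"]))

grouped = [
    {
        "userId": user_id,
        # Drop 'userId' from each item; it now lives on the enclosing group.
        "items": [
            {key: value for key, value in record.items() if key != "userId"}
            for record in rows
        ],
    }
    for user_id, rows in groupby(ordered, key=itemgetter("userId"))
]
```

Passing grouped to json.dumps(grouped, indent=4) would then serialize it the same way as in the pandas version.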