I need help merging certain dictionaries based on the objects inside a list of dictionaries. Is this possible?
My data:
mongo_data = [{
    'url': 'https://goodreads.com/',
    'variables': [{'key': 'Harry Potter', 'value': '10.0'},
                  {'key': 'Discovery of Witches', 'value': '8.5'},],
    'vendor': 'Fantasy'
},{
    'url': 'https://goodreads.com/',
    'variables': [{'key': 'Hunger Games', 'value': '10.0'},
                  {'key': 'Maze Runner', 'value': '5.5'},],
    'vendor': 'Dystopia'
},{
    'url': 'https://kindle.com/',
    'variables': [{'key': 'Twilight', 'value': '5.9'},
                  {'key': 'Lord of the Rings', 'value': '9.0'},],
    'vendor': 'Fantasy'
},{
    'url': 'https://kindle.com/',
    'variables': [{'key': 'The Handmaids Tale', 'value': '10.0'},
                  {'key': 'Divergent', 'value': '9.0'},],
    'vendor': 'Fantasy'
}]
My code:
I use groupby to group items that share the same URL.
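from itertools import groupby, chain
import json

searches = []
# groupby only groups consecutive items, so mongo_data must already be ordered by URL.
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {}
    search["url"] = key
    # One result entry per record; records with the same vendor are not merged here.
    search["results"] = [{"genre": result["vendor"], "data": result["variables"]} for result in group]
    searches.append(search)
print(json.dumps(searches))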
My output:
[
  {
    "url": "https://goodreads.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Harry Potter",
            "value": "10.0"
          },
          {
            "key": "Discovery of Witches",
            "value": "8.5"
          }
        ]
      },
      {
        "genre": "Dystopia",
        "data": [
          {
            "key": "Hunger Games",
            "value": "10.0"
          },
          {
            "key": "Maze Runner",
            "value": "5.5"
          }
        ]
      }
    ]
  },
  {
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          }
        ]
      },
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]
As you can see, under https://kindle.com/ I have "genre": "Fantasy" twice. Instead of printing it twice, can I merge the two entries so that the genre is not duplicated?
So the result I expect is:
{
  "url": "https://kindle.com/",
  "results": [
    {
      "genre": "Fantasy",
      "data": [
        {
          "key": "Twilight",
          "value": "5.9"
        },
        {
          "key": "Lord of the Rings",
          "value": "9.0"
        },
        {
          "key": "The Handmaids Tale",
          "value": "10.0"
        },
        {
          "key": "Divergent",
          "value": "9.0"
        }
      ]
    }
  ]
}
Is this possible?
Answer 0 (score: 1)
You need a second groupby to group the results by vendor.
For example:
from itertools import groupby

searches = []
for key, group in groupby(mongo_data, key=lambda chunk: chunk['url']):
    search = {"url": key, "results": []}
    for vendor, group2 in groupby(group, key=lambda chunk2: chunk2['vendor']):
        result = {
            "genre": vendor,
            # Flatten the "variables" lists of every record in this vendor group.
            "data": [variable
                     for result2 in group2
                     for variable in result2["variables"]],
        }
        search["results"].append(result)
    searches.append(search)
The list comprehension flattens result2["variables"] so that "data" is a single list rather than a list of lists.
The result is:
[
  {
    "url": "https://goodreads.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Harry Potter",
            "value": "10.0"
          },
          {
            "key": "Discovery of Witches",
            "value": "8.5"
          }
        ]
      },
      {
        "genre": "Dystopia",
        "data": [
          {
            "key": "Hunger Games",
            "value": "10.0"
          },
          {
            "key": "Maze Runner",
            "value": "5.5"
          }
        ]
      }
    ]
  },
  {
    "url": "https://kindle.com/",
    "results": [
      {
        "genre": "Fantasy",
        "data": [
          {
            "key": "Twilight",
            "value": "5.9"
          },
          {
            "key": "Lord of the Rings",
            "value": "9.0"
          },
          {
            "key": "The Handmaids Tale",
            "value": "10.0"
          },
          {
            "key": "Divergent",
            "value": "9.0"
          }
        ]
      }
    ]
  }
]
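Note that groupby only merges adjacent items with equal keys, so nesting two groupby calls relies on the input already being ordered by URL and then by vendor. The following is a minimal sketch for input that may arrive in any order. The sample data and the function name group_by_url_and_genre are illustrative assumptions, not part of the original post.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample in the same shape as mongo_data, deliberately unsorted.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]


def group_by_url_and_genre(records):
    """Group records by URL, then by vendor, merging their variables."""
    # Sort by both grouping keys first, because groupby only merges neighbours.
    ordered = sorted(records, key=itemgetter("url", "vendor"))
    searches = []
    for url, url_group in groupby(ordered, key=itemgetter("url")):
        results = []
        for vendor, vendor_group in groupby(url_group, key=itemgetter("vendor")):
            # Flatten every record's "variables" list into one list.
            data = [v for record in vendor_group for v in record["variables"]]
            results.append({"genre": vendor, "data": data})
        searches.append({"url": url, "results": results})
    return searches


grouped = group_by_url_and_genre(sample)
print(grouped)
```

Sorting changes the order of the URLs in the output (alphabetical rather than first-seen), which may or may not matter for your use case.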
Answer 1 (score: 0)
You can run this code after your for loop to do what you described:
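from collections import defaultdict

for item in searches:
    results = item['results']
    _res = defaultdict(list)
    for r in results:
        # extend (not append) keeps "data" a flat list instead of a list of lists.
        _res[r['genre']].extend(r['data'])
    # Replace the per-record results with one merged entry per genre.
    item['results'] = [{
        'genre': k,
        'data': _res[k]
    } for k in _res.keys()]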
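The same dictionary-based idea can also be applied directly to the raw records, skipping the intermediate searches list. This is only a sketch under that assumption. The sample data and the name merge_by_url_and_genre are hypothetical. Because it accumulates into dictionaries, it does not depend on the input order.

```python
from collections import defaultdict

# Hypothetical sample in the same shape as mongo_data, deliberately unsorted.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]


def merge_by_url_and_genre(records):
    """Merge records in one pass: url -> genre -> flat list of variables."""
    merged = defaultdict(lambda: defaultdict(list))
    for record in records:
        merged[record["url"]][record["vendor"]].extend(record["variables"])
    # Dicts preserve insertion order (Python 3.7+), so URLs and genres
    # appear in first-seen order.
    return [
        {"url": url,
         "results": [{"genre": genre, "data": data} for genre, data in genres.items()]}
        for url, genres in merged.items()
    ]


merged_searches = merge_by_url_and_genre(sample)
print(merged_searches)
```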
Answer 2 (score: 0)
If you want a "one-liner" (?), try the following:
{"url": "https://kindle.com/", "results": [{"genre": k, "data": v} for k, v in {g: [y for x in mongo_data if x['url'] == "https://kindle.com/" and x['vendor'] == g for y in x['variables']] for g in set(x['vendor'] for x in mongo_data if x['url'] == "https://kindle.com/")}.items()]}
It produces:
{
    'url': 'https://kindle.com/',
    'results': [
        {
            'genre': 'Fantasy',
            'data': [
                {'key': 'Twilight', 'value': '5.9'},
                {'key': 'Lord of the Rings', 'value': '9.0'},
                {'key': 'The Handmaids Tale', 'value': '10.0'},
                {'key': 'Divergent', 'value': '9.0'}
            ]
        }
    ]
}
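A one-liner like this is hard to read and to adapt to other URLs. As a rough sketch, the same idea can be spelled out as a small helper that filters by URL and flattens with itertools.chain.from_iterable. The helper name results_for_url and the sample data are assumptions made for illustration.

```python
from itertools import chain

# Hypothetical sample in the same shape as mongo_data.
sample = [
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Twilight", "value": "5.9"}]},
    {"url": "https://goodreads.com/", "vendor": "Dystopia",
     "variables": [{"key": "Hunger Games", "value": "10.0"}]},
    {"url": "https://kindle.com/", "vendor": "Fantasy",
     "variables": [{"key": "Divergent", "value": "9.0"}]},
]


def results_for_url(records, url):
    """Build the merged entry for a single URL."""
    rows = [r for r in records if r["url"] == url]
    # dict.fromkeys removes duplicate vendors but, unlike set, keeps first-seen order.
    genres = dict.fromkeys(r["vendor"] for r in rows)
    return {
        "url": url,
        "results": [
            {"genre": genre,
             # chain.from_iterable flattens the per-record "variables" lists.
             "data": list(chain.from_iterable(
                 r["variables"] for r in rows if r["vendor"] == genre))}
            for genre in genres
        ],
    }


kindle = results_for_url(sample, "https://kindle.com/")
print(kindle)
```

A URL that does not occur in the data simply yields an empty "results" list.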