I have two lists of dictionaries, one holding clicks data and one holding impressions data. For example:
[{'offerId':'1650','position':'15','clicksCount':21},{'offerId': '2323','position':'12','clicksCount':14},{'offerId':'2323', 'position':'14','clicksCount':8},{'offerId':'1621','position': '10','clicksCount':7}] ...
[{'offerId':'3207','position':'9','impressionsCount':866}, {'offerId':'1650','position':'6','impressionsCount':896}, {'offerId':'3207','position':'1','impressionsCount':909}, {'offerId':'2323','position':'12'}] ...
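I need to merge them on offerId and position, so that I get the result (clicks and impressions) for each position of each offer.
I tried this code, but it returns the wrong result: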
from collections import defaultdict
from itertools import groupby

d = defaultdict(dict)
for l in (clicks_aggregated_data, impressions_aggregated_data):
    for elem in l:
        d[elem['offerId']].update(elem)
        d[elem['position']].update(elem)
combined_data = list(d.values())
for model, group in groupby(combined_data, key=lambda x:x['offerId']):
    print(list(group))
Can anyone help me get a table-like result (see screenshot)?
Answer 0 (score: 1)
You can try building a lookup dictionary from impressions_aggregated_data and then merging it into the clicks data.
For example:
from pprint import pprint

# Key each impressions row by "offerId_position"; rows without a count are skipped
impressions_aggregated_data_lookup = {"{}_{}".format(i["offerId"], i["position"]): i["impressionsCount"] for i in impressions_aggregated_data if "impressionsCount" in i}
for i in clicks_aggregated_data:
    key = "{}_{}".format(i["offerId"], i["position"])
    # Copy the impressions count onto the matching clicks row, if there is one
    if key in impressions_aggregated_data_lookup:
        i["impressionsCount"] = impressions_aggregated_data_lookup[key]
pprint(clicks_aggregated_data)
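Note that the lookup above only annotates rows that already exist in clicks_aggregated_data, so offer/position pairs that appear only in the impressions list are left out. If every pair is needed, one option is to key a defaultdict on the (offerId, position) tuple. The question's attempt keyed on offerId and on position separately, which mixes unrelated rows together. The following is a minimal sketch rather than a definitive implementation: the helper name merge_by_offer_and_position is hypothetical, and defaulting a missing count to 0 is an assumption about the desired output.

```python
from collections import defaultdict

clicks_aggregated_data = [
    {'offerId': '1650', 'position': '15', 'clicksCount': 21},
    {'offerId': '2323', 'position': '12', 'clicksCount': 14},
    {'offerId': '2323', 'position': '14', 'clicksCount': 8},
    {'offerId': '1621', 'position': '10', 'clicksCount': 7},
]
impressions_aggregated_data = [
    {'offerId': '3207', 'position': '9', 'impressionsCount': 866},
    {'offerId': '1650', 'position': '6', 'impressionsCount': 896},
    {'offerId': '3207', 'position': '1', 'impressionsCount': 909},
    {'offerId': '2323', 'position': '12'},
]


def merge_by_offer_and_position(*sources):
    """Hypothetical helper: merge dicts that share the same (offerId, position)."""
    merged = defaultdict(dict)
    for source in sources:
        for elem in source:
            # A single composite key, so rows only merge when BOTH fields match
            merged[(elem['offerId'], elem['position'])].update(elem)
    # Assumption: a count missing from both lists is reported as 0
    rows = [{'clicksCount': 0, 'impressionsCount': 0, **row} for row in merged.values()]
    # Sort by offerId, then by position as a number (positions are stored as strings)
    return sorted(rows, key=lambda r: (r['offerId'], int(r['position'])))


combined_data = merge_by_offer_and_position(
    clicks_aggregated_data, impressions_aggregated_data
)
for row in combined_data:
    print(row)
```

Each printed row then carries offerId, position, clicksCount and impressionsCount, which maps directly onto a table with one line per offer position.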
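Answer 1 (score: 0)
I hope this is what you are trying to do. Create pandas DataFrames from the two lists of dictionaries, then sum clicks and impressions for each offerId and position. See the mock-up below. Let me know if it works.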
import pandas as pd

d1 = [{'offerId': '1650', 'position': '15', 'clicksCount': 21},
      {'offerId': '2323', 'position': '12', 'clicksCount': 14},
      {'offerId': '2323', 'position': '14', 'clicksCount': 8},
      {'offerId': '1621', 'position': '10', 'clicksCount': 7}]
d2 = [{'offerId': '3207', 'position': '9', 'impressionsCount': 866},
      {'offerId': '1650', 'position': '6', 'impressionsCount': 896},
      {'offerId': '3207', 'position': '1', 'impressionsCount': 909},
      {'offerId': '2323', 'position': '12'}]
# Stack both frames once, then sum the counts per (offerId, position) pair
combdf = pd.concat([pd.DataFrame(d1), pd.DataFrame(d2)], sort=False)
combdf.groupby(['offerId', 'position'])[["clicksCount", "impressionsCount"]].sum().reset_index()
This gives the following result:
  offerId position  clicksCount  impressionsCount
0    1621       10          7.0               0.0
1    1650       15         21.0               0.0
2    1650        6          0.0             896.0
3    2323       12         14.0               0.0
4    2323       14          8.0               0.0
5    3207        1          0.0             909.0
6    3207        9          0.0             866.0
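As an alternative to stacking and summing, the two frames can be joined directly on both key columns. This is only a sketch under one assumption: each (offerId, position) pair appears at most once per list, as in the sample data, because duplicate pairs would multiply rows in a merge. The variable names clicks_df, impressions_df and merged are illustrative.

```python
import pandas as pd

d1 = [{'offerId': '1650', 'position': '15', 'clicksCount': 21},
      {'offerId': '2323', 'position': '12', 'clicksCount': 14},
      {'offerId': '2323', 'position': '14', 'clicksCount': 8},
      {'offerId': '1621', 'position': '10', 'clicksCount': 7}]
d2 = [{'offerId': '3207', 'position': '9', 'impressionsCount': 866},
      {'offerId': '1650', 'position': '6', 'impressionsCount': 896},
      {'offerId': '3207', 'position': '1', 'impressionsCount': 909},
      {'offerId': '2323', 'position': '12'}]

clicks_df = pd.DataFrame(d1)
impressions_df = pd.DataFrame(d2)

# An outer join keeps pairs that appear in only one of the two lists
merged = clicks_df.merge(impressions_df, on=['offerId', 'position'], how='outer')

# Missing counts become NaN after the join; treat them as 0 and restore integers
count_cols = ['clicksCount', 'impressionsCount']
merged[count_cols] = merged[count_cols].fillna(0).astype(int)

# offerId and position are strings here, so this sort is lexicographic
merged = merged.sort_values(['offerId', 'position']).reset_index(drop=True)
print(merged)
```

Compared with the groupby version, the join keeps the counts as integers and makes the "one row per offer position" intent explicit, at the cost of relying on the uniqueness assumption stated above.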