I have a Python list like this:
[{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
and I want to generate a list like this from it:
[
    {'month': 1, 'total': 0},
    {'month': 2, 'total': 0},
    ...
    {'month': 8, 'total': 31600},
    {'month': 9, 'total': 2000},
    ...
    {'month': 12, 'total': 0}
]
To do this, I am iterating over range(1, 13).
How can I check whether i exists as a month in the list and get the matching dictionary?
Answer 0 (score: 6)
You can convert the list of dictionaries into a direct month: total mapping:
monthly_totals = {item['month']: item['total'] for item in data_list}
and use a simple list comprehension with dict.get to handle the missing values:
new_list = [{'month': i, 'total': monthly_totals.get(i, 0)} for i in range(1, 13)]
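For reference, here is the same idea as a small runnable sketch. The function name fill_months and the default_total parameter are my own illustrative choices, not part of the answer above, and the sketch assumes every input dictionary has both a 'month' and a 'total' key.

```python
def fill_months(data_list, default_total=0):
    """Return 12 entries, one per month, filling missing months with default_total.

    fill_months and default_total are hypothetical names used for illustration.
    """
    # Build a month -> total lookup so each month is found in constant time.
    monthly_totals = {item['month']: item['total'] for item in data_list}
    # dict.get returns default_total for any month absent from the lookup.
    return [
        {'month': m, 'total': monthly_totals.get(m, default_total)}
        for m in range(1, 13)
    ]


data_list = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_list = fill_months(data_list)
print(new_list[7])  # {'month': 8, 'total': 31600.0}
print(new_list[0])  # {'month': 1, 'total': 0}
```

Note that if the same month appears more than once in the input, the last entry wins, because later keys overwrite earlier ones in the dict comprehension.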
Answer 1 (score: 1)
Create a new list filled with default values, then update the required entries from the original list:
>>> from pprint import pprint
>>> lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
>>> new_lst = [dict(month=i, total=0) for i in range(1,13)]
>>> for d in lst:
...     new_lst[d['month']-1] = d
...
>>> pprint(new_lst)
[{'month': 1, 'total': 0},
 {'month': 2, 'total': 0},
 {'month': 3, 'total': 0},
 {'month': 4, 'total': 0},
 {'month': 5, 'total': 0},
 {'month': 6, 'total': 0},
 {'month': 7, 'total': 0},
 {'month': 8, 'total': 31600.0},
 {'month': 9, 'total': 2000.0},
 {'month': 10, 'total': 0},
 {'month': 11, 'total': 0},
 {'month': 12, 'total': 0}]
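One caveat with the index-based update: it trusts that every month is between 1 and 12. A month of 13 raises IndexError, and a month of 0 silently overwrites the last slot, because index -1 refers to the end of the list. A possible guarded variant is sketched below; the function name fill_by_index is hypothetical, and raising ValueError is just one way to handle bad input.

```python
def fill_by_index(lst):
    """Place each entry at index month - 1, rejecting out-of-range months.

    fill_by_index is a hypothetical helper name used for illustration.
    """
    new_lst = [dict(month=i, total=0) for i in range(1, 13)]
    for d in lst:
        month = d['month']
        # Without this check, month 0 would overwrite December (index -1).
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range: {month!r}")
        new_lst[month - 1] = d
    return new_lst


lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
filled = fill_by_index(lst)
print(filled[7])  # {'month': 8, 'total': 31600.0}
```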
Answer 2 (score: 0)
exist_lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]
new_lst = []
for i in range(1,13):
    found = False
    # Look for an existing entry for month i
    for dict_item in exist_lst:
        if dict_item['month'] == i:
            new_lst.append(dict_item)
            found = True
    # No entry for this month: append a default one
    if not found:
        new_lst.append({'month': i, 'total': 0}) # default_dict_item
print(new_lst)
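The nested loop can also be written with the built-in next() and a generator expression, which maps closely onto the original question of checking whether i exists as a month and getting its dictionary. This is only a sketch: find_month is a hypothetical helper name, and it returns the first match only, so duplicate months in the input are ignored rather than appended twice.

```python
exist_lst = [{'month': 8, 'total': 31600.0}, {'month': 9, 'total': 2000.0}]


def find_month(items, month):
    """Return the first dict whose 'month' equals month, or None if absent.

    find_month is a hypothetical helper name used for illustration.
    """
    return next((d for d in items if d['month'] == month), None)


new_lst = []
for i in range(1, 13):
    match = find_month(exist_lst, i)
    # Fall back to a default entry when no dictionary exists for month i.
    new_lst.append(match if match is not None else {'month': i, 'total': 0})

print(find_month(exist_lst, 8))  # {'month': 8, 'total': 31600.0}
print(find_month(exist_lst, 1))  # None
```

Like the nested loop, this scans the list once per month, so it is fine for 12 months but the dictionary lookup approach scales better for longer inputs.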