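I want to fill in 0 for the cluster names that do not appear in the data. In the expected output, I added a 0 in the last row because no matching results were found in the DataFrame. Input: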
Here is what I have tried so far:
# Group the PII values into clusters as required and sum the counts per cluster.
# The output of this code is given above.
d_inv = {x: k for k, v in dict1.items() for x in v}
df = df['PII Count'].groupby(df['PII'].map(d_inv)).sum() \
    .rename_axis('Cluster names') \
    .reset_index(name='Total count')
print(df)
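To see why a cluster can go missing from the result, here is a minimal sketch with made-up data. The contents of `dict1` and `df` below are assumptions for illustration only (the real input is not shown here); only the names `dict1`, `d_inv`, `PII` and `PII Count` come from the code above.

```python
import pandas as pd

# Hypothetical cluster definition: cluster name -> PII types in that cluster.
dict1 = {
    "Personal Info": ["Name", "Email"],
    "Health Info": ["Blood Type"],
    "Network Info": ["IP Address"],
}

# Hypothetical data: note that no row has a PII type from "Health Info".
df = pd.DataFrame({
    "PII": ["Name", "Email", "IP Address", "Name"],
    "PII Count": [3, 2, 4, 1],
})

# Invert the mapping: PII type -> cluster name.
d_inv = {x: k for k, v in dict1.items() for x in v}

# groupby only creates groups for keys that actually occur in the data,
# so "Health Info" is absent from the result instead of showing up as 0.
totals = df["PII Count"].groupby(df["PII"].map(d_inv)).sum()
print(totals)  # Network Info -> 4, Personal Info -> 6
```

The groupby result only knows about clusters it has seen, which is why the zero rows have to be added afterwards.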
Answer 0 (score: 1)
If the order does not matter, use reindex with the keys of dict1:
(df['PII Count'].groupby(df['PII'].map(d_inv)).sum().rename_axis('Cluster names')
    .reindex(dict1.keys(), fill_value=0)
    .reset_index(name='Total count'))
   Cluster names  Total count
0  Personal Info          270
1    Health Info            0
2   Network Info           94
3    Others Info           59
4   Finance Info            1
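As a minimal sketch of what `reindex` with `fill_value` does on its own, consider a small Series. The values and the `wanted` list here are made up for illustration and are not the data from the question.

```python
import pandas as pd

# Hypothetical per-cluster totals, as a groupby(...).sum() might return them.
totals = pd.Series({"Network Info": 4, "Personal Info": 6})

# Hypothetical full list of cluster names, in the order we want them.
wanted = ["Personal Info", "Health Info", "Network Info"]

# Labels missing from the Series are added and filled with 0;
# existing labels keep their values, and the result follows `wanted`.
filled = totals.reindex(wanted, fill_value=0)
print(filled)

# Without fill_value, the missing label becomes NaN instead,
# which also turns the integer totals into floats.
with_nan = totals.reindex(wanted)
print(with_nan)
```

This is why the result above follows the key order of dict1 rather than the order of the grouped output.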
If the order matters:
m = df['PII'].map(d_inv)
out = df['PII Count'].groupby(m).sum()
out = (out.reindex(out.index.union(set(dict1.keys()).difference(m), sort=False),
                   fill_value=0)
          .rename_axis('Cluster names')
          .reset_index(name='Total count'))
print(out)
   Cluster names  Total count
0   Finance Info            1
1   Network Info           94
2    Others Info           59
3  Personal Info          270
4    Health Info            0
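One thing to keep in mind: a Python set has no guaranteed iteration order, so if several clusters are missing, the order in which they get appended may vary. A possible variation (not the code above, just a sketch on made-up data) builds the list of missing clusters in dict1 order instead. The contents of `dict1` and `df` here are assumptions for illustration.

```python
import pandas as pd

# Hypothetical clusters; "Health Info" and "Finance Info" have no rows in df.
dict1 = {
    "Personal Info": ["Name", "Email"],
    "Health Info": ["Blood Type"],
    "Network Info": ["IP Address"],
    "Finance Info": ["Card Number"],
}
df = pd.DataFrame({
    "PII": ["Name", "Email", "IP Address", "Name"],
    "PII Count": [3, 2, 4, 1],
})

d_inv = {x: k for k, v in dict1.items() for x in v}
out = df["PII Count"].groupby(df["PII"].map(d_inv)).sum()

# Clusters that never occurred, kept in the order they appear in dict1.
missing = [k for k in dict1 if k not in out.index]

# Keep the grouped rows first, then append the missing clusters with 0.
result = (out.reindex(list(out.index) + missing, fill_value=0)
             .rename_axis("Cluster names")
             .reset_index(name="Total count"))
print(result)
```

With this variation the existing clusters keep their grouped order and the zero rows always come last, in the same order on every run.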