Question

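I have a single-column DataFrame whose index holds integers stored as strings and contains duplicate values. The values are integers, and I want a DataFrame with no duplicates in its index, where each value is the sum of all the values that originally shared that index label. Here is a sample of the data I'm working with: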

I can do it like this, but it doesn't seem like very good pandas syntax:

new_index = set(verts.index)
new_vals = [verts[x].sum() for x in new_index]
new_df = pd.DataFrame({'Counts': new_vals}, index=new_index)
new_df
   Counts
1       3
0      42
3      88
2      37
5      30
4      51
6       2

Is there a more direct way? Thanks.

Answer 1

Try resetting your index and then using groupby:

import pandas as pd

verts = pd.Series([54, 34, 33, 28, 23, 22, 15, 15, 15, 9, 2, 1, 1, 1],
                  index=["3", "3", "0", "4", "4", "2", "2", "5", "5", "0", "1", "6", "1", "6"])

>>> verts.reset_index().groupby('index').sum()
        0
index    
0      42
1       3
2      37
3      88
4      51
5      30
6       2

Or pass level=0 to group on the index directly:

verts.groupby(level=0).sum()
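
For reference, here is a minimal sketch of how the level=0 version could produce the one-column frame from the question. It assumes verts is the Series defined above, and it borrows the 'Counts' column name from the question's own code; if verts is already a DataFrame, the groupby call alone should be enough and the to_frame step can be dropped.

```python
import pandas as pd

# Sample data from the answer above: string labels with duplicates.
verts = pd.Series(
    [54, 34, 33, 28, 23, 22, 15, 15, 15, 9, 2, 1, 1, 1],
    index=["3", "3", "0", "4", "4", "2", "2", "5", "5", "0", "1", "6", "1", "6"],
)

# Group on the (only) index level and sum the values within each label.
summed = verts.groupby(level=0).sum()

# Wrap the result in a one-column DataFrame; 'Counts' is the name
# used in the question, not something pandas requires.
new_df = summed.to_frame("Counts")
print(new_df)
```

By default groupby sorts the group keys, so the labels come back in sorted order rather than in the arbitrary order a set gives.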
