Pandas - GroupBy 2 columns - unable to reset index

Time: 2020-03-19 12:03:12

Tags: python python-3.x pandas pandas-groupby

I have a DataFrame that looks like this:

Date Bought | Fruit
2018-01       Apple
2018-02       Orange
2018-02       Orange
2018-02       Lemon

I want to group the data by "Date Bought" and "Fruit" and count the number of occurrences of each combination.

Expected result:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
2018-02       Lemon     1

What I get:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
              Lemon     1

Code used:

Initial attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count')

#2
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index()
ERROR: Cannot insert Fruit, already exists

#3
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index(inplace=True)
ERROR: TypeError: Cannot reset_index inplace on a Series to create a DataFrame

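The documentation says that groupby returns a "groupby object" rather than a standard DataFrame. How can I group the data as described above and still end up with a DataFrame?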

1 Answer:

Answer 0: (score: 7)

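The problem here is that resetting the index would give you 2 columns with the same name: the "Fruit" index level and the "Fruit" values being counted. Because the result of the aggregation is a Series, you can set the name parameter in Series.reset_index to rename the values column: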

df1 = (df.groupby(['Date Bought','Fruit'], sort=False)['Fruit']
         .agg('count')
         .reset_index(name='Count'))
print (df1)
  Date Bought   Fruit  Count
0     2018-01   Apple      1
1     2018-02  Orange      2
2     2018-02   Lemon      1
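
For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample frame from the question and uses size() instead of selecting 'Fruit' and calling agg('count'). That swap is my own choice rather than part of the answer above, and it differs slightly in meaning: size() counts every row in a group, while count skips missing values.

```python
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame({
    "Date Bought": ["2018-01", "2018-02", "2018-02", "2018-02"],
    "Fruit": ["Apple", "Orange", "Orange", "Lemon"],
})

# size() returns a Series indexed by the two group keys.
# reset_index(name=...) turns both index levels into columns and
# names the values column, so no duplicate "Fruit" column is created.
counts = (df.groupby(["Date Bought", "Fruit"], sort=False)
            .size()
            .reset_index(name="Count"))
print(counts)
```

Passing sort=False keeps the groups in order of first appearance, which matches the expected result in the question; without it, the groups come back sorted by key.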