Question

I have a DataFrame (DF) like this:

Date Bought | Fruit
2018-01       Apple
2018-02       Orange
2018-02       Orange
2018-02       Lemon

I want to group the data by "Date Bought" and "Fruit" and count the occurrences.

Expected result:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
2018-02       Lemon     1

What I get:

Date Bought | Fruit | Count
2018-01       Apple     1
2018-02       Orange    2
              Lemon     1

Code used:

Initial attempt:
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count')

#2
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index()
ERROR: Cannot insert Fruit, already exists

#3
df.groupby(['Date Bought','Fruit'])['Fruit'].agg('count').reset_index(inplace=True)
ERROR: TypeError: Cannot reset_index inplace on a Series to create a DataFrame

The documentation says that groupby returns a "groupby object" rather than a standard DF. How can I group the data as described above and keep the DF format?

Answer 1

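The problem here is that resetting the index would give you 2 columns with the same name: the grouped result is a Series named Fruit, and Fruit is also one of its index levels. Because you are working with a Series, you can set the name parameter in Series.reset_index: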

df1 = (df.groupby(['Date Bought','Fruit'], sort=False)['Fruit']
         .agg('count')
         .reset_index(name='Count'))
print(df1)
  Date Bought   Fruit  Count
0     2018-01   Apple      1
1     2018-02  Orange      2
2     2018-02   Lemon      1
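
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question; the variable names counts and df2 are mine, not from the original post. It also shows that the blank cell in the "What I get" output is only how pandas displays a repeated MultiIndex label (the value 2018-02 is still there), and it includes one alternative using size(), which counts rows per group and does not reuse an existing column name.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date Bought": ["2018-01", "2018-02", "2018-02", "2018-02"],
    "Fruit": ["Apple", "Orange", "Orange", "Lemon"],
})

# The grouped count is a Series indexed by a two-level MultiIndex.
# When printed, repeated outer labels are left blank for readability.
counts = df.groupby(["Date Bought", "Fruit"], sort=False)["Fruit"].agg("count")
print(counts)

# The Series is named "Fruit", which collides with the "Fruit" index level,
# so a plain reset_index() is expected to fail.
try:
    counts.reset_index()
except ValueError as exc:
    print("reset_index() failed:", exc)

# Passing name= renames the value column and avoids the collision.
df1 = counts.reset_index(name="Count")
print(df1)

# Alternative sketch: size() counts rows per group and returns an unnamed
# Series, so the only name to choose is the one for the new count column.
df2 = (df.groupby(["Date Bought", "Fruit"], sort=False)
         .size()
         .reset_index(name="Count"))
print(df2)
```

Note that count skips missing values in the selected column while size counts every row, so the two can differ on data with NaN; on the sample above they give the same result.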
