Question

What is the difference between the length of a groupby object and the length of its indices attribute? I expected both statements to return the same number.

len(Fees.groupby(['InstituteCode','Code','ProgramType','Status','AcademicYear']))
8000

Why do I get a different number here?

len(Fees.groupby(['InstituteCode','Code','ProgramType','Status','AcademicYear']).indices)
7433

Does this mean I have only 7433 distinct records for the given list of columns?

Answer 1

This was because the "Code" column was null for 568 records, and groupby skips rows whose key is null. It became clear when I checked for null values using:

df.apply(lambda x: x.isnull().sum())
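To illustrate the effect, here is a minimal sketch on a made-up miniature of the Fees table (the `fees` frame, its values and the `Amount` column are hypothetical, not the asker's data). It assumes pandas 1.1 or later, where groupby accepts a `dropna` argument; with the default `dropna=True`, rows whose key contains a null are left out of the groups.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Fees table; all values are invented.
fees = pd.DataFrame({
    "InstituteCode": ["A", "A", "B", "B"],
    "Code": ["X", np.nan, "Y", np.nan],
    "Amount": [100, 200, 300, 400],
})

keys = ["InstituteCode", "Code"]

# Per-column null counts; equivalent to df.apply(lambda x: x.isnull().sum()).
null_counts = fees.isnull().sum()

# Default behaviour: rows with a null in any key column are dropped.
sizes_default = fees.groupby(keys).size()

# Assumes pandas >= 1.1: keep null keys as groups of their own.
sizes_with_nan = fees.groupby(keys, dropna=False).size()

print(null_counts)
print(len(sizes_default), len(sizes_with_nan))
```

If the null keys should count as groups, passing `dropna=False` (or filling the nulls with a placeholder before grouping) is one way to keep them.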
