Unable to group values in a pandas column by month

Time: 2019-06-22 10:41:46

Tags: python pandas pandas-groupby

I want to count the number of entries in a time log, grouped by month. I have the following pandas column:

print(df['date_unconditional'][:5])

0    2018-10-15T07:00:00
1    2018-06-12T07:00:00
2    2018-08-28T07:00:00
3    2018-08-29T07:00:00
4    2018-10-29T07:00:00
Name: date_unconditional, dtype: object

Then I convert it to datetime format:

# Parse the strings first (the .dt accessor only works on datetime columns), then drop the time of day
df['date_unconditional'] = pd.to_datetime(df['date_unconditional']).dt.normalize()
print(df['date_unconditional'][:5])


0   2018-10-15
1   2018-06-12
2   2018-08-28
3   2018-08-29
4   2018-10-29
Name: date_unconditional, dtype: datetime64[ns]

Then I try to count them, but I keep getting an error:

df['date_unconditional'] = pd.to_datetime(df['date_unconditional'], errors='coerce')
print(df['date_unconditional'].groupby(pd.Grouper(freq='M')).count())

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'

The column's format is not RangeIndex. I have tried changing it in other ways, but this error keeps coming up.

1 Answer:

Answer 0: (score: 0)

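Use the key parameter of Grouper. Without key, Grouper(freq=...) groups on the index, which here is the default RangeIndex rather than the datetime column, hence the TypeError.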

df['date_unconditional'] = pd.to_datetime(df['date_unconditional'], errors='coerce')
print (df.groupby(pd.Grouper(freq='M',key='date_unconditional'))['date_unconditional'].count())
2018-06-30    1
2018-07-31    0
2018-08-31    2
2018-09-30    0
2018-10-31    2
Freq: M, Name: date_unconditional, dtype: int64
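
As a self-contained sketch of the point above, the snippet below rebuilds the five sample rows from the question, shows that grouping the Series without key fails because the index is a RangeIndex, and then groups with key. It passes pd.offsets.MonthEnd() instead of the 'M' string, on the assumption that you may be running a newer pandas release where that alias is deprecated in favor of 'ME'; the variable names month_end, error_message and counts are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "date_unconditional": [
        "2018-10-15T07:00:00",
        "2018-06-12T07:00:00",
        "2018-08-28T07:00:00",
        "2018-08-29T07:00:00",
        "2018-10-29T07:00:00",
    ]
})
df["date_unconditional"] = pd.to_datetime(df["date_unconditional"], errors="coerce")

# Month-end offset object; used here in place of the 'M' alias (assumption:
# avoids alias differences between pandas versions).
month_end = pd.offsets.MonthEnd()

# Without key, Grouper looks at the index (a RangeIndex), so this raises.
try:
    df["date_unconditional"].groupby(pd.Grouper(freq=month_end)).count()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With key, Grouper bins on the datetime column instead of the index.
counts = (
    df.groupby(pd.Grouper(key="date_unconditional", freq=month_end))
      ["date_unconditional"]
      .count()
)

print(error_message)
print(counts)
```

The second print should give the same monthly counts as the output shown above, including the empty months of July and September.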

Or create a DatetimeIndex with DataFrame.set_index, after which you can use GroupBy.size. The difference between the two is that count excludes missing values, while size does not.

df['date_unconditional'] = pd.to_datetime(df['date_unconditional'], errors='coerce')
print (df.set_index('date_unconditional').groupby(pd.Grouper(freq='M')).size())
2018-06-30    1
2018-07-31    0
2018-08-31    2
2018-09-30    0
2018-10-31    2
Freq: M, dtype: int64
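
To make the count versus size distinction concrete, here is a minimal sketch. The value column is hypothetical (it is not in the question) and exists only to supply some missing values. As an assumption for newer pandas versions, pd.offsets.MonthEnd() again stands in for the 'M' alias.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date_unconditional": pd.to_datetime([
        "2018-06-12", "2018-08-28", "2018-08-29", "2018-10-15", "2018-10-29",
    ]),
    # Hypothetical column with two missing entries, for illustration only.
    "value": [1.0, np.nan, 3.0, 4.0, np.nan],
})

# Move the dates into the index so Grouper needs no key.
monthly = (
    df.set_index("date_unconditional")
      .groupby(pd.Grouper(freq=pd.offsets.MonthEnd()))["value"]
)

sizes = monthly.size()    # rows per month, missing values included
counts = monthly.count()  # non-missing values per month

print(pd.DataFrame({"size": sizes, "count": counts}))
```

In August and October size reports 2 rows while count reports 1, because one value in each of those months is NaN.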