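This may be a known limitation, but I'm having trouble computing the cumulative minimum of a pandas Series when the Series contains NaT. Is there a way to make this work?
A simple example follows: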
import pandas as pd
s = pd.Series(pd.date_range('2008-09-15', periods=10, freq='m'))
s.loc[10] = pd.NaT
s.cummin()
ValueError: Could not convert object to NumPy datetime
Answer 0 (score: 1)
This bug has been fixed in pandas 0.15.2 (not yet released at the time of writing).
As a workaround, you can pass skipna=False and handle the NaT values manually:
import pandas as pd
import numpy as np
np.random.seed(1)
s = pd.Series(pd.date_range('2008-09-15', periods=10, freq='m'))
s.loc[10] = pd.NaT
np.random.shuffle(s)
print(s)
# 0 2008-11-30
# 1 2008-12-31
# 2 2009-01-31
# 3 2009-06-30
# 4 2008-10-31
# 5 2009-03-31
# 6 2008-09-30
# 7 2009-04-30
# 8 NaT
# 9 2009-05-31
# 10 2009-02-28
# dtype: datetime64[ns]
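mask = pd.isnull(s)  # remember which positions hold NaT
result = s.cummin(skipna=False)  # skipna=False works around the ValueError
result.loc[mask] = pd.NaT  # put NaT back at the original positions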
print(result)
yields
0 2008-11-30
1 2008-11-30
2 2008-11-30
3 2008-11-30
4 2008-10-31
5 2008-10-31
6 2008-09-30
7 2008-09-30
8 NaT
9 2008-09-30
10 2008-09-30
dtype: datetime64[ns]
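How skipna=False treats NaT may differ between pandas versions, so here is a minimal sketch of an alternative that does not rely on it: drop the NaT entries, take the cumulative minimum of what remains, then reindex back to the original index so the NaT positions reappear. The helper name cummin_ignoring_nat and the sample dates are made up for illustration and are not part of pandas or of the original post.

```python
import pandas as pd


def cummin_ignoring_nat(s):
    """Hypothetical helper: cumulative minimum that skips NaT and keeps it in place."""
    valid = s.dropna()  # only the real timestamps
    # reindex restores the dropped labels; for datetime data they are filled with NaT
    return valid.cummin().reindex(s.index)


# Illustrative data (not from the original post); None becomes NaT
s = pd.Series(pd.to_datetime([
    '2008-11-30', '2008-12-31', '2008-10-31', None, '2008-09-30', '2009-01-31',
]))
result = cummin_ignoring_nat(s)
print(result)
```

The running minimum carries on past the NaT row, and the NaT itself stays where it was, which is the same shape of result as the workaround above.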