I have a pandas DataFrame with two date columns: Created and Resolved.
               Created            Resolved
0  2019-09-24 20:48:25 2019-10-31 22:07:36
1  2019-09-27 00:54:39 2019-11-18 17:24:13
2  2019-09-27 20:07:50                 NaT
3  2019-09-27 20:17:10 2019-10-22 17:34:08
4  2019-09-27 22:01:29 2019-10-22 17:34:08
5  2019-09-30 17:41:02                 NaT
6  2019-10-02 04:36:32                 NaT
7  2019-10-03 17:42:15 2019-10-22 17:34:09
8  2019-10-03 18:34:29                 NaT
10 2019-10-08 18:40:45 2019-10-22 17:34:09
Here is the output of df.info():
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 10
Data columns (total 2 columns):
Created 10 non-null datetime64[ns]
Resolved 6 non-null datetime64[ns]
dtypes: datetime64[ns](2)
I want to check how many days it takes to resolve an issue:
import numpy
import pandas as pd

@numpy.vectorize
def age(res, cr):
    if pd.isnull(res):
        return pd.to_datetime('today') - cr
    return res - cr
When I pass the columns to the function with df.diff = age(df.Resolved, df.Created), I get this error:
ValueError: Cannot add integral value to Timestamp without freq.
Answer 0 (score: 0)
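I think you need to replace the missing values in the Resolved column with Series.fillna and then subtract the Created column. This solution is already vectorized: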
df['diff'] = df.Resolved.fillna(pd.to_datetime('today')) - df.Created
print(df)
               Created            Resolved                     diff
0  2019-09-24 20:48:25 2019-10-31 22:07:36         37 days 01:19:11
1  2019-09-27 00:54:39 2019-11-18 17:24:13         52 days 16:29:34
2  2019-09-27 20:07:50                 NaT 159 days 13:42:59.435281
3  2019-09-27 20:17:10 2019-10-22 17:34:08         24 days 21:16:58
4  2019-09-27 22:01:29 2019-10-22 17:34:08         24 days 19:32:39
5  2019-09-30 17:41:02                 NaT 156 days 16:09:47.435281
6  2019-10-02 04:36:32                 NaT 155 days 05:14:17.435281
7  2019-10-03 17:42:15 2019-10-22 17:34:09         18 days 23:51:54
8  2019-10-03 18:34:29                 NaT 153 days 15:16:20.435281
10 2019-10-08 18:40:45 2019-10-22 17:34:09         13 days 22:53:24
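
Since the question asks for the age in days, here is a minimal self-contained sketch of the same fillna approach that also reduces the timedelta to whole days with Series.dt.days. The three-row sample, the fixed reference timestamp now, and the column name age_days are assumptions made for illustration; they are not part of the original question or answer.

```python
import pandas as pd

# Small sample mirroring three rows of the question's data (one issue still open).
df = pd.DataFrame({
    "Created": pd.to_datetime(
        ["2019-09-24 20:48:25", "2019-09-27 20:07:50", "2019-10-03 17:42:15"]
    ),
    "Resolved": pd.to_datetime(
        ["2019-10-31 22:07:36", None, "2019-10-22 17:34:09"]
    ),
})

# Assumption: a fixed reference time keeps the result reproducible.
# In real use this would be pd.to_datetime('today') as in the answer above.
now = pd.Timestamp("2020-03-05 09:50:49")

# Fill open issues (NaT) with the reference time, then subtract column-wise.
age = df["Resolved"].fillna(now) - df["Created"]

# Keep only the whole-day component of each timedelta.
df["age_days"] = age.dt.days
print(df)
```

The sketch assigns with df["age_days"] rather than attribute syntax. Because diff is already a DataFrame method, writing df.diff = ... as in the question replaces that attribute on the object instead of adding a column, so bracket assignment is the safer choice.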