How do I subtract float values from a datetime64 array in a vectorized way?
Data:
import numpy as np
import pandas as pd
some_dates = np.array(['2007-07-13', '2006-01-13', '2010-08-13'], dtype='datetime64')
some_ints = np.array([1, 2, 3], dtype='int64')
some_float = np.array([1.00, 2.00, 3.00], dtype='float64')
data_dict = {'dates': some_dates,
             'ints': some_ints,
             'floats': some_float}
test_data = pd.DataFrame(data_dict)
It looks like this:
Out[1]:
       dates  floats  ints
0 2007-07-13       1     1
1 2006-01-13       2     2
2 2010-08-13       3     3
What I want to do:
#===============================================================================
# Works well
#===============================================================================
test_data['dates'] = test_data['dates'].sub(test_data['ints'])
But my vector contains NaN values. NaN is not supported in an int vector, so the values are automatically converted to float:
#------------------------------------------------------------------------------
# Converts ints to floats
test_data.dtypes
> Out[2]:
> dates datetime64[ns]
> floats float64
> ints int64
> dtype: object
test_data.loc[2:2, 'ints'] = None
> Out[3]:
> dates datetime64[ns]
> floats float64
> ints float64
> dtype: object
> Out[4]:
> dates floats ints
> 0 2007-07-13 1 1
> 1 2006-01-13 2 2
> 2 2010-08-13 3 NaN
But I cannot subtract floats from my dates:
#----------------------------------------------------------------------------- #
# But this way also doesn't work
test_data['dates'] = test_data['dates'].sub(test_data['floats'])
> TypeError: ufunc subtract cannot use operands with types dtype('<M8[ns]') and dtype('float64')
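For context, the same error can be reproduced in plain NumPy, which suggests the limitation sits in the datetime64 arithmetic rather than in pandas. The snippet below is a minimal sketch under that assumption; the array names are made up for illustration. It shows that subtracting a float array raises a TypeError, while subtracting a timedelta64 array is accepted.

```python
import numpy as np

# Hypothetical sample arrays, mirroring the data in the question.
dates = np.array(['2007-07-13', '2006-01-13'], dtype='datetime64[D]')
floats = np.array([1.0, 2.0])

# datetime64 minus float64 is not a defined operation.
try:
    dates - floats
    float_subtraction_failed = False
except TypeError:
    float_subtraction_failed = True

# datetime64 minus timedelta64 is defined and works elementwise.
deltas = np.array([1, 2], dtype='timedelta64[D]')
shifted = dates - deltas
print(float_subtraction_failed, shifted)
```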
I found a workaround, but it is extremely slow because apply runs "in Python", row by row:
# from dateutil.relativedelta import relativedelta
def sub_float(df_row):
    if pd.notnull(df_row['floats']):
        # df_row['dates'] = df_row['dates'] - relativedelta(days = df_row['floats'])
        df_row['dates'] = df_row['dates'] - pd.DateOffset(days=df_row['floats'])
    return(df_row['dates'])
test_data['dates'] = test_data.apply(sub_float, 1)
Any suggestions on how I can subtract floats from datetimes in a vectorized way?
Answer 0: (score: 4)
Convert the floats to timedeltas (which can handle NaN):
In [22]: df
Out[22]:
dates floats ints
0 2007-07-13 NaN 1
1 2006-01-13 2 2
2 2010-08-13 3 3
In [23]: df.dates - pd.to_timedelta(df.floats.astype(str), unit='D')
Out[23]:
0 NaT
1 2006-01-11
2 2010-08-10
dtype: datetime64[ns]
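As a self-contained sketch of the same idea: in recent pandas versions it appears that pd.to_timedelta accepts the float column directly when a unit is given, so the astype(str) step may not be needed. This is an assumption about your pandas version rather than part of the original answer, and the frame below is rebuilt only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild a small frame like the one in the answer (illustrative data).
df = pd.DataFrame({
    'dates': pd.to_datetime(['2007-07-13', '2006-01-13', '2010-08-13']),
    'floats': [np.nan, 2.0, 3.0],
})

# Interpret each float as a number of days; NaN becomes NaT.
deltas = pd.to_timedelta(df['floats'], unit='D')

# datetime64[ns] minus timedelta64[ns] is vectorized, and NaT propagates.
result = df['dates'] - deltas
print(result)
```

Rows with a missing float come out as NaT. If you would rather keep the original date in those rows, something like result.fillna(df['dates']) should do it. Fractional values are read as fractions of a day, so 1.5 means one day and twelve hours.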