Difference between two date columns

Time: 2019-10-18 13:24:47

Tags: python

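I need to find the difference between two date columns in a Python dataframe and check whether that difference is greater than 120:

if working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE'] > 120:

I get the error below: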

invalid_comparison
    .format(dtype=left.dtype, typ=type(right).__name__))

TypeError: Invalid comparison between dtype=timedelta64[ns] and int

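2 answers:

Answer 0 (score: 0)

Subtracting the two columns yields timedeltas, which cannot be compared with a plain integer, so there are two ways to do the comparison. If you need to test whether at least one value matches the condition, compare the day counts from Series.dt.days and combine the result with Series.any: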

s = (working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE'])

if (s.dt.days > 120).any():
    print ('At least one value is higher')

Or compare against a Timedelta:

if (s > pd.Timedelta(120, unit='d')).any():
    print ('At least one value is higher')
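
A minimal sketch of both comparisons on made-up data (the sample dates are assumptions; only the column names come from the question). It also illustrates one difference between the two: `.dt.days` keeps only the whole-day part, so a gap of 120 days and 6 hours counts as 120 and is not `> 120`, while the `Timedelta` comparison treats it as larger. `pd.Timedelta(days=120)` is used here as an equivalent spelling of the 120-day threshold.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
working_data = pd.DataFrame({
    'CLAIMS_EVENT_DATE': pd.to_datetime(
        ['2019-05-01 06:00', '2019-03-01 00:00', '2019-10-01 00:00']),
    'LAST_LAPSED_DATE': pd.to_datetime(
        ['2019-01-01', '2019-01-01', '2019-01-01']),
})

# Subtracting two datetime columns gives a timedelta Series, not integers.
s = working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE']

# Option 1: compare the whole-day component with an integer.
by_days = s.dt.days > 120

# Option 2: compare the full timedelta with a 120-day Timedelta.
by_timedelta = s > pd.Timedelta(days=120)

# The first row (120 days and 6 hours) is where the two options disagree.
print(pd.DataFrame({'diff': s, 'by_days': by_days, 'by_timedelta': by_timedelta}))
```

Which option fits depends on whether a partial day beyond 120 should count as exceeding the threshold.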

If you need to filter the matching rows, use boolean indexing:

df = working_data[s.dt.days > 120]

Or:

df = working_data[s > pd.Timedelta(120, unit='d')]
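
As an illustration of the filtering step, here is a small sketch on made-up data. The `POLICY` column and all values are hypothetical. It includes a row with a missing date to show that such rows are dropped by the filter, because comparisons involving `NaT` evaluate to `False`.

```python
import pandas as pd

# Hypothetical sample data; POLICY is an invented label column.
working_data = pd.DataFrame({
    'POLICY': ['A', 'B', 'C'],
    'CLAIMS_EVENT_DATE': pd.to_datetime(['2019-10-01', '2019-02-01', None]),
    'LAST_LAPSED_DATE': pd.to_datetime(['2019-01-01', '2019-01-01', '2019-01-01']),
})

s = working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE']

# Boolean mask: True where the gap exceeds 120 days; NaT rows become False.
mask = s > pd.Timedelta(days=120)

# Boolean indexing keeps only the rows where the mask is True.
df = working_data[mask]
print(df)
```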

Answer 1 (score: 0)

import pandas as pd

# Convert both columns to datetime format
working_data['CLAIMS_EVENT_DATE'] = pd.to_datetime(working_data['CLAIMS_EVENT_DATE'])
working_data['LAST_LAPSED_DATE'] = pd.to_datetime(working_data['LAST_LAPSED_DATE'])

# Calculate the difference in days, in the same order as in the question.
# On a Series the day count is reached through the .dt accessor;
# calling .days directly on the Series raises an AttributeError.
working_data['Days'] = (working_data['CLAIMS_EVENT_DATE']
                        - working_data['LAST_LAPSED_DATE']).dt.days

# Create a column 'Greater' and record whether the difference is greater than 120
working_data.loc[working_data.Days <= 120, 'Greater'] = 'False'
working_data.loc[working_data.Days > 120, 'Greater'] = 'True'
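
A more compact variant of this approach, sketched under the assumption that a real boolean column is acceptable in place of the strings 'True' and 'False'. The sample values are made up; the dates start out as strings to mimic unparsed input.

```python
import pandas as pd

# Hypothetical sample data, stored as strings to mimic unparsed input.
working_data = pd.DataFrame({
    'CLAIMS_EVENT_DATE': ['2019-10-01', '2019-02-01'],
    'LAST_LAPSED_DATE': ['2019-01-01', '2019-01-01'],
})

# Parse both columns into datetimes.
for col in ['CLAIMS_EVENT_DATE', 'LAST_LAPSED_DATE']:
    working_data[col] = pd.to_datetime(working_data[col])

# Whole-day difference, taken through the .dt accessor of the timedelta Series.
working_data['Days'] = (working_data['CLAIMS_EVENT_DATE']
                        - working_data['LAST_LAPSED_DATE']).dt.days

# A single vectorized comparison yields the boolean flag for every row.
working_data['Greater'] = working_data['Days'] > 120
print(working_data)
```

A boolean column can be used directly as a mask later, for example `working_data[working_data['Greater']]`.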