Calculating a time range with timedelta and a boolean

Asked: 2016-02-11 18:31:05

Tags: python numpy pandas dataframe timedelta

I need help writing a timedelta-based function that checks whether actn_dt is one year or more in the past and, if so, returns "Experienced".

The dataframe f2 looks like this:

           nm_emp_lst    actn_dt
14483   MACKENZIE         2015-03-22
132902  CAMPBELL          2015-04-19
124182  SJOSTROM          2015-03-22
103482  LAPLANTE          2014-11-30
45722   LEMAY             2014-11-30
169088  TAYLOR            2015-06-14
105355  HENDERSON         2015-11-01
105359  HENDERSON         2014-10-19
45394   PELLERIN          2015-07-12
119317  BOISSEAU          2015-07-12

It should end up looking like this:

           nm_emp_lst    actn_dt        Experienced
14483   MACKENZIE         2015-03-22   
132902  CAMPBELL          2015-04-19    
124182  SJOSTROM          2015-03-22
103482  LAPLANTE          2014-11-30    Experienced
45722   LEMAY             2014-11-30    Experienced
169088  TAYLOR            2015-06-14    
105355  HENDERSON         2015-11-01    
105359  HENDERSON         2014-10-19    Experienced
45394   PELLERIN          2015-07-12    
119317  BOISSEAU          2015-07-12

So, any row whose actn_dt is one year old or older.

I wrote a function:

year = timedelta(days=365)
today2 = datetime.datetime.strftime(datetime.datetime.now(),'%A_%B_%d_%Y_%H%M')

def year(row):
    if row['actn_dt'] >= today2 - year:
        return "Experienced"

Then the lambda function:

f2['Experienced'] = f2.apply (lambda row: year (row),axis=1)    

From this, I get the error:

TypeError: ("unsupported operand type(s) for -: 'str' and 'function'", u'occurred at index 14483')

My dtypes are:

nm_emp_lst            object
actn_dt       datetime64[ns]

Any help is appreciated!
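
The TypeError in the question comes from two things: `today2` is a string produced by `strftime`, and the name `year` is first bound to a timedelta and then rebound to the function, so `today2 - year` tries to subtract a function from a string. The comparison is also reversed for "one year or older", which needs `actn_dt <= cutoff`. Below is a minimal sketch of a row-wise version that avoids both problems. The names `one_year`, `cutoff`, and `label_experience`, and the fixed `today` value, are assumptions made for illustration and are not part of the original post.

```python
import pandas as pd

# A few rows copied from the question's f2.
f2 = pd.DataFrame(
    {
        "nm_emp_lst": ["MACKENZIE", "LAPLANTE", "HENDERSON"],
        "actn_dt": pd.to_datetime(["2015-03-22", "2014-11-30", "2014-10-19"]),
    },
    index=[14483, 103482, 105359],
)

# Assumption: a fixed reference date keeps the example reproducible.
# In real use this would be pd.Timestamp.now().normalize().
today = pd.Timestamp("2016-02-12")
one_year = pd.Timedelta(days=365)  # distinct name, not shadowed by the function
cutoff = today - one_year          # a Timestamp, comparable with actn_dt values


def label_experience(row):
    """Return "Experienced" when actn_dt is at least one year before today."""
    if row["actn_dt"] <= cutoff:
        return "Experienced"
    return ""


f2["Experienced"] = f2.apply(label_experience, axis=1)
print(f2)
```

Row-wise `apply` is kept here only because the question used it; a boolean mask over the whole column does the same job without a Python-level loop.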

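===UPDATE===
With jezrael's help I was able to find a solution. It may be the long way around, but it works. First, I had to create a new column holding the date one year before today.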

f2['year1'] = datetime.datetime.now().date() - datetime.timedelta(days=365)

Then I had to convert 'year1' to datetime (the column holds plain datetime.date objects at this point):

f2['year1'] = pd.to_datetime(f2['year1'], errors='coerce')

From there I used the code jezrael provided.

f2.loc[f2['actn_dt'] <= f2['year1'], 'Experienced'] = "Experienced"

The new result is:

               nm_emp_lst    actn_dt      year1  Experienced
14483   MACKENZIE         2015-03-22 2015-02-12          NaN
132902  CAMPBELL          2015-04-19 2015-02-12          NaN
124182  SJOSTROM          2015-03-22 2015-02-12          NaN
103482  LAPLANTE          2014-11-30 2015-02-12  Experienced
45722   LEMAY             2014-11-30 2015-02-12  Experienced
169088  TAYLOR            2015-06-14 2015-02-12          NaN
105355  HENDERSON         2015-11-01 2015-02-12          NaN
105359  HENDERSON         2014-10-19 2015-02-12  Experienced
45394   PELLERIN          2015-07-12 2015-02-12          NaN
119317  BOISSEAU          2015-07-12 2015-02-12          NaN

This works like a charm! Thanks, jezrael!
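
The helper column and the conversion step in the update can probably be folded into a single scalar cutoff, since a `pd.Timestamp` compares directly against a datetime64 column. The following is only a sketch under that assumption; the fixed `today` value and the name `cutoff` are illustrative and not from the original post.

```python
import pandas as pd

# A few rows copied from the question's f2.
f2 = pd.DataFrame(
    {
        "nm_emp_lst": ["CAMPBELL", "LEMAY", "TAYLOR"],
        "actn_dt": pd.to_datetime(["2015-04-19", "2014-11-30", "2015-06-14"]),
    },
    index=[132902, 45722, 169088],
)

# Assumption: fixed reference date for reproducibility;
# replace with pd.Timestamp.now().normalize() in real use.
today = pd.Timestamp("2016-02-12")
cutoff = today - pd.Timedelta(days=365)  # scalar Timestamp, no extra column needed

# Label only the rows at or before the cutoff; other rows stay NaN.
f2.loc[f2["actn_dt"] <= cutoff, "Experienced"] = "Experienced"
print(f2)
```

This yields the same labels as the 'year1' approach while leaving the dataframe with one fewer column.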

1 Answer:

Answer 0: (score: 1)

You can use loc. The second row of df has been changed for testing:

print(df)
       nm_emp_lst    actn_dt
14483   MACKENZIE 2015-03-22
132902   CAMPBELL 2018-04-19
124182   SJOSTROM 2015-03-22
103482   LAPLANTE 2014-11-30
45722       LEMAY 2014-11-30
169088     TAYLOR 2015-06-14
105355  HENDERSON 2015-11-01
105359  HENDERSON 2014-10-19
45394    PELLERIN 2015-07-12
119317   BOISSEAU 2015-07-12

print(datetime.timedelta(days=365))
365 days, 0:00:00

print(datetime.datetime.now().date())
2016-02-12

print(datetime.datetime.now().date() - datetime.timedelta(days=365))
2015-02-12

print(df['actn_dt'] <= datetime.datetime.now().date() - datetime.timedelta(days=365))
14483     False
132902    False
124182    False
103482     True
45722      True
169088    False
105355    False
105359     True
45394     False
119317    False
Name: actn_dt, dtype: bool

df.loc[df['actn_dt'] <= datetime.datetime.now().date() - datetime.timedelta(days=365) , 'Experienced'] = "Experienced"
print(df)
       nm_emp_lst    actn_dt  Experienced
14483   MACKENZIE 2015-03-22          NaN
132902   CAMPBELL 2018-04-19          NaN
124182   SJOSTROM 2015-03-22          NaN
103482   LAPLANTE 2014-11-30  Experienced
45722       LEMAY 2014-11-30  Experienced
169088     TAYLOR 2015-06-14          NaN
105355  HENDERSON 2015-11-01          NaN
105359  HENDERSON 2014-10-19  Experienced
45394    PELLERIN 2015-07-12          NaN
119317   BOISSEAU 2015-07-12          NaN
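
Two details may be worth adjusting when adapting the answer. Newer pandas releases may refuse to compare a datetime64 column against a plain `datetime.date`, so a `pd.Timestamp` cutoff is the safer choice. And the desired output in the question shows blank cells rather than NaN, which `np.where` can produce directly. The sketch below also uses `pd.DateOffset(years=1)` as an alternative to a fixed 365 days, since the two differ when the window spans a leap day. The fixed dates and the names `cutoff`, `fixed_days`, and `calendar_year` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# A few rows copied from the question's data.
df = pd.DataFrame(
    {
        "nm_emp_lst": ["LAPLANTE", "HENDERSON", "BOISSEAU"],
        "actn_dt": pd.to_datetime(["2014-11-30", "2015-11-01", "2015-07-12"]),
    },
    index=[103482, 105355, 119317],
)

# Assumption: fixed reference date for reproducibility.
today = pd.Timestamp("2016-02-12")
cutoff = today - pd.DateOffset(years=1)  # same calendar day, one year earlier

# "Experienced" where the action date is at or before the cutoff, "" elsewhere.
df["Experienced"] = np.where(df["actn_dt"] <= cutoff, "Experienced", "")
print(df)

# The two definitions of "one year" diverge across a leap day (2016-02-29).
later = pd.Timestamp("2017-02-12")
fixed_days = later - pd.Timedelta(days=365)    # 2016-02-13
calendar_year = later - pd.DateOffset(years=1)  # 2016-02-12
print(fixed_days, calendar_year)
```

Which definition is appropriate depends on whether "one year" in the business rule means 365 days or the same date in the previous calendar year.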