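Pandas column manipulation with dates

Time: 2015-05-23 12:02:13

Tags: python date pandas

I have a dataframe that consists mostly of dates. Here is what I am trying to do.

From the old date variable (DTDate), I want to create a new date variable. If the old date is a Monday, the new date stays the same; if the old date is any day other than Monday, the new date should be the date of the following Monday. In the end, every entry in the new date column falls on a Monday.

I have been trying to do this with a function and apply. Here are my dataset and code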

    Date call   DTDate      weekday     weekdayNo
0   31/12/2014  2014-12-31  Wednesday   3
1   29/10/2014  2014-10-29  Wednesday   3
2   28/10/2014  2014-10-28  Tuesday     2
3   27/3/2015   2015-03-27  Friday      5
4   27/2/2015   2015-02-27  Friday      5
5   27/11/2014  2014-11-27  Thursday    4
6   27/10/2014  2014-10-27  Monday      1
7   26/3/2015   2015-03-26  Thursday    4
8   26/2/2015   2015-02-26  Thursday    4
9   26/12/2014  2014-12-26  Friday      5
10  26/11/2014  2014-11-26  Wednesday   3
11  26/10/2014  2014-10-26  Sunday      0
12  25/3/2015   2015-03-25  Wednesday   3
13  25/12/2014  2014-12-25  Thursday    4
14  24/3/2015   2015-03-24  Tuesday     2
15  24/2/2015   2015-02-24  Tuesday     2
16  24/12/2014  2014-12-24  Wednesday   3
17  24/11/2014  2014-11-24  Monday      1
18  23/3/2015   2015-03-23  Monday      1

The code is

from datetime import date, timedelta

def AddDate(row):
    if row['weekdayNo'] == 0:
        return row['DTDate'] + timedelta(days=1)
    elif row['weekdayNo'] == 2:
        return row['DTDate'] + timedelta(days=6)
    elif row['weekdayNo'] == 3:
        return row['DTDate'] + timedelta(days=5)
    elif row['weekdayNo'] == 4:
        return row['DTDate'] + timedelta(days=4)
    elif row['weekdayNo'] == 5:
        return row['DTDate'] + timedelta(days=3)
    elif row['weekdayNo'] == 6:
        return row['DTDate'] + timedelta(days=2)
    else:
        return row['DTDate']

DF['newDate'] = DF.apply(AddDate, axis=1)

I get the following, which is exactly the same as before; nothing has changed

     Date call  DTDate       weekday    weekdayNo   newDate
 0  31/12/2014  2014-12-31  Wednesday      3        2014-12-31
 1  29/10/2014  2014-10-29  Wednesday      3        2014-10-29
 2  28/10/2014  2014-10-28  Tuesday        2        2014-10-28
 3  27/3/2015   2015-03-27  Friday         5        2015-03-27
 4  27/2/2015   2015-02-27  Friday         5        2015-02-27
 5  27/11/2014  2014-11-27  Thursday       4        2014-11-27
 6  27/10/2014  2014-10-27  Monday         1        2014-10-27
 7  26/3/2015   2015-03-26  Thursday       4        2015-03-26
 8  26/2/2015   2015-02-26  Thursday       4        2015-02-26
 9  26/12/2014  2014-12-26  Friday         5        2014-12-26
 10 26/11/2014  2014-11-26  Wednesday      3        2014-11-26
 11 26/10/2014  2014-10-26  Sunday         0        2014-10-26
 12 25/3/2015   2015-03-25  Wednesday      3        2015-03-25
 13 25/12/2014  2014-12-25  Thursday       4        2014-12-25
 14 24/3/2015   2015-03-24  Tuesday        2        2015-03-24
 15 24/2/2015   2015-02-24  Tuesday        2        2015-02-24
 16 24/12/2014  2014-12-24  Wednesday      3        2014-12-24
 17 24/11/2014  2014-11-24  Monday         1        2014-11-24
 18 23/3/2015   2015-03-23  Monday         1        2015-03-23

I also suspect this approach is not a good one. If there is a better way, could someone suggest what it might be? Thanks in advance.
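The question does not show the column dtypes, so the cause of the unchanged output is not certain. One possibility, and it is only an assumption, is that weekdayNo holds strings such as "3" rather than integers, so none of the comparisons match and every row falls through to the final else. The sketch below reproduces that behaviour on a small made-up frame; the add_date helper and its lookup table are a condensed, hypothetical stand-in for AddDate, not the asker's exact code.

```python
import pandas as pd
from datetime import timedelta

# Hypothetical reconstruction: weekdayNo stored as strings, not integers.
df = pd.DataFrame({
    "DTDate": pd.to_datetime(["2014-12-31", "2014-10-27", "2014-10-26"]),
    "weekdayNo": ["3", "1", "0"],  # Sunday = 0, Monday = 1, as in the question
})

def add_date(row):
    # Days to add per weekdayNo, mirroring the branches of AddDate.
    days_to_monday = {0: 1, 2: 6, 3: 5, 4: 4, 5: 3, 6: 2}
    return row["DTDate"] + timedelta(days=days_to_monday.get(row["weekdayNo"], 0))

print(df.dtypes)  # check whether weekdayNo is an integer dtype

# With string values no key matches, so every date comes back unchanged.
unchanged = df.apply(add_date, axis=1)

# After converting to integers the lookups match and the dates move.
df["weekdayNo"] = df["weekdayNo"].astype(int)
fixed = df.apply(add_date, axis=1)
print(fixed)
```

If df.dtypes already reports an integer column for weekdayNo, this explanation does not apply and the cause lies elsewhere.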

3 Answers:

Answer 0: (score: 2)

You don't need to import datetime or timedelta to do this.

df['DTDate'] = pd.to_datetime(df['DTDate'])  # can skip this if column 'DTDate' is already of the right type

x.weekday() extracts the day of the week, with Monday = 0 and Sunday = 6.

df['newDate'] = df.DTDate.apply(lambda x: x + pd.DateOffset(days=7-x.weekday()) if  x.weekday() else x)

which yields:

    Date_call     DTDate    weekday  weekdayNo    newDate
0  2014-12-31 2014-12-31  Wednesday          3 2015-01-05
1  2014-10-29 2014-10-29  Wednesday          3 2014-11-03
2  2014-10-28 2014-10-28    Tuesday          2 2014-11-03
3  2015-03-27 2015-03-27     Friday          5 2015-03-30
4  2015-02-27 2015-02-27     Friday          5 2015-03-02
5  2014-11-27 2014-11-27   Thursday          4 2014-12-01
6  2014-10-27 2014-10-27     Monday          1 2014-10-27
7  2015-03-26 2015-03-26   Thursday          4 2015-03-30
8  2015-02-26 2015-02-26   Thursday          4 2015-03-02
9  2014-12-26 2014-12-26     Friday          5 2014-12-29
10 2014-11-26 2014-11-26  Wednesday          3 2014-12-01
11 2014-10-26 2014-10-26     Sunday          0 2014-10-27
12 2015-03-25 2015-03-25  Wednesday          3 2015-03-30
13 2014-12-25 2014-12-25   Thursday          4 2014-12-29
14 2015-03-24 2015-03-24    Tuesday          2 2015-03-30
15 2015-02-24 2015-02-24    Tuesday          2 2015-03-02
16 2014-12-24 2014-12-24  Wednesday          3 2014-12-29
17 2014-11-24 2014-11-24     Monday          1 2014-11-24
18 2015-03-23 2015-03-23     Monday          1 2015-03-23
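For readers who want to run this approach end to end, here is a minimal self-contained sketch. The four sample dates are taken from the question's table; the helper name next_monday is made up for illustration and simply spells out the lambda above as a named function.

```python
import pandas as pd

df = pd.DataFrame({"DTDate": ["2014-12-31", "2014-10-27", "2014-10-26", "2015-03-27"]})
df["DTDate"] = pd.to_datetime(df["DTDate"])

def next_monday(ts):
    # Timestamp.weekday(): Monday = 0 ... Sunday = 6
    if ts.weekday() == 0:
        return ts  # already a Monday, keep it
    return ts + pd.DateOffset(days=7 - ts.weekday())

df["newDate"] = df["DTDate"].apply(next_monday)
print(df)
```

Note that the answer relies on x.weekday() computed from the date itself, so it does not depend on the weekdayNo column from the question at all.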

Answer 1: (score: 1)

The AddDate function can be much simpler; in fact, it can be a one-liner

In [34]: df['newDate'] = df['DTDate'].apply(lambda x: x + timedelta(days=7-x.dayofweek)
                                            if x.dayofweek else x)

Here, the lambda function lambda x: x + timedelta(days=7-x.dayofweek) if x.dayofweek else x adds delta = 7-x.dayofweek days when the date is not a Monday.

To verify the new weekday, create a new column newdayofweek

In [35]: df['newdayofweek'] = df['newDate'].apply(lambda x: x.dayofweek)

In [36]: df
Out[36]:
    Date        call     DTDate    weekday  weekdayNo    newDate  newdayofweek
0      0  31/12/2014 2014-12-31  Wednesday          3 2015-01-05             0
1      1  29/10/2014 2014-10-29  Wednesday          3 2014-11-03             0
2      2  28/10/2014 2014-10-28    Tuesday          2 2014-11-03             0
3      3   27/3/2015 2015-03-27     Friday          5 2015-03-30             0
4      4   27/2/2015 2015-02-27     Friday          5 2015-03-02             0
5      5  27/11/2014 2014-11-27   Thursday          4 2014-12-01             0
6      6  27/10/2014 2014-10-27     Monday          1 2014-10-27             0
7      7   26/3/2015 2015-03-26   Thursday          4 2015-03-30             0
8      8   26/2/2015 2015-02-26   Thursday          4 2015-03-02             0
9      9  26/12/2014 2014-12-26     Friday          5 2014-12-29             0
10    10  26/11/2014 2014-11-26  Wednesday          3 2014-12-01             0
11    11  26/10/2014 2014-10-26     Sunday          0 2014-10-27             0
12    12   25/3/2015 2015-03-25  Wednesday          3 2015-03-30             0
13    13  25/12/2014 2014-12-25   Thursday          4 2014-12-29             0
14    14   24/3/2015 2015-03-24    Tuesday          2 2015-03-30             0
15    15   24/2/2015 2015-02-24    Tuesday          2 2015-03-02             0
16    16  24/12/2014 2014-12-24  Wednesday          3 2014-12-29             0
17    17  24/11/2014 2014-11-24     Monday          1 2014-11-24             0
18    18   23/3/2015 2015-03-23     Monday          1 2015-03-23             0

Note: for the day of the week, Monday = 0 and Sunday = 6.
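The arithmetic behind delta = 7 - dayofweek can also be written without the conditional: taking the result modulo 7 maps Monday (0) to 0 days and every other day to the distance to the next Monday. The sketch below uses plain datetime.date objects to show just that arithmetic; the function names are made up for illustration and this is a variation on the answer, not the answer's own code.

```python
from datetime import date, timedelta

def days_until_monday(d):
    # date.weekday(): Monday = 0 ... Sunday = 6
    # (7 - 0) % 7 == 0 for Monday, so Mondays are left alone.
    return (7 - d.weekday()) % 7

def roll_to_monday(d):
    return d + timedelta(days=days_until_monday(d))

for d in (date(2014, 10, 27), date(2014, 10, 28), date(2014, 10, 26)):
    print(d, "->", roll_to_monday(d))
```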

Answer 2: (score: 0)

Here is a more performant way to do this.

In [54]: %timeit s.apply(lambda x: x + pd.DateOffset(days=7-x.weekday()) if  x.weekday() else x)
1 loops, best of 3: 244 ms per loop

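This is essentially looping in Python space.

Here we are going to construct a TimedeltaIndex of the days to add. The .where is the same as an if-then, but this is a vectorized expression.

In [55]: %timeit s.where(s.dt.weekday==0,pd.TimedeltaIndex(7-s.dt.weekday,unit='d')+s)
100 loops, best of 3: 9.69 ms per loop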
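The answer does not show how s was built, so the following is a small self-contained sketch of the same vectorized idea on three dates from the question. It uses pd.to_timedelta in place of the pd.TimedeltaIndex(..., unit='d') constructor from the timing above; that substitution is a choice made here, not part of the original answer.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2014-12-31", "2014-10-27", "2014-10-26"]))

# Days to add for every row at once; Series.dt.weekday has Monday = 0.
offsets = pd.to_timedelta(7 - s.dt.weekday, unit="D")

# Keep the original value where it is already a Monday,
# otherwise take the shifted value.
new_dates = s.where(s.dt.weekday == 0, s + offsets)
print(new_dates)
```

The whole column is handled by a handful of array operations rather than one Python function call per row, which is where the speed-up reported above comes from.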
