Create a new column for long weekends using existing public holiday and weekend columns

Time: 2019-09-10 06:42:52

Tags: python python-3.x pandas

I have a pandas DataFrame with date, weekday, Public_Holiday? and Weekend? columns. I need to add an extra column containing a long-weekend flag. The data looks like this:

 weekday Date           Public_Holiday? Weekend?
   5     2015-01-10              no      yes
   0     2015-01-12              no       no
   1     2015-01-13              no       no
   2     2015-01-14              no       no
   3     2015-01-15              no       no
   4     2015-01-16              no       no
   5     2015-01-17              no      yes
   6     2015-01-18              no      yes
   0     2015-01-19              no       no
   1     2015-01-20              no       no
   2     2015-01-21              no       no
   3     2015-01-22              no       no
   4     2015-01-23              yes      no
   5     2015-01-24              no      yes
   6     2015-01-25              no      yes
   1     2015-01-27              no       no
   2     2015-01-28              no       no
   3     2015-01-29              no       no
   4     2015-01-30              no       no
   5     2015-01-31              no      yes
   0     2015-02-02              no       no
   1     2015-02-03              no       no
   2     2015-02-04              no       no
   3     2015-02-05              no       no
   4     2015-02-06              no       no
   5     2015-02-07              no      yes
   6     2015-02-08              no      yes
   0     2015-02-09              yes      no
   1     2015-02-10              no       no
   2     2015-02-11              no       no

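Normal weekends are not treated as long weekends. The whole run of days off counts as a long weekend only when a public holiday falls on the Friday or the Monday next to the weekend (and, in some cases, when the Thursday or the Tuesday is a holiday).

The expected output is: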

    long_weekend  weekday   Date          Public_Holiday? Weekend?
            0        5     2015-01-10              no      yes
            0        0     2015-01-12              no       no
            0        1     2015-01-13              no       no
            0        2     2015-01-14              no       no
            0        3     2015-01-15              no       no
            0        4     2015-01-16              no       no
            0        5     2015-01-17              no      yes
            0        6     2015-01-18              no      yes
            0        0     2015-01-19              no       no
            0        1     2015-01-20              no       no
            0        2     2015-01-21              no       no
            0        3     2015-01-22              no       no
            1        4     2015-01-23              yes      no
            1        5     2015-01-24              no      yes
            1        6     2015-01-25              no      yes
            0        1     2015-01-27              no       no
            0        2     2015-01-28              no       no
            0        3     2015-01-29              no       no
            0        4     2015-01-30              no       no
            0        5     2015-01-31              no      yes
            0        0     2015-02-02              no       no
            0        1     2015-02-03              no       no
            0        2     2015-02-04              no       no
            0        3     2015-02-05              no       no
            0        4     2015-02-06              no       no
            1        5     2015-02-07              no      yes
            1        6     2015-02-08              no      yes
            1        0     2015-02-09              yes      no
            0        1     2015-02-10              no       no
            0        2     2015-02-11              no       no

Here is what I tried. It produces output in which even normal weekdays are flagged as 1:

df['long_weekend'] = np.where((df['Public_Holiday?'] == 'yes') | (df['Weekend?'] == 'yes'), 1, 0)
df['weekday'] = df['Predicted_Date'].dt.dayofweek
df['long_weekend'] = np.where(((df['long_weekend'] == 1) & (df['weekday'] == 4)) | ((df['long_weekend'] == 1) & (df['weekday'] == 0)), 'yes', 'no')

How can I make this work? Any help would be great. Thanks in advance.

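1 Answer:

Answer 0: (score: 2)

The idea is to build groups of consecutive equal values with shift and cumsum, count the size of each group with map and value_counts, and keep only the groups with more than 2 rows: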

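# True for rows that are a public holiday or a weekend day
long = (df['Public_Holiday?'] == 'yes') | (df['Weekend?'] == 'yes')
# A new group id starts wherever the value differs from the previous row
s = long.ne(long.shift()).cumsum()
# Flag days off that belong to a run of more than 2 rows
df['long_weekend'] = np.where((s.map(s.value_counts()) > 2) & long, 1, 0)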

print (df)
    weekday Predicted_Date Public_Holiday? Weekend?  long_weekend
0         5     2015-01-10              no      yes             0
1         0     2015-01-12              no       no             0
2         1     2015-01-13              no       no             0
3         2     2015-01-14              no       no             0
4         3     2015-01-15              no       no             0
5         4     2015-01-16              no       no             0
6         5     2015-01-17              no      yes             0
7         6     2015-01-18              no      yes             0
8         0     2015-01-19              no       no             0
9         1     2015-01-20              no       no             0
10        2     2015-01-21              no       no             0
11        3     2015-01-22              no       no             0
12        4     2015-01-23             yes       no             1
13        5     2015-01-24              no      yes             1
14        6     2015-01-25              no      yes             1
15        1     2015-01-27              no       no             0
16        2     2015-01-28              no       no             0
17        3     2015-01-29              no       no             0
18        4     2015-01-30              no       no             0
19        5     2015-01-31              no      yes             0
20        0     2015-02-02              no       no             0
21        1     2015-02-03              no       no             0
22        2     2015-02-04              no       no             0
23        3     2015-02-05              no       no             0
24        4     2015-02-06              no       no             0
25        5     2015-02-07              no      yes             1
26        6     2015-02-08              no      yes             1
27        0     2015-02-09             yes       no             1
28        1     2015-02-10              no       no             0
29        2     2015-02-11              no       no             0
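
To make the intermediate steps easier to follow, here is a minimal, self-contained sketch that runs the same technique on eight rows taken from the question's data (one plain weekend day, then a weekend followed by a Monday holiday). The names `is_off`, `run_id` and `run_size` are my own and do not appear in the original code; they only label the intermediate Series.

```python
import numpy as np
import pandas as pd

# Eight rows copied from the question's data.
df = pd.DataFrame({
    "Predicted_Date": pd.to_datetime([
        "2015-01-30", "2015-01-31", "2015-02-02", "2015-02-06",
        "2015-02-07", "2015-02-08", "2015-02-09", "2015-02-10",
    ]),
    "Public_Holiday?": ["no", "no", "no", "no", "no", "no", "yes", "no"],
    "Weekend?": ["no", "yes", "no", "no", "yes", "yes", "no", "no"],
})

# Step 1: mark every row that is a day off for either reason.
is_off = (df["Public_Holiday?"] == "yes") | (df["Weekend?"] == "yes")

# Step 2: a new run starts wherever the value differs from the previous row;
# the cumulative sum of those change points gives each run its own id.
run_id = is_off.ne(is_off.shift()).cumsum()      # 1, 2, 3, 3, 4, 4, 4, 5

# Step 3: look up how many rows each run contains.
run_size = run_id.map(run_id.value_counts())     # 1, 1, 2, 2, 3, 3, 3, 1

# Step 4: a long weekend is a run of days off spanning more than 2 rows.
df["long_weekend"] = np.where((run_size > 2) & is_off, 1, 0)

print(df.assign(is_off=is_off, run_id=run_id, run_size=run_size))
# long_weekend -> 0, 0, 0, 0, 1, 1, 1, 0
```

Note that the runs are counted over adjacent rows, not calendar days, so this appears to rely on the rows being sorted by date. The question's data skips some dates (for example 2015-01-11), and a missing day off would shorten the run it belongs to.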