I have a pandas DataFrame with date, week_day, public_holiday and Weekend columns, shown below.
I need to add an extra column that holds a long-weekend flag.
weekday Date Public_Holiday? Weekend?
5 2015-01-10 no yes
0 2015-01-12 no no
1 2015-01-13 no no
2 2015-01-14 no no
3 2015-01-15 no no
4 2015-01-16 no no
5 2015-01-17 no yes
6 2015-01-18 no yes
0 2015-01-19 no no
1 2015-01-20 no no
2 2015-01-21 no no
3 2015-01-22 no no
4 2015-01-23 yes no
5 2015-01-24 no yes
6 2015-01-25 no yes
1 2015-01-27 no no
2 2015-01-28 no no
3 2015-01-29 no no
4 2015-01-30 no no
5 2015-01-31 no yes
0 2015-02-02 no no
1 2015-02-03 no no
2 2015-02-04 no no
3 2015-02-05 no no
4 2015-02-06 no no
5 2015-02-07 no yes
6 2015-02-08 no yes
0 2015-02-09 yes no
1 2015-02-10 no no
2 2015-02-11 no no
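A normal weekend does not count as a long weekend. Only when a holiday falls on a Friday or a Monday, and in some cases on a Thursday or a Tuesday, is the whole run of days treated as a long weekend.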
The output should look like this:
long_weekend weekday Date Public_Holiday? Weekend?
0 5 2015-01-10 no yes
0 0 2015-01-12 no no
0 1 2015-01-13 no no
0 2 2015-01-14 no no
0 3 2015-01-15 no no
0 4 2015-01-16 no no
0 5 2015-01-17 no yes
0 6 2015-01-18 no yes
0 0 2015-01-19 no no
0 1 2015-01-20 no no
0 2 2015-01-21 no no
0 3 2015-01-22 no no
1 4 2015-01-23 yes no
1 5 2015-01-24 no yes
1 6 2015-01-25 no yes
0 1 2015-01-27 no no
0 2 2015-01-28 no no
0 3 2015-01-29 no no
0 4 2015-01-30 no no
0 5 2015-01-31 no yes
0 0 2015-02-02 no no
0 1 2015-02-03 no no
0 2 2015-02-04 no no
0 3 2015-02-05 no no
0 4 2015-02-06 no no
1 5 2015-02-07 no yes
1 6 2015-02-08 no yes
1 0 2015-02-09 yes no
0 1 2015-02-10 no no
0 2 2015-02-11 no no
This is what I have tried below. It gives me output in which even normal weekdays are flagged as 1:
import numpy as np
# Flag every public holiday or weekend day
df['long_weekend'] = np.where((df['Public_Holiday?'] == 'yes') | (df['Weekend?'] == 'yes'), 1, 0)
df['weekday'] = df['Predicted_Date'].dt.dayofweek
# Keep only flagged days that fall on a Friday (4) or a Monday (0)
df['long_weekend'] = np.where(((df['long_weekend'] == 1) & (df['weekday'] == 4)) | ((df['long_weekend'] == 1) & (df['weekday'] == 0)), 'yes', 'no')
How can I get this to work? Any help would be great. Thanks in advance.
Answer 0 (score: 2)
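The idea is to build groups of consecutive rows with shift and cumsum, count the size of each group with map and value_counts, and keep only the groups with more than 2 rows: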
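import numpy as np
# True for every day off (public holiday or weekend)
long = (df['Public_Holiday?'] == 'yes') | (df['Weekend?'] == 'yes')
# Start a new group id whenever the value of `long` changes
s = long.ne(long.shift()).cumsum()
# Flag days off that belong to a group of more than 2 rows
df['long_weekend'] = np.where((s.map(s.value_counts()) > 2) & long, 1, 0)
print(df)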
weekday Predicted_Date Public_Holiday? Weekend? long_weekend
0 5 2015-01-10 no yes 0
1 0 2015-01-12 no no 0
2 1 2015-01-13 no no 0
3 2 2015-01-14 no no 0
4 3 2015-01-15 no no 0
5 4 2015-01-16 no no 0
6 5 2015-01-17 no yes 0
7 6 2015-01-18 no yes 0
8 0 2015-01-19 no no 0
9 1 2015-01-20 no no 0
10 2 2015-01-21 no no 0
11 3 2015-01-22 no no 0
12 4 2015-01-23 yes no 1
13 5 2015-01-24 no yes 1
14 6 2015-01-25 no yes 1
15 1 2015-01-27 no no 0
16 2 2015-01-28 no no 0
17 3 2015-01-29 no no 0
18 4 2015-01-30 no no 0
19 5 2015-01-31 no yes 0
20 0 2015-02-02 no no 0
21 1 2015-02-03 no no 0
22 2 2015-02-04 no no 0
23 3 2015-02-05 no no 0
24 4 2015-02-06 no no 0
25 5 2015-02-07 no yes 1
26 6 2015-02-08 no yes 1
27 0 2015-02-09 yes no 1
28 1 2015-02-10 no no 0
29 2 2015-02-11 no no 0
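
To make the run-length logic above easier to try in isolation, here is a minimal self-contained sketch of the same shift/cumsum/value_counts approach. The function name flag_long_weekends, the min_run parameter and the eight-row sample frame are hypothetical and introduced only for illustration; the column names follow the code above.

```python
import numpy as np
import pandas as pd


def flag_long_weekends(df, min_run=3):
    """Return a 0/1 Series marking days off that sit in a run of at least min_run rows."""
    # True for every day off (public holiday or weekend)
    off = (df['Public_Holiday?'] == 'yes') | (df['Weekend?'] == 'yes')
    # A new group id starts whenever the value of `off` changes
    run_id = off.ne(off.shift()).cumsum()
    # Number of rows in the run that each row belongs to
    run_len = run_id.map(run_id.value_counts())
    return pd.Series(np.where(off & (run_len >= min_run), 1, 0), index=df.index)


# Hypothetical sample: a Friday holiday followed by a weekend, then a lone Saturday
sample = pd.DataFrame({
    'Predicted_Date': pd.to_datetime([
        '2015-01-22', '2015-01-23', '2015-01-24', '2015-01-25',
        '2015-01-27', '2015-01-30', '2015-01-31', '2015-02-02',
    ]),
    'Public_Holiday?': ['no', 'yes', 'no', 'no', 'no', 'no', 'no', 'no'],
    'Weekend?': ['no', 'no', 'yes', 'yes', 'no', 'no', 'yes', 'no'],
})
sample['long_weekend'] = flag_long_weekends(sample)
print(sample)
```

Only the three rows from 2015-01-23 to 2015-01-25 are flagged in this sample. Note that the run length counts adjacent rows, not calendar days, so the sketch assumes that days off which are neighbours in the frame are also neighbours on the calendar.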