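I have a data frame that contains dates in one column (INSP_DATE2); the data frame is shown below.
What I need is two additional columns: WeekBegin (the Sunday that starts the week) and WeekEnd (the Saturday that ends it).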
INSP_DATE2 |WeekBegin     |WeekEnd
7/23/2014  |WB 07/20/2014 |WE 07/26/2014
7/23/2014  |WB 07/20/2014 |WE 07/26/2014
7/23/2014  |WB 07/20/2014 |WE 07/26/2014
6/10/2014  |WB 06/08/2014 |WE 06/14/2014
6/10/2014  |WB 06/08/2014 |WE 06/14/2014
6/10/2014  |WB 06/08/2014 |WE 06/14/2014
6/10/2014  |WB 06/08/2014 |WE 06/14/2014
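I would prefer to stay away from the apply method, so suggestions that use NumPy arrays are welcome. An apply-based approach is also acceptable.
Answer 0 (score: 6)
It seems you need: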
df['INSP_DATE2'] = pd.to_datetime(df['INSP_DATE2'])
df['a'] = df['INSP_DATE2'] - pd.offsets.Week(weekday=6)
df['b'] = df['INSP_DATE2'] + pd.offsets.Week(weekday=5)
print (df)
INSP_DATE2 WeekBegin WeekEnd a b
0 2014-07-23 WB 07/20/2014 WE 07/26/2014 2014-07-20 2014-07-26
1 2014-07-23 WB 07/20/2014 WE 07/26/2014 2014-07-20 2014-07-26
2 2014-07-23 WB 07/20/2014 WE 07/26/2014 2014-07-20 2014-07-26
3 2014-06-10 WB 06/08/2014 WE 06/14/2014 2014-06-08 2014-06-14
4 2014-06-10 WB 06/08/2014 WE 06/14/2014 2014-06-08 2014-06-14
5 2014-06-10 WB 06/08/2014 WE 06/14/2014 2014-06-08 2014-06-14
6 2014-06-10 WB 06/08/2014 WE 06/14/2014 2014-06-08 2014-06-14
If you need a different output format, use strftime:
df['INSP_DATE2'] = pd.to_datetime(df['INSP_DATE2'])
df['a'] = (df['INSP_DATE2'] - pd.offsets.Week(weekday=6)).dt.strftime('WB %m/%d/%Y')
df['b'] = (df['INSP_DATE2'] + pd.offsets.Week(weekday=5)).dt.strftime('WE %m/%d/%Y')
print (df)
INSP_DATE2 WeekBegin WeekEnd a b
0 2014-07-23 WB 07/20/2014 WE 07/26/2014 WB 07/20/2014 WE 07/26/2014
1 2014-07-23 WB 07/20/2014 WE 07/26/2014 WB 07/20/2014 WE 07/26/2014
2 2014-07-23 WB 07/20/2014 WE 07/26/2014 WB 07/20/2014 WE 07/26/2014
3 2014-06-10 WB 06/08/2014 WE 06/14/2014 WB 06/08/2014 WE 06/14/2014
4 2014-06-10 WB 06/08/2014 WE 06/14/2014 WB 06/08/2014 WE 06/14/2014
5 2014-06-10 WB 06/08/2014 WE 06/14/2014 WB 06/08/2014 WE 06/14/2014
6 2014-06-10 WB 06/08/2014 WE 06/14/2014 WB 06/08/2014 WE 06/14/2014
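Edit:
I tested this on another sample and found a small problem: dates that already fall exactly on a Sunday or a Saturday are shifted as well: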
df = pd.DataFrame({'INSP_DATE2':pd.date_range('2017-08-02', periods=20)})
a = df['INSP_DATE2'] - pd.offsets.Week(weekday=6)
b = df['INSP_DATE2'] + pd.offsets.Week(weekday=5)
df['a'] = a
df['b'] = b
print (df)
INSP_DATE2 a b
0 2017-08-02 2017-07-30 2017-08-05
1 2017-08-03 2017-07-30 2017-08-05
2 2017-08-04 2017-07-30 2017-08-05
3 2017-08-05 2017-07-30 2017-08-12 <- 2017-08-05 is changed to 2017-08-12 (b)
4 2017-08-06 2017-07-30 2017-08-12 <- 2017-08-06 is changed to 2017-07-30 (a)
5 2017-08-07 2017-08-06 2017-08-12
6 2017-08-08 2017-08-06 2017-08-12
7 2017-08-09 2017-08-06 2017-08-12
8 2017-08-10 2017-08-06 2017-08-12
9 2017-08-11 2017-08-06 2017-08-12
10 2017-08-12 2017-08-06 2017-08-19
11 2017-08-13 2017-08-06 2017-08-19
12 2017-08-14 2017-08-13 2017-08-19
13 2017-08-15 2017-08-13 2017-08-19
14 2017-08-16 2017-08-13 2017-08-19
15 2017-08-17 2017-08-13 2017-08-19
16 2017-08-18 2017-08-13 2017-08-19
17 2017-08-19 2017-08-13 2017-08-26
18 2017-08-20 2017-08-13 2017-08-26
19 2017-08-21 2017-08-20 2017-08-26
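The shift is easier to see on single timestamps. The following is only a minimal reproduction, and the variable names are chosen here for illustration: a date that already lies on the anchored weekday is moved by a full week, whereas other dates are only moved to the nearest anchor.

```python
import pandas as pd

sunday = pd.Timestamp('2017-08-06')    # already a Sunday
saturday = pd.Timestamp('2017-08-05')  # already a Saturday

# Subtracting a Sunday-anchored week from a Sunday goes back 7 days.
shifted_begin = sunday - pd.offsets.Week(weekday=6)
# Adding a Saturday-anchored week to a Saturday goes forward 7 days.
shifted_end = saturday + pd.offsets.Week(weekday=5)

print(shifted_begin.date(), shifted_end.date())
```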
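The solution is a bit more complicated: mask is needed to detect the dates that are recovered exactly when a week is added back (or subtracted again), because those dates were already on the week boundary and must be kept: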
df = pd.DataFrame({'INSP_DATE2':pd.date_range('2017-08-02', periods=20)})
a = df['INSP_DATE2'] - pd.offsets.Week(weekday=6)
b = df['INSP_DATE2'] + pd.offsets.Week(weekday=5)
m1 = df['INSP_DATE2'] != (a + pd.offsets.Week())
m2 = df['INSP_DATE2'] != (b - pd.offsets.Week())
df['c'] = df['INSP_DATE2'].mask(m1, a)
df['d'] = df['INSP_DATE2'].mask(m2, b)
print (df)
INSP_DATE2 c d
0 2017-08-02 2017-07-30 2017-08-05
1 2017-08-03 2017-07-30 2017-08-05
2 2017-08-04 2017-07-30 2017-08-05
3 2017-08-05 2017-07-30 2017-08-05
4 2017-08-06 2017-08-06 2017-08-12
5 2017-08-07 2017-08-06 2017-08-12
6 2017-08-08 2017-08-06 2017-08-12
7 2017-08-09 2017-08-06 2017-08-12
8 2017-08-10 2017-08-06 2017-08-12
9 2017-08-11 2017-08-06 2017-08-12
10 2017-08-12 2017-08-06 2017-08-12
11 2017-08-13 2017-08-13 2017-08-19
12 2017-08-14 2017-08-13 2017-08-19
13 2017-08-15 2017-08-13 2017-08-19
14 2017-08-16 2017-08-13 2017-08-19
15 2017-08-17 2017-08-13 2017-08-19
16 2017-08-18 2017-08-13 2017-08-19
17 2017-08-19 2017-08-13 2017-08-19
18 2017-08-20 2017-08-20 2017-08-26
19 2017-08-21 2017-08-20 2017-08-26
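An alternative that avoids both apply and the mask step is plain weekday arithmetic. This is a sketch rather than part of the answer above; the column names week_begin and week_end are made up for the example. It relies on dt.dayofweek returning 0 for Monday through 6 for Sunday.

```python
import pandas as pd

df = pd.DataFrame({'INSP_DATE2': pd.date_range('2017-08-02', periods=20)})

# (dayofweek + 1) % 7 is the number of days since the most recent Sunday:
# 0 on a Sunday, 1 on a Monday, ..., 6 on a Saturday.
days_since_sunday = (df['INSP_DATE2'].dt.dayofweek + 1) % 7

# Sundays stay where they are because their offset is 0 days.
df['week_begin'] = df['INSP_DATE2'] - pd.to_timedelta(days_since_sunday, unit='D')
# The Saturday that closes the week is always 6 days after its Sunday.
df['week_end'] = df['week_begin'] + pd.Timedelta(days=6)

print(df)
```

Because the offset is zero on a Sunday and six on a Saturday, the boundary dates need no special handling. If the WB/WE prefixes are wanted, .dt.strftime can be applied to the result as shown earlier.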
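Answer 1 (score: 1)
This function takes a date and returns the corresponding week start (Sunday) and week end (Saturday), since you specifically mentioned that you need the Sunday and Saturday dates for any day that falls between them.
Note: I assume the input format is 'mm/dd/yyyy'.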
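from datetime import date, timedelta

def week_start_end(n):
    # n is a date string in 'mm/dd/yyyy' format
    month, day, year = (int(x) for x in n.split('/'))
    # d is the given date
    d = date(year, month, day)
    # weekday(): 0 = Monday ... 6 = Sunday
    w = d.weekday()
    if w < 5:
        # Monday to Friday: next Saturday, previous Sunday
        week_end = d + timedelta(5 - w)
        week_start = d - timedelta(w + 1)
    elif w == 5:
        # Saturday: the date itself ends the week
        week_end = d
        week_start = d - timedelta(w + 1)
    else:
        # Sunday: the date itself starts the week
        week_end = d + timedelta(6)
        week_start = d
    # %Y gives a four-digit year, matching the mm/dd/yyyy input
    return week_start.strftime('%m/%d/%Y'), week_end.strftime('%m/%d/%Y')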
Assuming df is the data frame:
df['Week_Begin'], df['Week_End'] = zip(*df['INSP_DATE2'].apply(week_start_end))
This creates two new columns in the data frame.
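Since the question asked about NumPy arrays, the same weekday arithmetic can also be sketched without apply. This is an illustrative variant, not the function above. It assumes the mm/dd/yyyy input format and relies on the fact that 1970-01-01, the datetime64 epoch, was a Thursday; the sample rows are made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'INSP_DATE2': ['7/23/2014', '6/10/2014', '8/5/2017', '8/6/2017']})

# Parse the strings and drop to a plain NumPy array with day resolution.
parsed = pd.to_datetime(df['INSP_DATE2'], format='%m/%d/%Y')
days = parsed.to_numpy().astype('datetime64[D]')

# Day 0 of the epoch is a Thursday, which is 4 days after a Sunday,
# so (days since epoch + 4) % 7 is the number of days since the last Sunday.
since_sunday = (days.astype('int64') + 4) % 7

week_begin = days - since_sunday.astype('timedelta64[D]')
week_end = week_begin + np.timedelta64(6, 'D')

# Format back to strings; the Index is assigned to the columns by position.
df['Week_Begin'] = pd.to_datetime(week_begin).strftime('%m/%d/%Y')
df['Week_End'] = pd.to_datetime(week_end).strftime('%m/%d/%Y')

print(df)
```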