Given the following DataFrame:
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'first_date': ['2015-08-31 00:00:00', '2015-08-24 00:00:00', '2015-08-25 00:00:00']})
df.first_date = pd.to_datetime(df.first_date)    # (dtype='<M8[ns]')
df['last_date'] = pd.to_datetime('5/6/2016')     # (dtype='datetime64[ns]')

def fnl(x):
    # Business days between first_date and last_date for one row
    l = pd.date_range(x.loc['first_date'], x.loc['last_date'], freq='B')
    # Wrap in a list so the whole DatetimeIndex is stored in a single cell
    return pd.Series([l])

df['range'] = df.apply(fnl, axis=1)
df
A first_date last_date range
0 a 2015-08-31 2016-05-06 DatetimeIndex(['2015-08-31', '2015-09-01', '20...
1 b 2015-08-24 2016-05-06 DatetimeIndex(['2015-08-24', '2015-08-25', '20...
2 c 2015-08-25 2016-05-06 DatetimeIndex(['2015-08-25', '2015-08-26', '20...
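I would like to remove the dates listed in exc (below) from df['range'] wherever exc['A'] matches df['A'], for every date that falls inside the corresponding range (i.e. if a date in exc falls outside the range of the matching row in df, it obviously cannot be excluded).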
exc = pd.DataFrame({'A': ['a', 'a', 'b', 'b', 'c', 'c'],
                    'Exclusions': ['2014-12-30 00:00:00', '2015-08-31 00:00:00',
                                   '2015-08-25 00:00:00', '2015-10-20 00:00:00',
                                   '2015-08-26 00:00:00', '2016-10-05 00:00:00']})
exc
A Exclusions
0 a 2014-12-30 00:00:00
1 a 2015-08-31 00:00:00
2 b 2015-08-25 00:00:00
3 b 2015-10-20 00:00:00
4 c 2015-08-26 00:00:00
5 c 2016-10-05 00:00:00
Desired result:
A first_date last_date range
0 a 2015-08-31 2016-05-06 DatetimeIndex(['2015-09-01', '2015-09-02', '20...
1 b 2015-08-24 2016-05-06 DatetimeIndex(['2015-08-24', '2015-08-26', '20...
2 c 2015-08-25 2016-05-06 DatetimeIndex(['2015-08-25', '2015-08-27', '20...
Thanks in advance!

Answer 0 (score: 1)
I think you can first expand range into new columns and attach them with concat, then reshape with melt. Then merge with exc and filter by boolean indexing, using the mask df._merge == 'left_only':
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'],
                   'first_date': ['2015-08-31 00:00:00', '2015-08-24 00:00:00', '2015-08-25 00:00:00']})
df.first_date = pd.to_datetime(df.first_date)    # (dtype='<M8[ns]')
df['last_date'] = pd.to_datetime('5/6/2016')     # (dtype='datetime64[ns]')

def fnl(x):
    l = pd.date_range(x.loc['first_date'], x.loc['last_date'], freq='B')
    # Return the dates themselves so apply spreads them across columns
    return pd.Series(l)

df1 = df.apply(fnl, axis=1)
print (df1)
0 1 2 3 4 5 \
0 2015-08-31 2015-09-01 2015-09-02 2015-09-03 2015-09-04 2015-09-07
1 2015-08-24 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31
2 2015-08-25 2015-08-26 2015-08-27 2015-08-28 2015-08-31 2015-09-01
6 7 8 9 ... 175 \
0 2015-09-08 2015-09-09 2015-09-10 2015-09-11 ... 2016-05-02
1 2015-09-01 2015-09-02 2015-09-03 2015-09-04 ... 2016-04-25
2 2015-09-02 2015-09-03 2015-09-04 2015-09-07 ... 2016-04-26
176 177 178 179 180 181 \
0 2016-05-03 2016-05-04 2016-05-05 2016-05-06 NaT NaT
1 2016-04-26 2016-04-27 2016-04-28 2016-04-29 2016-05-02 2016-05-03
2 2016-04-27 2016-04-28 2016-04-29 2016-05-02 2016-05-03 2016-05-04
182 183 184
0 NaT NaT NaT
1 2016-05-04 2016-05-05 2016-05-06
2 2016-05-05 2016-05-06 NaT
[3 rows x 185 columns]
# Attach the date columns, reshape wide to long, and drop the NaT padding
df = pd.concat([df, df1], axis=1)
df = pd.melt(df, id_vars=['A', 'first_date', 'last_date'], value_name='range')
df = df.dropna(subset=['range'])
print (df)
A first_date last_date variable range
0 a 2015-08-31 2016-05-06 0 2015-08-31
1 b 2015-08-24 2016-05-06 0 2015-08-24
2 c 2015-08-25 2016-05-06 0 2015-08-25
3 a 2015-08-31 2016-05-06 1 2015-09-01
4 b 2015-08-24 2016-05-06 1 2015-08-25
5 c 2015-08-25 2016-05-06 1 2015-08-26
6 a 2015-08-31 2016-05-06 2 2015-09-02
7 b 2015-08-24 2016-05-06 2 2015-08-26
8 c 2015-08-25 2016-05-06 2 2015-08-27
9 a 2015-08-31 2016-05-06 3 2015-09-03
10 b 2015-08-24 2016-05-06 3 2015-08-27
11 c 2015-08-25 2016-05-06 3 2015-08-28
12 a 2015-08-31 2016-05-06 4 2015-09-04
13 b 2015-08-24 2016-05-06 4 2015-08-28
14 c 2015-08-25 2016-05-06 4 2015-08-31
15 a 2015-08-31 2016-05-06 5 2015-09-07
16 b 2015-08-24 2016-05-06 5 2015-08-31
...
...
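exc = pd.DataFrame({'A': ['a', 'a', 'b', 'b', 'c', 'c'],
                    'Exclusions': ['2014-12-30 00:00:00', '2015-08-31 00:00:00',
                                   '2015-08-25 00:00:00', '2015-10-20 00:00:00',
                                   '2015-08-26 00:00:00', '2016-10-05 00:00:00']})
#print (exc)
# Convert to datetime so the merge keys have the same dtype as 'range'
exc['Exclusions'] = pd.to_datetime(exc['Exclusions'])
# Left merge with indicator=True adds a _merge column marking where each row was found
df = (pd.merge(df, exc, left_on=['A', 'range'],
               right_on=['A', 'Exclusions'],
               indicator=True,
               how='left'))
# Keep only rows with no matching exclusion, then drop the helper columns
df = df[df._merge == 'left_only']
df = df.drop(['Exclusions', '_merge'], axis=1)
print (df)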
A first_date last_date variable range
1 b 2015-08-24 2016-05-06 0 2015-08-24
2 c 2015-08-25 2016-05-06 0 2015-08-25
3 a 2015-08-31 2016-05-06 1 2015-09-01
6 a 2015-08-31 2016-05-06 2 2015-09-02
7 b 2015-08-24 2016-05-06 2 2015-08-26
8 c 2015-08-25 2016-05-06 2 2015-08-27
9 a 2015-08-31 2016-05-06 3 2015-09-03
10 b 2015-08-24 2016-05-06 3 2015-08-27
11 c 2015-08-25 2016-05-06 3 2015-08-28
12 a 2015-08-31 2016-05-06 4 2015-09-04
13 b 2015-08-24 2016-05-06 4 2015-08-28
...
...
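
The result above is in long format (one row per date), while the question asked for one DatetimeIndex per row. As an alternative to the merge approach, and not part of the original answer, here is a minimal sketch that builds each row's business-day range and subtracts that row's exclusions with DatetimeIndex.difference. The names exc_by_key and business_days_without_exclusions are hypothetical helpers introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['a', 'b', 'c'],
    'first_date': pd.to_datetime(['2015-08-31', '2015-08-24', '2015-08-25']),
})
df['last_date'] = pd.to_datetime('5/6/2016')

exc = pd.DataFrame({
    'A': ['a', 'a', 'b', 'b', 'c', 'c'],
    'Exclusions': pd.to_datetime(['2014-12-30', '2015-08-31',
                                  '2015-08-25', '2015-10-20',
                                  '2015-08-26', '2016-10-05']),
})

# Hypothetical helper: map each key in A to a DatetimeIndex of its exclusions.
exc_by_key = {key: pd.DatetimeIndex(grp)
              for key, grp in exc.groupby('A')['Exclusions']}

def business_days_without_exclusions(row):
    """Business days between first_date and last_date, minus the row's exclusions."""
    days = pd.date_range(row['first_date'], row['last_date'], freq='B')
    excluded = exc_by_key.get(row['A'], pd.DatetimeIndex([]))
    # difference() ignores exclusion dates that are not in the range at all
    return days.difference(excluded)

# A list comprehension keeps each DatetimeIndex intact in a single cell
df['range'] = [business_days_without_exclusions(row) for _, row in df.iterrows()]
print(df)
```

Exclusion dates outside a row's range (such as 2014-12-30 for a) are skipped naturally, because a set difference only removes elements that are present. For large frames the merge-based approach may scale better, since this sketch loops over rows in Python.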