I have the following DataFrame:
import numpy as np
import pandas as pd
start = ['31/12/2011 01:00','31/12/2011 01:00','31/12/2011 01:00','01/01/2013 08:00','31/12/2012 20:00']
end = ['02/01/2013 01:00','02/01/2014 01:00','02/01/2014 01:00','01/01/2013 14:00','01/01/2013 04:00']
df = pd.DataFrame({'start':start,'end':end})
df['start'] = pd.to_datetime(df['start'],format='%d/%m/%Y %H:%M')
df['end'] = pd.to_datetime(df['end'],format='%d/%m/%Y %H:%M')
print(df)
                  end               start
0 2013-01-02 01:00:00 2011-12-31 01:00:00
1 2014-01-02 01:00:00 2011-12-31 01:00:00
2 2014-01-02 01:00:00 2011-12-31 01:00:00
3 2013-01-01 14:00:00 2013-01-01 08:00:00
4 2013-01-01 04:00:00 2012-12-31 20:00:00
I want to compare df.end and df.start against two given dates, year_start and year_end:
year_start = pd.to_datetime(2013,format='%Y')
year_end = pd.to_datetime(2013+1,format='%Y')
print(year_start)
print(year_end)
2013-01-01 00:00:00
2014-01-01 00:00:00
But I cannot get the comparison (a conditional comparison) to work:
conditions = [(df['start'].any()< year_start) and (df['end'].any()> year_end)]
choices = [8760]
df['test'] = np.select(conditions, choices, default=0)
I also tried defining year_end and year_start as follows, but that did not work either:
year_start = np.datetime64(pd.to_datetime(2013,format='%Y'))
year_end = np.datetime64(pd.to_datetime(2013+1,format='%Y'))
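Any ideas on how to make this work?
Answer 0 (score: 1)
Try this. Calling .any() collapses each column to a single value, and Python's and does not operate row by row, so compare the columns element-wise and combine the two masks with & instead: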
In [797]: df[(df['start']< year_start) & (df['end']> year_end)]
Out[797]:
                  end               start
1 2014-01-02 01:00:00 2011-12-31 01:00:00
2 2014-01-02 01:00:00 2011-12-31 01:00:00
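If the goal is to fill the test column from the question rather than filter rows, the same element-wise mask can be handed to np.select. This is only a minimal sketch, assuming the DataFrame and the two dates from the question. The name spans_whole_year is made up for illustration, and pd.Timestamp is used here as a convenience in place of the question's pd.to_datetime(2013, format='%Y').

```python
import numpy as np
import pandas as pd

# Data copied from the question.
start = ['31/12/2011 01:00', '31/12/2011 01:00', '31/12/2011 01:00',
         '01/01/2013 08:00', '31/12/2012 20:00']
end = ['02/01/2013 01:00', '02/01/2014 01:00', '02/01/2014 01:00',
       '01/01/2013 14:00', '01/01/2013 04:00']
df = pd.DataFrame({'start': start, 'end': end})
df['start'] = pd.to_datetime(df['start'], format='%d/%m/%Y %H:%M')
df['end'] = pd.to_datetime(df['end'], format='%d/%m/%Y %H:%M')

# Assumption: plain Timestamps stand in for the question's year_start / year_end.
year_start = pd.Timestamp(2013, 1, 1)
year_end = pd.Timestamp(2014, 1, 1)

# Each comparison yields one boolean per row; & combines them row by row.
# The parentheses are required because & binds tighter than < and >.
spans_whole_year = (df['start'] < year_start) & (df['end'] > year_end)

# np.select picks the first matching choice per row, else the default.
conditions = [spans_whole_year.to_numpy()]
choices = [8760]
df['test'] = np.select(conditions, choices, default=0)

print(df)
# Rows 1 and 2 get 8760; rows 0, 3 and 4 get 0.
```

With a single condition, np.where(spans_whole_year, 8760, 0) would give the same column. np.select is kept here because the question already uses it and it extends naturally to more conditions.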