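I have a DataFrame with a datetime index. I also have three lists of dates, each defining a condition. I want to compare every date in the DataFrame against the three lists and assign a string label to that row.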
df =
index data
2019-02-04 14:52:00 73.923746
2019-02-05 10:48:00 73.335315
2019-02-05 11:28:00 72.021457
2019-02-06 10:49:00 72.367468
2019-02-07 10:16:00 73.434296
2019-02-14 10:54:00 73.094386
2019-02-27 12:08:00 70.930997
2019-02-28 12:41:00 70.444107
2019-02-28 13:21:00 70.426729
2019-03-29 11:29:00 70.758032
2019-04-29 11:29:00 70.758032
2019-12-14 14:30:00 73.515568
2019-12-23 10:54:00 72.812583
bad_dates = [dates_bwn_twodates('2019-03-22','2019-04-09'),'bad_day']
good_dates= [dates_bwn_twodates('2019-4-10','2019-4-29'),'good_day']
explist = [bad_dates,good_dates]
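I want to compare each index value in df against the two lists above and create a new column indicating the condition for that day. My current code:
df['test'] = 'normal_day'
for i in explist:
    for j in df.index:
        if bool(set(i[0])&set(j.strftime('%Y-%m-%d'))) == True:
            df['test'].loc[j] = i[1]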
My current output is:
index data test
2019-02-04 14:52:00 73.923746 normal_day
2019-02-05 10:48:00 73.335315 normal_day
2019-02-05 11:28:00 72.021457 normal_day
2019-02-06 10:49:00 72.367468 normal_day
2019-02-07 10:16:00 73.434296 normal_day
2019-02-14 10:54:00 73.094386 normal_day
2019-02-27 12:08:00 70.930997 normal_day
2019-02-28 12:41:00 70.444107 normal_day
2019-02-28 13:21:00 70.426729 normal_day
2019-03-29 11:29:00 70.758032 normal_day
2019-04-29 11:29:00 70.758032 normal_day
2019-12-14 14:30:00 73.515568 normal_day
2019-12-23 10:54:00 72.812583 normal_day
My code does not work as expected: every row is still labelled normal_day.
Answer 0 (score: 2)
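A likely reason the loop in the question never matches: set() applied to a string yields a set of single characters, so intersecting it with a set of date strings is always empty. The sketch below illustrates this under the assumption that dates_bwn_twodates returns 'YYYY-MM-DD' strings; the helper is not shown in the question, so the version here is a hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for the question's helper (its real body is not shown).
# Assumption: it returns every day in the range as a 'YYYY-MM-DD' string.
def dates_bwn_twodates(start, end):
    return [d.strftime('%Y-%m-%d') for d in pd.date_range(start, end)]

bad = dates_bwn_twodates('2019-03-22', '2019-04-09')
stamp = pd.Timestamp('2019-03-29 11:29:00')
day = stamp.strftime('%Y-%m-%d')

chars = set(day)             # set of single characters such as '2', '0', '-'
overlap = set(bad) & chars   # full date strings never equal single characters
is_member = day in set(bad)  # plain membership test on the whole string

print(overlap)    # empty set, so the condition in the loop is always False
print(is_member)  # True: 2019-03-29 is inside the bad range
```

Even with a membership test, looping over rows is slow on large frames, which is why a vectorized mask is usually preferable.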
Create boolean masks:
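dates = df.index.normalize()  # the timestamps are in the index; drop the time so end dates are inclusive
bad = (dates >= '2019-03-22') & (dates <= '2019-04-09')
good = (dates >= '2019-04-10') & (dates <= '2019-04-29')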
Then assign the labels:
df['test'] = 'normal_day'
df.loc[bad, 'test'] = 'bad_day'
df.loc[good, 'test'] = 'good_day'
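For reference, here is a minimal self-contained sketch of the masking approach that generalizes to any number of ranges. It assumes the timestamps live in the index (as the question describes) and that each end date should cover the whole day; the (start, end, label) tuple layout and the name ranges are illustrative choices, not something from the question.

```python
import pandas as pd

# A few rows from the question, with the timestamps as the index.
df = pd.DataFrame(
    {"data": [70.444107, 70.758032, 70.758032, 73.515568]},
    index=pd.to_datetime([
        "2019-02-28 12:41:00",
        "2019-03-29 11:29:00",
        "2019-04-29 11:29:00",
        "2019-12-14 14:30:00",
    ]),
)

# Hypothetical layout for this sketch: (start, end, label) triples.
ranges = [
    ("2019-03-22", "2019-04-09", "bad_day"),
    ("2019-04-10", "2019-04-29", "good_day"),
]

# Truncate each timestamp to midnight so that a row at 11:29 on the
# end date still falls inside the range.
days = df.index.normalize()

df["test"] = "normal_day"
for start, end, label in ranges:
    mask = (days >= pd.Timestamp(start)) & (days <= pd.Timestamp(end))
    df.loc[mask, "test"] = label

print(df)
```

Without the normalize() step, a comparison against '2019-04-29' is a comparison against midnight of that day, so the 2019-04-29 11:29:00 row would be left as normal_day.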