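I have multiple data points per day, and I need to detect the first 0 of each day. I want to derive the Output column from Data: Output should be True only on the first row of each Date where Data is 0.
Data in a reproducible format: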
Date,Data,Output
1/1/2019,1,False
1/1/2019,1,False
1/1/2019,0,True
1/1/2019,0,False
1/1/2019,1,False
2/1/2019,1,False
2/1/2019,0,True
2/1/2019,1,False
3/1/2019,0,True
3/1/2019,0,False
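For anyone who wants to reproduce this, here is a minimal sketch of loading the sample above into a DataFrame. It assumes pandas is available; the variable name `csv_text` is only illustrative, and Date is deliberately left as a plain string because the question does not say whether the dates are day-first or month-first.

```python
import io

import pandas as pd

# The sample from the question, pasted verbatim.
csv_text = """Date,Data,Output
1/1/2019,1,False
1/1/2019,1,False
1/1/2019,0,True
1/1/2019,0,False
1/1/2019,1,False
2/1/2019,1,False
2/1/2019,0,True
2/1/2019,1,False
3/1/2019,0,True
3/1/2019,0,False
"""

# Date stays a string here; grouping by it works without parsing.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

The Output column is the expected result, so it can be used to check any candidate solution against.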
I think this probably involves groupby, but I am struggling to work out how to start.
Answer 0 (score: 3)
Use duplicated:
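# Look only at rows where Data == 0; within those, the first row of each Date is not a duplicate
df["output"] = ~(df[df["Data"]==0].duplicated(subset=["Date","Data"],keep="first"))
# Rows where Data != 0 were not in the subset, so they align to NaN; set them to False
df["output"] = df["output"].fillna(False)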
print(df)
# Output:
Date Data output
0 1/01/2019 1 False
1 1/01/2019 1 False
2 1/01/2019 0 True
3 1/01/2019 0 False
4 1/01/2019 1 False
5 2/01/2019 1 False
6 2/01/2019 0 True
7 2/01/2019 1 False
8 3/01/2019 0 True
9 3/01/2019 0 False
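As a variation on the same idea, the sketch below applies duplicated to the whole frame and combines it with a Data == 0 mask, so no NaN values appear and no fillna is needed. This is not the answer's exact code; the names `is_zero` and `first_of_pair` are illustrative, and the frame is rebuilt inline so the snippet runs on its own.

```python
import pandas as pd

# Sample data from the question (Date kept as a string).
df = pd.DataFrame({
    "Date": ["1/1/2019"] * 5 + ["2/1/2019"] * 3 + ["3/1/2019"] * 2,
    "Data": [1, 1, 0, 0, 1, 1, 0, 1, 0, 0],
})

# True where the value is zero.
is_zero = df["Data"].eq(0)

# True on the first occurrence of each (Date, Data) pair.
first_of_pair = ~df.duplicated(subset=["Date", "Data"], keep="first")

# A row is the first zero of its day when both conditions hold.
df["output"] = is_zero & first_of_pair
print(df)
```

Because both masks cover every row, the result is a plain boolean column rather than an object column with missing values.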
Answer 1 (score: 0)
Try two boolean masks:
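# m: True where Data is 0
m = df.Data.eq(0)
# m1: True where the running count of zeros within the Date equals 1
m1 = m.groupby(df.Date).cumsum().eq(1)
# Both must hold, otherwise non-zero rows after the first zero would also be flagged
df['New'] = m & m1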
Out[834]:
Date Data New
0 1/01/2019 1 False
1 1/01/2019 1 False
2 1/01/2019 0 True
3 1/01/2019 0 False
4 1/01/2019 1 False
5 2/01/2019 1 False
6 2/01/2019 0 True
7 2/01/2019 1 False
8 3/01/2019 0 True
9 3/01/2019 0 False
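To see why both masks are needed, the sketch below prints the intermediate running count. It is an illustration under the same sample data; the name `running` is hypothetical, and the explicit astype(int) is only there to make the counting step obvious.

```python
import pandas as pd

# Sample data from the question (Date kept as a string).
df = pd.DataFrame({
    "Date": ["1/1/2019"] * 5 + ["2/1/2019"] * 3 + ["3/1/2019"] * 2,
    "Data": [1, 1, 0, 0, 1, 1, 0, 1, 0, 0],
})

# True where the value is zero.
m = df["Data"].eq(0)

# Running count of zeros seen so far within each Date.
running = m.astype(int).groupby(df["Date"]).cumsum()

# running == 1 alone would also flag row 7 (a 1 that follows the first zero of
# its day), so it is combined with m to keep only rows that are zero themselves.
df["New"] = m & running.eq(1)

print(df.assign(running=running))
```

Row 7 has Data equal to 1 but a running count of 1, which is exactly the case that `m &` filters out.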
Answer 2 (score: 0)
Another groupby solution, using loc:
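# Column names follow the question's sample (Date, Data)
# Among the zero rows, idxmin returns the first index label per Date (ties resolve to the first occurrence)
df.loc[df[df.Data.eq(0)].groupby('Date').Data.idxmin(), 'out'] = True
# Every other row in 'out' is NaN at this point; set those to False
df = df.fillna(False)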
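A self-contained sketch of the same idxmin approach follows, assuming the column names from the question's sample (Date, Data). It initialises the flag column to False first, which avoids filling NaN across the whole frame afterwards; the names `zeros` and `first_idx` are illustrative.

```python
import pandas as pd

# Sample data from the question (Date kept as a string).
df = pd.DataFrame({
    "Date": ["1/1/2019"] * 5 + ["2/1/2019"] * 3 + ["3/1/2019"] * 2,
    "Data": [1, 1, 0, 0, 1, 1, 0, 1, 0, 0],
})

# Keep only the zero rows; the original index labels are preserved.
zeros = df[df["Data"].eq(0)]

# All values in each group are 0, so idxmin returns the label of the first one.
first_idx = zeros.groupby("Date")["Data"].idxmin()

# Start with False everywhere, then flag the first zero of each day.
df["out"] = False
df.loc[first_idx, "out"] = True
print(df)
```

A day with no zeros simply does not appear in `zeros`, so none of its rows are flagged.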