import pandas as pd

df = pd.DataFrame({"name": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
                   "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "Y", "Y", "X", "Y", "Y"]})
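How can I group df by "name" and, within each group, delete every row whose nickname is "X" when the row immediately after it is "Y"? In other words, in that situation the "X" row should be removed.
Required output: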
1 A Y
2 A X
3 A Z
5 B Y
7 B Y
8 C Y
10 C Y
11 C Y
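Answer 0 (score: 3)
Use DataFrameGroupBy.shift to shift the column within each group, compare for inequality (!=) with ne, chain the two conditions with bitwise OR (|), and filter with boolean indexing: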
# Keep a row unless it is 'X' and the next row in the same group is 'Y'
m = df.groupby('name')['nickname'].shift(-1).ne('Y') | df['nickname'].ne('X')
df = df[m]
print(df)
name nickname
1 A Y
2 A X
3 A Z
5 B Y
7 B Y
8 C Y
10 C Y
11 C Y
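To make the mask easier to follow, here is a minimal sketch that builds it step by step on the question's data. The intermediate names (next_nick, is_x, followed_by_y, to_drop) are only illustrative and do not appear in the original answer; by De Morgan's law, ~(is_x & followed_by_y) should select the same rows as the ne(...) | ne(...) mask above.

```python
import pandas as pd

df = pd.DataFrame({
    "name": list("AAAABBBBCCCC"),
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "Y", "Y", "X", "Y", "Y"],
})

# Next nickname within the same group; NaN on the last row of each group
next_nick = df.groupby("name")["nickname"].shift(-1)

is_x = df["nickname"].eq("X")        # current row is 'X'
followed_by_y = next_nick.eq("Y")    # next row in the group is 'Y' (NaN compares False)

# Rows to drop: 'X' immediately followed by 'Y' inside the same group
to_drop = is_x & followed_by_y
result = df[~to_drop]
kept_index = result.index.tolist()
print(result)
```

Because the shifted value is NaN on the last row of each group, that row can never count as "followed by Y", so it is always kept.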
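Edit: grouping matters when an "X" that ends one group is followed by a "Y" that starts the next group. With changed sample data (row 7 is now "X"):
df = pd.DataFrame({"name": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
                   "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "X", "Y", "X", "Y", "Y"]})
print(df)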
name nickname
0 A X
1 A Y
2 A X
3 A Z
4 B X
5 B Y
6 B X
7 B X
8 C Y
9 C X
10 C Y
11 C Y
# Grouped shift: row 7 (B, X) is the last row of group B, so it is kept
m = df.groupby('name')['nickname'].shift(-1).ne('Y') | df['nickname'].ne('X')
df1 = df[m]
print(df1)
name nickname
1 A Y
2 A X
3 A Z
5 B Y
6 B X
7 B X
8 C Y
10 C Y
11 C Y
# Ungrouped shift: row 7 (B, X) is dropped because the next row is (C, Y)
print(df[(df['nickname'] != 'X') | (df['nickname'].shift(-1) != 'Y')])
name nickname
1 A Y
2 A X
3 A Z
5 B Y
6 B X
8 C Y
10 C Y
11 C Y
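The difference between the two outputs can be checked directly. The sketch below, which assumes the edited sample data and uses illustrative variable names, builds both keep-masks and lists the index labels where they disagree.

```python
import pandas as pd

df = pd.DataFrame({
    "name": list("AAAABBBBCCCC"),
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "X", "Y", "X", "Y", "Y"],
})

is_x = df["nickname"].eq("X")

# Next value seen through a per-group shift vs. a plain shift
grouped_next = df.groupby("name")["nickname"].shift(-1)
plain_next = df["nickname"].shift(-1)

keep_grouped = ~(is_x & grouped_next.eq("Y"))
keep_plain = ~(is_x & plain_next.eq("Y"))

# Index labels where the two approaches make different decisions
differing = df.index[keep_grouped != keep_plain].tolist()
print(differing)
```

On this data the only disagreement should be row 7, the last "X" of group B, which sits directly above the first "Y" of group C.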
Answer 1 (score: 2)
Use df[...] to filter out the unwanted rows; for this sample data no grouping is needed:
print(df[(df['nickname']!='X') | (df['nickname'].shift(-1)!='Y')])
Output:
name nickname
1 A Y
2 A X
3 A Z
5 B Y
7 B Y
8 C Y
10 C Y
11 C Y
Update:
# Also drop 'X' rows that are followed by 'Z'
print(df[(df['nickname'] != 'X') | ~df['nickname'].shift(-1).isin(['Y', 'Z'])])
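If the same filter is needed for other values, it could be wrapped in a small function. The sketch below is only one possible shape: drop_if_followed_by and its parameters (col, value, followers, by) are hypothetical names, not part of pandas or of the answers above. Passing by switches from a plain shift to a per-group shift.

```python
import pandas as pd

def drop_if_followed_by(df, col, value, followers, by=None):
    """Hypothetical helper: drop rows where df[col] == value and the next
    row's df[col] is in followers. If by is given, "next row" means the
    next row within the same group."""
    if by is None:
        nxt = df[col].shift(-1)
    else:
        nxt = df.groupby(by)[col].shift(-1)
    drop = df[col].eq(value) & nxt.isin(followers)
    return df[~drop]

df = pd.DataFrame({
    "name": list("AAAABBBBCCCC"),
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "Y", "Y", "X", "Y", "Y"],
})

only_y = drop_if_followed_by(df, "nickname", "X", ["Y"], by="name")
y_or_z = drop_if_followed_by(df, "nickname", "X", ["Y", "Z"], by="name")
print(y_or_z)
```

With followers set to ["Y", "Z"], row 2 (A, X) is also removed because it is followed by "Z".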