Pandas: delete rows within a group where a specific value is followed by another value

Time: 2018-12-07 08:14:52

Tags: python pandas dataframe

import pandas as pd

df = pd.DataFrame({"name":["A", "A","A", "A", "B" ,"B","B" ,"B", "C", "C","C", "C"],
                   "nickname":["X","Y","X","Z","X","Y","X","Y","Y", "X","Y", "Y"]})

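How can I group df by "name" and, within each group, delete every row where "X" is immediately followed by "Y"? That is, the "X" row should be dropped in that case.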

Required output:

1     A        Y
2     A        X
3     A        Z
5     B        Y
7     B        Y
8     C        Y
10    C        Y
11    C        Y

2 answers:

Answer 0 (score: 3)

Use DataFrameGroupBy.shift to shift within each group, compare for inequality (!=) with ne, chain the two conditions with bitwise OR (|), and filter with boolean indexing:

m = df.groupby('name')['nickname'].shift(-1).ne('Y') | df['nickname'].ne('X')

df = df[m]
print (df)
   name nickname
1     A        Y
2     A        X
3     A        Z
5     B        Y
7     B        Y
8     C        Y
10    C        Y
11    C        Y
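
To make the mask above easier to read, here is a minimal sketch that builds it in steps. The intermediate names (next_nick, is_x, followed_by_y, drop) are illustrative and not part of the original answer. It assumes the question's DataFrame and checks that the step-by-step "rows to drop" mask is the negation of m:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "Y", "Y", "X", "Y", "Y"],
})

# Next nickname within the same group; the last row of each group gets NaN.
next_nick = df.groupby("name")["nickname"].shift(-1)

is_x = df["nickname"].eq("X")
followed_by_y = next_nick.eq("Y")  # NaN compares as False

# Rows to drop: an X whose next row in the same group is Y.
drop = is_x & followed_by_y

# By De Morgan's law, ~drop is the same mask as m in the answer.
m = next_nick.ne("Y") | df["nickname"].ne("X")
assert (~drop).equals(m)

result = df[~drop]
print(result.index.tolist())  # [1, 2, 3, 5, 7, 8, 10, 11]
```

Writing the condition as "drop X and followed by Y" and then negating it may be easier to extend than the OR form, but both select the same rows.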

Edit: grouping matters when an X at the end of one group is followed by a Y at the start of the next:

df = pd.DataFrame({"name":["A", "A","A", "A", "B" ,"B","B" ,"B", "C", "C","C", "C"],
                   "nickname":["X","Y","X","Z","X","Y","X","X","Y", "X","Y", "Y"]})

print (df)
   name nickname
0     A        X
1     A        Y
2     A        X
3     A        Z
4     B        X
5     B        Y
6     B        X
7     B        X
8     C        Y
9     C        X
10    C        Y
11    C        Y

m = df.groupby('name')['nickname'].shift(-1).ne('Y') | df['nickname'].ne('X')

df1 = df[m]
print (df1)
   name nickname
1     A        Y
2     A        X
3     A        Z
5     B        Y
6     B        X
7     B        X
8     C        Y
10    C        Y
11    C        Y

# For comparison, the same filter with a plain shift (no groupby) also drops row 7:
print(df[(df['nickname']!='X') | (df['nickname'].shift(-1)!='Y')])
   name nickname
1     A        Y
2     A        X
3     A        Z
5     B        Y
6     B        X
8     C        Y
10    C        Y
11    C        Y
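
The two results above differ in a single row. As a rough sketch (the names grouped_mask, plain_mask and differs are hypothetical), comparing the two masks directly shows which rows are affected by the group boundary, using the edited data:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "X", "Y", "X", "Y", "Y"],
})

not_x = df["nickname"].ne("X")

# Shift within each group: the last row of a group never sees the next group.
grouped_mask = df.groupby("name")["nickname"].shift(-1).ne("Y") | not_x

# Plain shift: the last row of a group sees the first row of the next group.
plain_mask = df["nickname"].shift(-1).ne("Y") | not_x

# Rows where the two approaches disagree.
differs = df[grouped_mask != plain_mask]
print(differs.index.tolist())  # [7]
```

Row 7 is the last X of group B. Its next row overall is the Y that starts group C, so only the plain shift treats it as "X followed by Y".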

Answer 1 (score: 2)

Use df[...] to filter out the unwanted rows; grouping is not needed for this:

print(df[(df['nickname']!='X') | (df['nickname'].shift(-1)!='Y')])

Output:

   name nickname
1     A        Y
2     A        X
3     A        Z
5     B        Y
7     B        Y
8     C        Y
10    C        Y
11    C        Y

Update, to also drop an "X" that is followed by "Z":

print(df[(df['nickname']!='X') | ~df['nickname'].shift(-1).isin(['Y','Z'])])
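
The isin idea above can be combined with a per-group shift. The helper below is a hypothetical sketch: drop_value_before and its parameters are invented for illustration and appear in neither answer. It assumes the question's original DataFrame:

```python
import pandas as pd


def drop_value_before(df, value, followers, group_col="name", value_col="nickname"):
    """Drop rows holding `value` whose next row in the same group is in `followers`."""
    # Next value within each group; NaN at the end of a group is never in `followers`.
    next_val = df.groupby(group_col)[value_col].shift(-1)
    drop = df[value_col].eq(value) & next_val.isin(followers)
    return df[~drop]


df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"],
    "nickname": ["X", "Y", "X", "Z", "X", "Y", "X", "Y", "Y", "X", "Y", "Y"],
})

out = drop_value_before(df, "X", ["Y", "Z"])
print(out.index.tolist())  # [1, 3, 5, 7, 8, 10, 11]
```

Compared with the required output in the question, row 2 (an X followed by Z in group A) is now dropped as well.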