Question

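I have a DataFrame made up of several columns, plus two columns, x and y, that are both filled with numbers from 1 to 3. I want to drop every row where the number in x is less than the number in y. For example, if a row has x = 1 and y = 3, I want to drop the whole row. Here is the code I have written so far: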

for num1 in df.x:
    for num2 in df.y:
        if (num1< num2):
            df.drop(df.iloc[num1], inplace = True)

But I keep getting the error:

labels ['new' 'active' 1 '1'] not contained in axis

Any help is greatly appreciated. Thanks!

Answer 1

I would avoid loops in your scenario and just use .drop:

df.drop(df[df['x'] < df['y']].index, inplace=True)

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'x':np.random.randint(0,4,5), 'y':np.random.randint(0,4,5)})

>>> df
   x  y
0  1  2
1  2  1
2  3  1
3  2  1
4  1  3

df.drop(df[df['x'] < df['y']].index, inplace = True)

>>> df
   x  y
1  2  1
2  3  1
3  2  1

[Edit]: Or, more simply, without using drop:

df=df[~(df['x'] < df['y'])]
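As for the error in the question: `df.iloc[num1]` returns an entire row as a Series, and `drop` treats the values in that row as index labels to remove, which is presumably why the message lists cell values such as 'new' and 'active'. Below is a minimal sketch of that behaviour. The `status` column is a hypothetical stand-in for the asker's other columns, and the exact exception type and message vary between pandas versions.

```python
import pandas as pd

# Hypothetical data: 'status' stands in for the question's other columns.
df = pd.DataFrame({
    "status": ["new", "active", "new"],
    "x": [1, 2, 3],
    "y": [3, 1, 1],
})

row = df.iloc[0]  # a Series holding the cell values of the first row
print(row.tolist())

error = None
try:
    # drop() interprets the row's values as index labels to remove,
    # and those labels do not exist in the index 0..2
    df.drop(row, inplace=True)
except (KeyError, ValueError) as exc:  # type depends on the pandas version
    error = exc

print(type(error).__name__, error)
```

Passing `df[df['x'] < df['y']].index` instead hands `drop` real index labels, which is why the one-liner above works.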

Answer 2

Writing two for loops is very inefficient; instead, you can simply compare the two columns:

df['x'] >= df['y']

This returns a boolean Series that you can use to filter the DataFrame:

df[df['x'] >= df['y']]
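To make the intermediate step visible, here is a small sketch with made-up data (the values are an assumption, not taken from the question). It also illustrates why the nested loops do not do what was intended: they compare every x with every y rather than the two values in the same row.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 2, 1], "y": [2, 1, 1, 1, 3]})

# Element-wise comparison: one True/False per row, aligned on the index
mask = df["x"] >= df["y"]
print(mask.tolist())  # [False, True, True, True, False]

# Keep only the rows where the mask is True
kept = df[mask]
print(kept)

# The nested loops from the question visit every (x, y) combination
pairs = sum(1 for num1 in df.x for num2 in df.y)
print(pairs, "comparisons for", len(df), "rows")  # 25 comparisons for 5 rows
```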

Answer 3

I think it is better to use boolean indexing or query, changing the condition to >=:

df[df['x'] >= df['y']]

Or:

df = df.query('x >= y')

Sample:

df = pd.DataFrame({'x':[1,2,3,2], 'y':[0,4,5,1]})
print (df)
   x  y
0  1  0
1  2  4
2  3  5
3  2  1

df = df[df['x'] >= df['y']]
print (df)
   x  y
0  1  0
3  2  1
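As a quick check that the approaches in these answers agree, the sketch below runs drop-by-index, boolean indexing, and `query` on the same sample. It also shows one difference worth knowing, using a separate made-up frame with a missing value: `~(x < y)` keeps a row containing NaN (the comparison is False, so its negation is True), whereas `x >= y` drops it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 2], "y": [0, 4, 5, 1]})

# Three ways to keep the rows where x is not less than y
by_drop = df.drop(df[df["x"] < df["y"]].index)  # returns a new frame
by_mask = df[df["x"] >= df["y"]]
by_query = df.query("x >= y")

print(by_drop.equals(by_mask) and by_mask.equals(by_query))  # True

# Optional: renumber the surviving rows from 0
print(by_mask.reset_index(drop=True))

# With a missing value, the two ways of writing the condition differ
with_nan = pd.DataFrame({"x": [1.0, np.nan], "y": [0.0, 1.0]})
kept_negated = with_nan[~(with_nan["x"] < with_nan["y"])]  # NaN row kept
kept_direct = with_nan[with_nan["x"] >= with_nan["y"]]     # NaN row dropped
print(len(kept_negated), len(kept_direct))  # 2 1
```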

Pandas: trying to drop rows based on a for loop?

3 Answers: