Question

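I want to filter a numpy array by multiple conditions. I found this thread and tested the slicing method on my dataset, but I get unexpected results. Well, they are unexpected to me at least, as I may simply be having trouble understanding how bitwise operators work or something :/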

Just so you get an idea of the data:

test.shape
>>(222988, 2)

stats.describe(all_output[:, 0])
>>DescribeResult(nobs=222988, minmax=(2.594e-05, 74.821), mean=11.106, variance=108.246, [...])

stats.describe(all_output[:, 1])
>>DescribeResult(nobs=222988, minmax=(0.001, 8.999), mean=3.484, variance=7.606, [...])

Now, doing some basic filtering:

test1 = test[(test[:, 0] >= 30) & (test[:, 1] <= 2)] 

test1.shape
>>(337, 2)

These are actually the rows I do not want in my dataset, so if I do what I think is the opposite...

test2 = test[(test[:, 0] <= 30) & (test[:, 1] >= 2)] 

test2.shape
>>(112349, 2)

I would expect the result to be (222651, 2). I guess I am doing something embarrassingly simple wrong? Can anyone here push me in the right direction?

Thanks! -M

Answer 1

De Morgan's law: not (p and q) == (not p) *or* (not q). In any case, the not operator in numpy is ~, so

 ~((test[:, 0] >= 30) & (test[:, 1] <= 2)) == ((test[:, 0] < 30) | (test[:, 1] > 2))

Either form does what you want, e.g.

test1 = test[~((test[:, 0] >= 30) & (test[:, 1] <= 2))]
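To see this concretely, here is a minimal sketch on made-up data. The original `test` array is not available, so the random array below is only a hypothetical stand-in with roughly the same value ranges, and the names `unwanted`, `kept_not`, `kept_or`, and `too_few` are introduced just for illustration. It checks that negating the whole mask with `~` and the De Morgan form with `|` select the same rows, and that the attempt from the question (flipped comparisons, still joined with `&`) keeps fewer rows.

```python
import numpy as np

# Hypothetical stand-in for the asker's data (assumption: two float columns).
rng = np.random.default_rng(0)
test = np.column_stack([
    rng.uniform(0.0, 75.0, size=1000),  # column 0
    rng.uniform(0.0, 9.0, size=1000),   # column 1
])

# Mask of the rows that should be dropped.
unwanted = (test[:, 0] >= 30) & (test[:, 1] <= 2)
removed = test[unwanted]

# Two equivalent ways to keep everything else.
kept_not = test[~unwanted]                             # negate the whole mask
kept_or = test[(test[:, 0] < 30) | (test[:, 1] > 2)]   # De Morgan: flip each test, & becomes |

# Both forms select exactly the same rows, and nothing is lost or duplicated.
assert np.array_equal(kept_not, kept_or)
assert len(removed) + len(kept_not) == len(test)

# Flipping the comparisons but keeping & also drops rows where
# only one of the two original conditions held.
too_few = test[(test[:, 0] <= 30) & (test[:, 1] >= 2)]
assert len(too_few) < len(kept_not)
```

Building the mask once and negating it with `~` is probably the less error-prone option, since the conditions are written only one time.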
