I have a pandas DataFrame like this:
x y z
35.013930 048.775597 0.22
42.015619 368.803652 0.00
03.017302 349.831709 1.20
05.018978 378.859767 2.20
07.020646 300.887827 0.05
23.022307 044.915887 0.23
. . .
. . .
. . .
It has about 40,000 rows.
I need to delete the rows whose (x, y) values are not within the ranges y: (44, 350.5) and x: (4.5, 35.8).
So the output would look like this:
x y z
35.013930 048.775597 0.22
07.020646 300.887827 0.05
23.022307 044.915887 0.23
. . .
. . .
I think using np.where(np.logical_and()) with the x and y columns might be a solution, but I don't know how to do it. Does anyone know a solution?
Answer 0 (score: 1)
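You can use loc or query. The first two examples use the conditions that reproduce the expected output; the last two negate those conditions and return the rows outside the ranges: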
print(df)
# x y z
#0 35.013930 48.775597 0.22
#1 42.015619 368.803652 0.00
#2 3.017302 349.831709 1.20
#3 5.018978 378.859767 2.20
#4 7.020646 300.887827 0.05
#5 23.022307 44.915887 0.23
print(df.loc[(df.y > 44) & (df.y < 350.5) & (df.x > 4.5) & (df.x < 35.8)])
# x y z
#0 35.013930 48.775597 0.22
#4 7.020646 300.887827 0.05
#5 23.022307 44.915887 0.23
print(df.query('y > 44 and y < 350.5 and x > 4.5 and x < 35.8'))
# x y z
#0 35.013930 48.775597 0.22
#4 7.020646 300.887827 0.05
#5 23.022307 44.915887 0.23
print(df.loc[~((df.y > 44) & (df.y < 350.5) & (df.x > 4.5) & (df.x < 35.8))])
# x y z
#1 42.015619 368.803652 0.0
#2 3.017302 349.831709 1.2
#3 5.018978 378.859767 2.2
print(df.query('not (y > 44 and y < 350.5 and x > 4.5 and x < 35.8)'))
# x y z
#1 42.015619 368.803652 0.0
#2 3.017302 349.831709 1.2
#3 5.018978 378.859767 2.2
If you also want the index of the result renumbered from 0, append reset_index(drop=True):
print(df)
# x y z
#0 35.013930 48.775597 0.22
#1 42.015619 368.803652 0.00
#2 3.017302 349.831709 1.20
#3 5.018978 378.859767 2.20
#4 7.020646 300.887827 0.05
#5 23.022307 44.915887 0.23
print(df.loc[(df.y > 44) & (df.y < 350.5) & (df.x > 4.5) & (df.x < 35.8)].reset_index(drop=True))
# x y z
#0 35.013930 48.775597 0.22
#1 7.020646 300.887827 0.05
#2 23.022307 44.915887 0.23
print(df.query('y > 44 and y < 350.5 and x > 4.5 and x < 35.8').reset_index(drop=True))
# x y z
#0 35.013930 48.775597 0.22
#1 7.020646 300.887827 0.05
#2 23.022307 44.915887 0.23
print(df.loc[~((df.y > 44) & (df.y < 350.5) & (df.x > 4.5) & (df.x < 35.8))].reset_index(drop=True))
# x y z
#0 42.015619 368.803652 0.0
#1 3.017302 349.831709 1.2
#2 5.018978 378.859767 2.2
print(df.query('not (y > 44 and y < 350.5 and x > 4.5 and x < 35.8)').reset_index(drop=True))
# x y z
#0 42.015619 368.803652 0.0
#1 3.017302 349.831709 1.2
#2 5.018978 378.859767 2.2
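Since the question mentions np.logical_and, here is a minimal self-contained sketch of the same filter built as a NumPy boolean mask. The helper name filter_xy and its x_range / y_range parameters are made up for illustration and are not part of pandas; the sample rows are the six shown in the question, and the bounds are treated as exclusive, as in the loc and query examples above.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (the real data has ~40,000 rows).
df = pd.DataFrame({
    "x": [35.013930, 42.015619, 3.017302, 5.018978, 7.020646, 23.022307],
    "y": [48.775597, 368.803652, 349.831709, 378.859767, 300.887827, 44.915887],
    "z": [0.22, 0.00, 1.20, 2.20, 0.05, 0.23],
})


def filter_xy(frame, x_range=(4.5, 35.8), y_range=(44, 350.5)):
    """Hypothetical helper: keep rows whose x and y lie strictly inside the ranges."""
    x = frame["x"].to_numpy()
    y = frame["y"].to_numpy()
    # np.logical_and takes only two arrays; .reduce combines all four conditions.
    mask = np.logical_and.reduce([
        x > x_range[0],
        x < x_range[1],
        y > y_range[0],
        y < y_range[1],
    ])
    # Boolean-mask indexing keeps the matching rows; reset_index renumbers from 0.
    return frame[mask].reset_index(drop=True)


kept = filter_xy(df)
print(kept)
```

The mask is equivalent to the chained & conditions passed to loc; np.where is not needed, because a boolean array can index the DataFrame directly.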