I have the following DataFrame:
A B
0 1 5
1 2 3
2 3 2
3 4 0
4 5 1
How can I select rows by a condition on the values in column A?
For example, all rows where A is greater than 3 and less than 6.
Answer 0 (score: 1)
You can use boolean indexing, with the endpoints of your interval as the conditions:
df[(df.A > 3) & (df.A < 6)]
or the convenience method .between(), which translates to the above behind the scenes (so it is no faster). Note that by default both limits are included:
df[df.A.between(4, 5)] # uses inclusive limits
which gives:
A B
3 4 0
4 5 1
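As a minimal, self-contained sketch of the two approaches above (the variable names mask, by_mask and by_between are only illustrative), the following rebuilds the question's DataFrame and checks that both selections return the same rows. The two forms agree here only because column A holds integers, so 3 < A < 6 and 4 <= A <= 5 describe the same set.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [5, 3, 2, 0, 1]})

# Boolean indexing: combine two element-wise comparisons with &.
# The parentheses are required because & binds tighter than > and <.
mask = (df["A"] > 3) & (df["A"] < 6)
by_mask = df[mask]

# Series.between with its default, inclusive limits.
by_between = df[df["A"].between(4, 5)]

print(by_mask)
print(by_mask.equals(by_between))
```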
Answer 1 (score: 0)
Use between with boolean indexing (possibly with the parameter inclusive=False; pandas 1.3 and later spell this inclusive="neither"):
print (df[df.A.between(4,5)])
Sample:
df = pd.DataFrame({'A': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5,5: 6},
'B': {0: 5, 1: 3, 2: 2, 3: 0, 4: 2, 5: 1}})
print (df)
A B
0 1 5
1 2 3
2 3 2
3 4 0
4 5 2
5 6 1
print (df[df.A.between(4,5)]) #default inclusive=True
A B
3 4 0
4 5 2
print (df[df.A.between(3,6, inclusive=False)])
A B
3 4 0
4 5 2
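On recent pandas releases the boolean form of inclusive is no longer accepted, so the call above may raise an error there. A sketch of the same selection using the string values, assuming pandas 1.3 or later (open_interval and left_closed are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [5, 3, 2, 0, 2, 1]})

# Exclude both endpoints: 3 < A < 6.
open_interval = df[df["A"].between(3, 6, inclusive="neither")]

# Include only the left endpoint: 3 <= A < 6.
left_closed = df[df["A"].between(3, 6, inclusive="left")]

print(open_interval)
print(left_closed)
```

The other accepted strings are "both" (the default) and "right".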
The timings are the same:
df = pd.concat([df]*10000).reset_index(drop=True)
In [427]: %timeit (df[df.A.between(3,6, inclusive=False)])
The slowest run took 4.72 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 1.32 ms per loop
In [428]: %timeit (df[(df.A>3) & (df.A<6)])
1000 loops, best of 3: 1.31 ms per loop
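Outside IPython, a comparable measurement can be sketched with the standard timeit module. The helper names with_between and with_mask are hypothetical, and the absolute numbers will vary by machine and pandas version, so none are shown here.

```python
import timeit

import pandas as pd

small = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [5, 3, 2, 0, 2, 1]})
# Repeat the sample 10000 times, as in the timing above.
big = pd.concat([small] * 10000).reset_index(drop=True)


def with_between():
    # Default inclusive limits: 4 <= A <= 5.
    return big[big["A"].between(4, 5)]


def with_mask():
    # Equivalent for integer data: 3 < A < 6.
    return big[(big["A"] > 3) & (big["A"] < 6)]


runs = 200
t_between = timeit.timeit(with_between, number=runs)
t_mask = timeit.timeit(with_mask, number=runs)

print(f"between: {t_between / runs * 1e3:.3f} ms per call")
print(f"mask:    {t_mask / runs * 1e3:.3f} ms per call")
```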