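I need to filter a pandas DataFrame with the where function, using a condition taken from either a reference column or a reference index (row).
A condition based on a column works, but the same approach with an index (row) fails.
The question: is this the expected behavior? If so, how do I apply the filter based on an index (row)?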
import numpy as np
import pandas as pd

# Build a 4x4 DataFrame of random integers: columns w..z, index a..d
cols = 4
rows = 4
mydict = {}
for i in range(cols):
    mydict[chr(ord('w') + i)] = np.random.randint(0, 100, rows)
df = pd.DataFrame(mydict, index=[chr(97 + x) for x in range(rows)])
print(df)

print("Filter all data if the column:w has even data ... WORKING")
print(df.loc[:, 'w'] % 2 == 0)
print(df.where(lambda x: x.loc[:, 'w'] % 2 == 0))

print("Filter all data if the index:a has even data ... NOT WORKING")
print(df.loc['a', :] % 2 == 0)
print(df.where(lambda x: x.loc['a', :] % 2 == 0, axis=1))
print(df.where(lambda x: x.loc['a', :] % 2 == 0, axis=0))
pd.__version__
Result:
    w   x   y   z
a  42  98  74  51
b  69  82  70  40
c  93   7  78  45
d  22  61  70   4
Filter all data if the column:w has even data ... WORKING
a     True
b    False
c    False
d     True
Name: w, dtype: bool
      w     x     y     z
a  42.0  98.0  74.0  51.0
b   NaN   NaN   NaN   NaN
c   NaN   NaN   NaN   NaN
d  22.0  61.0  70.0   4.0
Filter all data if the index:a has even data ... NOT WORKING
w     True
x     True
y     True
z    False
Name: a, dtype: bool
     w    x    y    z
a  NaN  NaN  NaN  NaN
b  NaN  NaN  NaN  NaN
c  NaN  NaN  NaN  NaN
d  NaN  NaN  NaN  NaN
     w    x    y    z
a  NaN  NaN  NaN  NaN
b  NaN  NaN  NaN  NaN
c  NaN  NaN  NaN  NaN
d  NaN  NaN  NaN  NaN
'0.21.1'
Answer 0 (score: 0)
This may be a bug. A double transpose should behave much like passing the axis. A workaround is:
df.T.where(df.loc['a', :] % 2 == 0).T
# This should be the same as passing axis=1. I guess it is probably a bug.
    w     x     y     z
a NaN  80.0  18.0  14.0
b NaN  98.0  12.0  26.0
c NaN  22.0  51.0  81.0
d NaN  57.0  99.0  23.0
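A possible explanation for the all-NaN result, offered as a sketch rather than a definitive diagnosis: when the condition passed to where is a Series, pandas appears to align it with the DataFrame's row index (a to d), so a Series labelled w to z matches no rows. The axis argument does not seem to change how the condition is aligned. The snippet below uses fixed data copied from the question's printed DataFrame (an assumption made for reproducibility) and compares the double transpose with two alternatives. The names col_mask, full_mask, and selected are illustrative only.

```python
import numpy as np
import pandas as pd

# Fixed data copied from the question's printed DataFrame (assumption:
# used instead of random values so the run is reproducible).
df = pd.DataFrame(
    {
        "w": [42, 69, 93, 22],
        "x": [98, 82, 7, 61],
        "y": [74, 70, 78, 70],
        "z": [51, 40, 45, 4],
    },
    index=list("abcd"),
)

# Boolean Series labelled by the column names w, x, y, z.
col_mask = df.loc["a"] % 2 == 0

# Passing the Series directly: its labels (w..z) are matched against the
# row labels (a..d), nothing lines up, and every cell is masked.
misaligned = df.where(col_mask)

# Option 1: the double transpose from the answer above.
via_transpose = df.T.where(col_mask).T

# Option 2: expand the mask to the full shape of df, so no label
# alignment is involved at all.
full_mask = np.tile(col_mask.to_numpy(), (len(df), 1))
via_full_mask = df.where(full_mask)

# Option 3: drop the non-matching columns instead of masking them.
selected = df.loc[:, col_mask]

print(misaligned)
print(via_full_mask)
print(selected)
```

Option 2 keeps the original shape and fills the rejected column with NaN, which matches what the double transpose produces. Option 3 is shorter when the rejected columns are not needed at all.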