Different pandas behavior when filtering a DataFrame by index versus by column

Date: 2017-12-21 07:57:47

Tags: pandas

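I need to filter a pandas DataFrame with the where function, using a condition computed from either a reference column or a reference index label (row).

With a condition taken from a column, the filter works. The same approach with a condition taken from an index label (row) fails: every value comes back as NaN.

The question: is this the expected behavior? If so, how can the filter be applied along the index (row)?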

import numpy as np
import pandas as pd

# Build a 4x4 frame of random integers: columns w..z, rows a..d.
cols = 4
rows = 4
mydict = {}
for i in range(cols):
    mydict[chr(ord('w') + i)] = np.random.randint(0, 100, rows)
df = pd.DataFrame(mydict, index=[chr(97 + x) for x in range(rows)])
print(df)

# Condition taken from a column: the mask is labeled by the row index (a..d).
print("Filter all data if the column:w has even data ... WORKING")
print(df.loc[:, 'w'] % 2 == 0)
print(df.where(lambda x: x.loc[:, 'w'] % 2 == 0))

# Condition taken from a row: the mask is labeled by the column names (w..z).
print("Filter all data if the index:a has even data ... NOT WORKING")
print(df.loc['a', :] % 2 == 0)
print(df.where(lambda x: x.loc['a', :] % 2 == 0, axis=1))
print(df.where(lambda x: x.loc['a', :] % 2 == 0, axis=0))
pd.__version__

Output:

    w   x   y   z
a  42  98  74  51
b  69  82  70  40
c  93   7  78  45
d  22  61  70   4
Filter all data if the column:w has even data ... WORKING
a     True
b    False
c    False
d     True
Name: w, dtype: bool
      w     x     y     z
a  42.0  98.0  74.0  51.0
b   NaN   NaN   NaN   NaN
c   NaN   NaN   NaN   NaN
d  22.0  61.0  70.0   4.0
Filter all data if the index:a has even data ... NOT WORKING
w     True
x     True
y     True
z    False
Name: a, dtype: bool
    w   x   y   z
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
d NaN NaN NaN NaN
    w   x   y   z
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
d NaN NaN NaN NaN

'0.21.1'

Reference:

https://stackoverflow.com/a/44736467/3598703

1 Answer:

Answer 0 (score: 0)

This may be a bug. Transposing twice should have much the same effect as passing an axis. A workaround is:

df.T.where(df.loc['a', :] % 2 == 0).T
# This should be equivalent to passing `axis=1`; the difference is probably a bug, I guess.

   w     x     y     z
a NaN  80.0  18.0  14.0
b NaN  98.0  12.0  26.0
c NaN  22.0  51.0  81.0
d NaN  57.0  99.0  23.0
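As a hedged alternative to the double transpose, the row condition can be expanded into a full boolean array with the same shape as the frame before it is passed to `where`, so that no label alignment is involved. The sketch below uses the values printed in the question rather than the random frame from this answer; `row_mask`, `via_transpose`, `full_mask`, and `via_mask` are illustrative names. One plausible reading of the all-NaN result, stated here as an assumption rather than confirmed behavior, is that a Series condition is matched against the frame's row labels, and `axis` does not appear to change that.

```python
import numpy as np
import pandas as pd

# Values copied from the frame printed in the question.
df = pd.DataFrame(
    {
        "w": [42, 69, 93, 22],
        "x": [98, 82, 7, 61],
        "y": [74, 70, 78, 70],
        "z": [51, 40, 45, 4],
    },
    index=list("abcd"),
)

# True for the columns whose value in row a is even.
row_mask = df.loc["a"] % 2 == 0

# Workaround from the answer: transpose, filter, transpose back.
via_transpose = df.T.where(row_mask).T

# Alternative: repeat the row mask once per row to get a mask of df's shape.
full_mask = np.tile(row_mask.to_numpy(), (len(df), 1))
via_mask = df.where(full_mask)

print(via_transpose)
print(via_mask)
```

If the goal is to drop the non-matching columns instead of blanking them, `df.loc[:, row_mask]` selects only the columns where the mask is True.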