How to iterate through Pandas rows and modify every cell based on its rank in the row?

Time: 2018-07-24 10:09:53

Tags: python pandas

I would like to iterate through the rows of a dataframe and set each cell to True or False based on the cell's rank within its row.

import pandas as pd
inp = [{'c1':10, 'c2':100, 'c3':50}, {'c1':11,'c2':110, 'c3':500}, {'c1':12,'c2':120, 'c3':5}]
df = pd.DataFrame(inp)
print(df)
   c1   c2   c3
0  10  100   50
1  11  110  500
2  12  120    5

I can iterate over the rows and rank each row as a Pandas Series:

for index, row in df.iterrows():
    print(row.rank(ascending=True))

c1    1.0
c2    3.0
c3    2.0
Name: 0, dtype: float64
c1    1.0
c2    2.0
c3    3.0
Name: 1, dtype: float64
c1    2.0
c2    3.0
c3    1.0
Name: 2, dtype: float64

But I can't figure out how to set a cell to True when its rank is higher than 2, and to False when its rank is lower than or equal to 2, so that the final result would be something like this:

print(res)
      c1     c2     c3
0  False   True  False
1  False  False   True
2  False   True  False

How can I achieve that?

1 answer:

Answer 0 (score: 1)

I think you need rank with DataFrame.gt for the > comparison (by default, rank works down each column):

# Use a new name so the original df stays available for the steps below
mask = df.rank(ascending=True).gt(2)
print(mask)
      c1     c2     c3
0  False  False  False
1  False  False   True
2   True   True  False

Detail:

print(df.rank(ascending=True))
    c1   c2   c3
0  1.0  1.0  2.0
1  2.0  2.0  3.0
2  3.0  3.0  1.0

EDIT:

To rank within each row instead, add axis=1:

print(df.rank(ascending=True, axis=1))
    c1   c2   c3
0  1.0  3.0  2.0
1  1.0  2.0  3.0
2  2.0  3.0  1.0

df1 = df.rank(ascending=True, axis=1).gt(2)
print(df1)
      c1     c2     c3
0  False   True  False
1  False  False   True
2  False   True  False
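
For reference, here is a minimal self-contained sketch that puts the row-wise approach together and compares it with an explicit iterrows loop. The names res and res_loop are only illustrative, and the loop version is included for comparison, not as a recommendation; the vectorized call is usually the simpler choice.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame([
    {'c1': 10, 'c2': 100, 'c3': 50},
    {'c1': 11, 'c2': 110, 'c3': 500},
    {'c1': 12, 'c2': 120, 'c3': 5},
])

# Vectorized: rank within each row (axis=1), then flag ranks greater than 2
res = df.rank(ascending=True, axis=1).gt(2)

# Loop equivalent: rank each row Series, compare with 2,
# and rebuild a DataFrame from the resulting boolean Series
res_loop = pd.DataFrame(
    [row.rank(ascending=True) > 2 for _, row in df.iterrows()]
)

print(res)
print(res_loop)
```

Note that rank uses method='average' by default, so tied values within a row share a fractional rank (for example 2.5), which affects the comparison with 2 when the data contains duplicates.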