Question

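The question is: why do the values of top_2_should differ from those of top_2_is? In other words, why is the result wrong when the output of apply is assigned directly to a column?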

Edit: I think the question was somewhat misunderstood, so I created another example for it.

Edit 2: I am using Python 2.7.12 :: Anaconda 4.0.0 (64-bit) :: pandas 0.18.0

import pandas as pd

d = {'one' : [1., 2., 3., 4.],
     'two' : [4., 3., 2., 1.]}
df52 = pd.DataFrame(d)

top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_2_should'] = top_2_should
print(df52)

   one  two  top_1_is  top_1_should  top_2_is  top_2_should
0  1.0  4.0       1.0           1.0       1.0           4.0
1  2.0  3.0       2.0           2.0       2.0           3.0
2  3.0  2.0       2.0           2.0       2.0           3.0
3  4.0  1.0       1.0           1.0       1.0           4.0

Best, Yang

Answer 1

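I think you can combine Series.sort_values with .values. The .values attribute returns a plain NumPy array, so the sorted values are no longer aligned back to the original column labels of each row: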

print (df52.apply(lambda row: row.sort_values().values, axis=1))
   one  two
0  1.0  4.0
1  2.0  3.0
2  2.0  3.0
3  1.0  4.0

Or:

import numpy as np

print (pd.DataFrame(np.sort(df52.values), df52.index, df52.columns))
   one  two
0  1.0  4.0
1  2.0  3.0
2  2.0  3.0
3  1.0  4.0
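
Building on the np.sort variant, here is a minimal sketch of sorting each row once and only then assigning the result columns. The column names top_1 and top_2 and the variable sorted_rows are made up for illustration; the data is the example frame from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'one': [1., 2., 3., 4.],
                   'two': [4., 3., 2., 1.]})

# Sort each row once, while only the original columns are involved.
# np.sort works along the last axis by default; axis=1 makes that explicit.
sorted_rows = np.sort(df[['one', 'two']].values, axis=1)

# Assign afterwards. Nothing is sorted again, so the new columns
# cannot end up inside the rows that are being ranked.
df['top_1'] = sorted_rows[:, 0]  # smallest value per row: 1, 2, 2, 1
df['top_2'] = sorted_rows[:, 1]  # second-smallest value per row: 4, 3, 3, 4
print(df)
```

Sorting once also avoids calling a Python lambda for every row, which may matter on larger frames.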

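If you print each row inside apply, you can see the sorted output. Because new columns were added to the DataFrame beforehand, every row passed to apply also contains those columns, so the position of the value you want in the sorted Series shifts: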

top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[1], 1)
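# Print every sorted row; the columns added above are now part of each row.
# (On Python 2, "from __future__ import print_function" is needed to call print in a lambda.)
df52.apply(lambda row: print(row.sort_values()), 1)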
one             1.0
top_1_is        1.0
top_1_should    1.0
top_2_is        1.0
two             4.0
Name: 0, dtype: float64
one             2.0
top_1_is        2.0
top_1_should    2.0
top_2_is        2.0
two             3.0
Name: 1, dtype: float64
two             2.0
top_1_is        2.0
top_1_should    2.0
top_2_is        2.0
one             3.0
Name: 2, dtype: float64
two             1.0
top_1_is        1.0
top_1_should    1.0
top_2_is        1.0
one             4.0
Name: 3, dtype: float64
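
One possible way to avoid the shift shown in this output is to run apply on the original columns only, so that columns added later never enter the sort. The following is a sketch under that assumption; value_cols and nth_smallest are hypothetical names, and .iloc is used so that the lookup is explicitly positional.

```python
import pandas as pd

df = pd.DataFrame({'one': [1., 2., 3., 4.],
                   'two': [4., 3., 2., 1.]})

# Hypothetical name: the columns that should take part in the ranking.
value_cols = ['one', 'two']

def nth_smallest(frame, n):
    """Return the n-th smallest value (0-based) of each row of `frame`."""
    return frame.apply(lambda row: row.sort_values().iloc[n], axis=1)

df['top_1'] = nth_smallest(df[value_cols], 0)
# df has one more column now, but df[value_cols] still holds only
# 'one' and 'two', so position 1 is still the second-smallest original value.
df['top_2'] = nth_smallest(df[value_cols], 1)
print(df)
```

With the column selection in place, the order in which the result columns are assigned no longer matters.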

Answer 2

import pandas as pd

d = {'one' : [1., 2., 3., 4.],
     'two' : [2., 3., 4., 5.]}
df52 = pd.DataFrame(d)

top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[3], 1)
df52['top_2_should'] = top_2_should
print(df52)

Returns:

  one  two  top_1_is  top_1_should  top_2_is  top_2_should
0    1    2         1             1         2             2
1    2    3         2             2         3             3
2    3    4         3             3         4             4
3    4    5         4             4         5             5
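
The answer does not say why position 3 gives the expected value. A plausible reading is that, by the time top_2_is is computed, each row already contains two extra copies of its minimum (top_1_is and top_1_should), which push the second-smallest original value from position 1 to position 3. The sketch below rebuilds row 0 of this example by hand to illustrate that reading; the variable names are made up.

```python
import pandas as pd

# Row 0 as apply sees it when 'top_2_is' is computed: the two original
# values plus two copies of the row minimum added by the earlier assignments.
row = pd.Series({'one': 1., 'two': 2., 'top_1_is': 1., 'top_1_should': 1.})

ordered = row.sort_values()
print(ordered.tolist())            # values in ascending order: [1.0, 1.0, 1.0, 2.0]

at_position_1 = ordered.iloc[1]    # 1.0, one of the copies of the minimum
at_position_3 = ordered.iloc[3]    # 2.0, the second-smallest original value
print(at_position_1, at_position_3)
```

Note that the index 3 only works for this exact sequence of assignments; adding or removing a column before the call would move the value again.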

df[column] = apply(lambda row: row.sort_values()[1]) behaves strangely
