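The question is: why do the values of top_2_should differ from those of top_2_is? In other words, why does the result come out wrong when the result of the apply function is assigned to a column?
Edit: I think the question was somewhat misunderstood, so I created another example for it. EDIT2: I am using Python 2.7.12 :: Anaconda 4.0.0 (64-bit) :: Pandas 0.18.0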
import pandas as pd
d = {'one' : [1., 2., 3., 4.],
'two' : [4., 3., 2., 1.]}
df52 = pd.DataFrame(d)
top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_2_should'] = top_2_should
print df52
   one  two  top_1_is  top_1_should  top_2_is  top_2_should
0  1.0  4.0       1.0           1.0       1.0           4.0
1  2.0  3.0       2.0           2.0       2.0           3.0
2  3.0  2.0       2.0           2.0       2.0           3.0
3  4.0  1.0       1.0           1.0       1.0           4.0
Best, 扬
Answer 0 (score: 1)
I think you can use Series.sort_values together with values to break the alignment of the rows:
print (df52.apply(lambda row: row.sort_values().values, axis=1))
   one  two
0  1.0  4.0
1  2.0  3.0
2  2.0  3.0
3  1.0  4.0
Or:
import numpy as np
print (pd.DataFrame(np.sort(df52.values), df52.index, df52.columns))
   one  two
0  1.0  4.0
1  2.0  3.0
2  2.0  3.0
3  1.0  4.0
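Building on the np.sort variant, here is a minimal sketch of how the sorted values could be computed once from the original columns and then assigned, so that columns added later cannot affect the result. The list cols and the column names top_1 and top_2 are illustrative assumptions, not names from the question.

```python
import numpy as np
import pandas as pd

df52 = pd.DataFrame({'one': [1., 2., 3., 4.],
                     'two': [4., 3., 2., 1.]})

# Fix the set of input columns before any new column is added (assumed list).
cols = ['one', 'two']

# Sort the values within each row once; axis=1 sorts along the columns.
sorted_vals = np.sort(df52[cols].values, axis=1)

# Pick values by position in the sorted array (hypothetical column names).
df52['top_1'] = sorted_vals[:, 0]
df52['top_2'] = sorted_vals[:, 1]
print(df52)
```

Because sorted_vals is built before the assignments, the order in which the new columns are added no longer matters.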
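If you use print inside apply, you can see the sorted output for each row. If new columns were added beforehand, each row Series also contains those new DataFrame columns, so the position of the value you select changes: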
top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_2_is'] = df52.apply(lambda row: print(row.sort_values()), 1)
one 1.0
top_1_is 1.0
top_1_should 1.0
top_2_is 1.0
two 4.0
Name: 0, dtype: float64
one 2.0
top_1_is 2.0
top_1_should 2.0
top_2_is 2.0
two 3.0
Name: 1, dtype: float64
two 2.0
top_1_is 2.0
top_1_should 2.0
top_2_is 2.0
one 3.0
Name: 2, dtype: float64
two 1.0
top_1_is 1.0
top_1_should 1.0
top_2_is 1.0
one 4.0
Name: 3, dtype: float64
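The printed rows show that the columns added earlier take part in the sort. One possible way around this, sketched here under the assumption that only the columns one and two should be ranked, is to restrict apply to those columns and to make the positional lookup explicit with .iloc. The list cols is a hypothetical helper, not part of the original code.

```python
import pandas as pd

df52 = pd.DataFrame({'one': [1., 2., 3., 4.],
                     'two': [4., 3., 2., 1.]})

# Assumption: only these columns should be ranked.
cols = ['one', 'two']

# Restricting to cols keeps previously added result columns out of each row;
# .iloc selects by position in the sorted row instead of relying on [] fallback.
df52['top_1_is'] = df52[cols].apply(lambda row: row.sort_values().iloc[0], axis=1)
df52['top_2_is'] = df52[cols].apply(lambda row: row.sort_values().iloc[1], axis=1)
print(df52)
```

With this variant, top_2_is no longer depends on whether top_1_is was assigned first.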
Answer 1 (score: 0)
import pandas as pd
d = {'one' : [1., 2., 3., 4.],
'two' : [2., 3., 4., 5.]}
df52 = pd.DataFrame(d)
top_1_should = df52.apply(lambda row: row.sort_values()[0], 1)
top_2_should = df52.apply(lambda row: row.sort_values()[1], 1)
df52['top_1_is'] = df52.apply(lambda row: row.sort_values()[0], 1)
df52['top_1_should'] = top_1_should
df52['top_2_is'] = df52.apply(lambda row: row.sort_values()[3], 1)
df52['top_2_should'] = top_2_should
print(df52)
Returns:
   one  two  top_1_is  top_1_should  top_2_is  top_2_should
0    1    2         1             1         2             2
1    2    3         2             2         3             3
2    3    4         3             3         4             4
3    4    5         4             4         5             5