Sorting a pandas DataFrame with a function over column values

Time: 2016-07-29 15:48:44

Tags: python sorting pandas dataframe

Based on "python, sort descending dataframe with pandas":

Given:

from pandas import DataFrame
import pandas as pd

d = {'one':[2,3,1,4,5],
     'two':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = DataFrame(d)

df then looks like this:

df:
      letter  one  two
    0      a    2    5
    1      a    3    4
    2      b    1    3
    3      b    4    2
    4      c    5    1

I would like to have something like:

f = lambda x,y: x**2 + y**2
test = df.sort(f('one', 'two'))

This should sort the complete DataFrame by the sum of the squared values of columns 'one' and 'two', and give me:

test:
      letter  one  two
    2      b    1    3
    3      b    4    2
    1      a    3    4
    4      c    5    1
    0      a    2    5

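Ascending or descending order does not matter. Is there a nice and simple way to do that? I have not found a solution yet.

5 answers:

Answer 0 (score: 17)

You can create a temporary column to use in the sort and then drop it: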

df.assign(f = df['one']**2 + df['two']**2).sort_values('f').drop('f', axis=1)
Out: 
  letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5
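For reference, here is the same temporary-column idea as a self-contained sketch using the data from the question. The column name `_key` is only an illustrative choice; any name that does not collide with a real column would do.

```python
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# '_key' is a hypothetical temporary column name; pick one that cannot
# clash with an existing column.
result = (
    df.assign(_key=df['one']**2 + df['two']**2)  # add the sort key on a copy
      .sort_values('_key')                       # sort rows by the key
      .drop(columns='_key')                      # remove the helper column
)

print(result.index.tolist())  # [2, 3, 1, 4, 0]
print(list(df.columns))       # ['one', 'two', 'letter'], df is unchanged
```

Because `assign` returns a new DataFrame, the original `df` is left untouched.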

Answer 1 (score: 4)

This follows "How to sort pandas dataframe by custom order on string index".
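One way to apply a custom order here is to compute the key as a Series, sort that Series, and then select the rows of the DataFrame by the sorted index labels. The following is a minimal sketch under the assumption that the index labels are unique (with duplicate labels, `.loc` would return repeated rows); it uses the column names from the question.

```python
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

# The key is a Series that shares df's index.
key = df['one']**2 + df['two']**2

# Sorting the key reorders its index labels; .loc then selects rows by label.
# Assumes the index labels are unique.
result = df.loc[key.sort_values().index]

print(result.index.tolist())  # [2, 3, 1, 4, 0]
```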

Answer 2 (score: 1)

Have you tried creating a new column and then sorting on it? I cannot comment on the original post, so I am just posting my solution.

# 'c' is a new helper column; 'one' and 'two' are the columns from the question
df['c'] = df['one']**2 + df['two']**2
df = df.sort_values('c')
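The new-column approach can also be wrapped in a small helper so that the call looks closer to the `df.sort(f('one', 'two'))` form asked for in the question. This is only a sketch: `sort_by_function` is a hypothetical name, not part of pandas, and it assumes unique index labels.

```python
import pandas as pd

def sort_by_function(df, func, *columns, ascending=True):
    """Return df sorted by func applied to the named columns.

    Hypothetical helper, not a pandas API. Assumes unique index labels.
    """
    # Pass each named column to func as a Series, so func works vectorized.
    key = func(*(df[c] for c in columns))
    # Sort the key and select the rows in that label order.
    return df.loc[key.sort_values(ascending=ascending).index]

df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

f = lambda x, y: x**2 + y**2
test = sort_by_function(df, f, 'one', 'two')
print(test.index.tolist())  # [2, 3, 1, 4, 0]
```

Unlike assigning `df['c']`, this leaves no helper column behind and does not modify `df`.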

Answer 3 (score: 1)

import pandas as pd

d = {'one':[2,3,1,4,5],
     'two':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = pd.DataFrame(d)

#f = lambda x,y: x**2 + y**2
# Compute the sum of squares row by row.
# df.ix was removed in pandas 1.0; df.loc with column labels replaces it.
array = []
for i in range(len(df)):
    array.append(df.loc[i, 'one']**2 + df.loc[i, 'two']**2)
array = pd.DataFrame(array, columns = ['Sum of Squares'])
test = pd.concat([df,array],axis = 1, join = 'inner')
# sort_index(by=...) is no longer available; sort_values does the same job.
test = test.sort_values("Sum of Squares", ascending = True).drop('Sum of Squares',axis =1)

Just realized that you wanted this:

    letter  one  two
2      b    1    3
3      b    4    2
1      a    3    4
4      c    5    1
0      a    2    5

Answer 4 (score: 0)

Another approach, similar to this one, is to use argsort, which directly returns the index permutation:

f = lambda r: r.x**2 + r.y**2
df.iloc[df.apply(f, axis=1).argsort()]

I think argsort conveys the idea better than a regular sort (we do not care about the computed values, only about the resulting indices).
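To make the "index permutation" concrete, the sketch below computes the key in a vectorized way and prints both the key values and the positions that `numpy.argsort` returns. The non-default index (10 to 14) is an assumption added here only to show that argsort yields positions, which is why `iloc` rather than `loc` is used; the column names are those from the question.

```python
import numpy as np
import pandas as pd

# The index 10..14 is illustrative, to distinguish positions from labels.
df = pd.DataFrame({'one': [2, 3, 1, 4, 5],
                   'two': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']},
                  index=[10, 11, 12, 13, 14])

# Vectorized key: no row-wise apply needed for simple arithmetic.
key = (df['one']**2 + df['two']**2).to_numpy()

# argsort returns the positions that would sort the key.
positions = np.argsort(key, kind='stable')

print(key.tolist())        # [29, 25, 10, 20, 26]
print(positions.tolist())  # [2, 3, 1, 4, 0]

# Positions, not labels, so select with iloc.
result = df.iloc[positions]
print(result.index.tolist())  # [12, 13, 11, 14, 10]
```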

It could also be interesting to patch DataFrame to add this functionality:

def apply_sort(self, *, key):
    return self.iloc[self.apply(key, axis=1).argsort()]

pd.DataFrame.apply_sort = apply_sort

Then we can simply write:

>>> df.apply_sort(key=f)

   x  y letter
2  1  3      b
3  4  2      b
1  3  4      a
4  5  1      c
0  2  5      a
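If modifying `pd.DataFrame` globally is undesirable, a similar call style can be had with `DataFrame.pipe`, which passes the frame to an ordinary function. This is a sketch of that alternative, not the approach of the answer itself; it assumes a frame with columns `x` and `y` as in the output above, and it converts the key to a NumPy array before calling `argsort`.

```python
import pandas as pd

def apply_sort(frame, key):
    """Sort rows by key(row); a plain function instead of a patched method."""
    # Row-wise apply yields one key value per row; argsort gives positions.
    order = frame.apply(key, axis=1).to_numpy().argsort()
    return frame.iloc[order]

# Columns x and y are assumed, matching the output shown in this answer.
df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})

f = lambda r: r.x**2 + r.y**2

# pipe calls apply_sort(df, key=f) without touching the DataFrame class.
result = df.pipe(apply_sort, key=f)
print(result.index.tolist())  # [2, 3, 1, 4, 0]
```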