Question

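I have a pandas DataFrame whose columns are user names and whose rows are restaurant names. The values are the ratings each user gave. I then sort it by the mean rating. For example: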

import pandas as pd

ratings = pd.DataFrame(data=[[1, 4], [5, 8], [7, 9], [3, 4], [8, 8], [6, 7], [5, 2], [4, 9]],
                        index=['rest1', 'rest2', 'rest3', 'rest4', 'rest5', 'rest6', 'rest7', 'rest8'],
                        columns=['user1', 'user2'])

# Add the row-wise mean, then sort by it in descending order
ratings['mean'] = ratings.mean(axis=1)
ratings_sorted = ratings.sort_values(by='mean', ascending=False)

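Now, in case of a tie, I would like the restaurant with the higher minimum rating across the two users to rank higher. For example, rest2, rest6 and rest8 all have a mean of 6.5, but I want them ranked rest6 > rest2 > rest8, because rest6 = (6, 7), rest2 = (5, 8) and rest8 = (4, 9).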

My plan is to build a new list of the restaurants in the desired order and use it as the new index. Here is my super messy attempt:

def highest_min(rest1, rest2, db):
    if db.loc[rest1].min() > db.loc[rest2].min():
        return [rest1, rest2]
    return [rest2, rest1]

def add_resorted_column(preds_db_sorted):
    resorted = []
    for i, rest in enumerate(preds_db_sorted.index):
        if i < len(preds_db_sorted.index)-1:
            if preds_db_sorted.iloc[i]['mean'] != preds_db_sorted.iloc[i+1]['mean']:
                if preds_db_sorted.index[i] not in resorted:
                    resorted.append(rest)
            else:
                resorted.extend(highest_min(
                            preds_db_sorted.index[i], 
                            preds_db_sorted.index[i+1], 
                            preds_db_sorted))
        else: 
            if preds_db_sorted.index[-1] not in resorted:
                resorted.append(preds_db_sorted.index[-1]) 
    return resorted

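I know there must be a better way. Also, this produces duplicates when a tie involves more than two restaurants. I would also like to extend it to work with more than two users. Thanks!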

Answer 1

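Just combine mean and min with concat and sort on both together. Note that this snippet sorts the mean in ascending order; pass ascending=[False, False] to get the descending order used in the question.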

idx = (pd.concat([ratings.mean(axis=1), ratings.min(axis=1)], axis=1)
         .sort_values([0, 1], ascending=[True, False])
         .index)
ratings.loc[idx]
       user1  user2
rest1      1      4
rest4      3      4
rest7      5      2
rest6      6      7
rest2      5      8
rest8      4      9
rest5      8      8
rest3      7      9
ratings=ratings.loc[idx]
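
For the descending order the question asks for, and for any number of user columns, the same idea could be sketched as below. The helper name `sort_by_mean_then_min` is hypothetical (it is not part of pandas), and the sketch assumes every column of the frame is a user rating and that the index is unique.

```python
import pandas as pd

def sort_by_mean_then_min(ratings: pd.DataFrame) -> pd.DataFrame:
    """Sort rows by row mean (descending); break ties by row minimum (descending)."""
    # Build the two sort keys in a separate frame so the input is not modified.
    keys = pd.DataFrame({
        "mean": ratings.mean(axis=1),
        "min": ratings.min(axis=1),
    })
    order = keys.sort_values(["mean", "min"], ascending=[False, False]).index
    return ratings.loc[order]

ratings = pd.DataFrame(
    data=[[1, 4], [5, 8], [7, 9], [3, 4], [8, 8], [6, 7], [5, 2], [4, 9]],
    index=["rest1", "rest2", "rest3", "rest4", "rest5", "rest6", "rest7", "rest8"],
    columns=["user1", "user2"],
)

result = sort_by_mean_then_min(ratings)
print(result)
```

With the data from the question, the three restaurants tied at 6.5 come out as rest6, rest2, rest8, which is the order requested.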

Answer 2

    import pandas as pd
    ratings = pd.DataFrame(data=[[1, 4], [5, 8], [7, 9], [3, 4], [8, 8], [6, 7], [5, 2], [4, 9]], 
                            index=['rest1', 'rest2', 'rest3', 'rest4', 'rest5', 'rest6', 'rest7', 'rest8'], 
                            columns=['user1', 'user2'])
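    user_cols = ['user1', 'user2']
    # Compute both keys from the user columns only, so 'min' never includes the new 'mean' column
    ratings['mean'] = ratings[user_cols].mean(axis=1)
    ratings['min'] = ratings[user_cols].min(axis=1)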
    ratings_sorted = ratings.sort_values(by=['mean','min'], ascending=False)
    print(ratings_sorted)
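
The question also asks about more than two users and ties among more than two restaurants. A minimal sketch of that case follows, using made-up data (the frame `ratings3` and its restaurant names are illustrative, not from the question). It adds the helper columns with `assign`, sorts, and drops them again so the result contains only the ratings.

```python
import pandas as pd

# Hypothetical three-user data: restA, restB and restC all have a mean of 6.
ratings3 = pd.DataFrame(
    data=[[6, 6, 6], [4, 6, 8], [5, 6, 7], [9, 9, 9]],
    index=["restA", "restB", "restC", "restD"],
    columns=["user1", "user2", "user3"],
)
user_cols = list(ratings3.columns)

sorted3 = (
    ratings3
    .assign(
        mean=ratings3[user_cols].mean(axis=1),
        min=ratings3[user_cols].min(axis=1),
    )
    # Both keys descending: highest mean first, highest minimum first within a tie.
    .sort_values(by=["mean", "min"], ascending=False)
    .drop(columns=["mean", "min"])
)
print(sorted3)
```

Because the sort works on whole columns rather than on neighbouring pairs, a tie of any size is ordered in one pass and no restaurant can appear twice.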

How to sort by highest minimum after sorting by mean
