DataFrame rows with the maximum value, subject to a condition

Time: 2020-10-05 12:00:17

Tags: python pandas dataframe conditional-statements

My df looks like this:

index db id age score
1     1  1  1   2
2     1  1  2   1.5
3     1  2  2   3
4     1  2  3   4
5     2  1  2   3
6     2  1  1   1
7     2  2  3   2
8     2  2  5   3.5
9     3  1  4   4
...

For each unique (db, id) pair, I want the row with the maximum age. Result:

index db id age score
2     1  1  2   1.5
4     1  2  3   4
5     2  1  2   3
8     2  2  5   3.5
9     3  1  4   4
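
One possible way to express this selection is sketched here on only the nine rows shown above (the rows hidden behind `...` are not reconstructed). It groups by `db` and `id`, asks `idxmax` for the index label of the largest `age` in each group, and selects those rows. The names `winners` and `result` are illustrative and not part of the original code.

```python
import pandas as pd

# Only the nine rows visible in the question; elided rows are left out.
df = pd.DataFrame(
    {
        "db":    [1, 1, 1, 1, 2, 2, 2, 2, 3],
        "id":    [1, 1, 2, 2, 1, 1, 2, 2, 1],
        "age":   [1, 2, 2, 3, 2, 1, 3, 5, 4],
        "score": [2, 1.5, 3, 4, 3, 1, 2, 3.5, 4],
    },
    index=pd.Index(range(1, 10), name="index"),
)

# For each (db, id) group, the index label of the row with the largest age.
winners = df.groupby(["db", "id"])["age"].idxmax()

# Select those rows; groups are sorted by key, so rows come out in (db, id) order.
result = df.loc[winners]
print(result)
```

On this sample the selected index labels are 2, 4, 5, 8 and 9, matching the result table above.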

I used this function, but it is very slow:

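import numpy as np
import pandas as pd

def get_age_rel(main_df, age):
    data = []
    # Keep only the rows whose age does not exceed the given threshold.
    age_rel_df = main_df[main_df['age'] <= age]
    for db_index in np.unique(age_rel_df['db']):
        db_rel_df = age_rel_df[age_rel_df['db'] == db_index]
        for some_id in np.unique(db_rel_df['id']):
            # One row per (db, id): the row with the largest age.
            data.append(max_rows(db_rel_df[db_rel_df['id'] == some_id], 'age', 1))
    # Stack the selected rows vertically (axis=0) so each one stays a row.
    return pd.concat(data, axis=0)

def max_rows(df, col, n):
    # Index labels of the n largest values in col.
    max_indexes = df[col].nlargest(n)
    max_indexes = list(max_indexes.index)

    return df.loc[max_indexes]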
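
A loop-free counterpart of `get_age_rel` might look like the sketch below. The name `get_age_rel_fast` is hypothetical, and the approach (sort by `age`, then keep one row per `(db, id)` with `drop_duplicates`) is a substitute technique, not the original method. It assumes the columns are named `db`, `id` and `age` as in the sample, and that when two rows in a group share the largest age it does not matter which one is kept.

```python
import pandas as pd

def get_age_rel_fast(main_df: pd.DataFrame, age) -> pd.DataFrame:
    """Hypothetical vectorized counterpart of get_age_rel (a sketch)."""
    # Same filter as the original: drop rows above the age threshold.
    age_rel_df = main_df[main_df["age"] <= age]
    return (
        # Largest age first, so the first row seen per (db, id) is the maximum.
        age_rel_df.sort_values("age", ascending=False, kind="stable")
        # Keep exactly one row for each (db, id) pair.
        .drop_duplicates(subset=["db", "id"], keep="first")
        # Restore the original index order.
        .sort_index()
    )
```

This replaces the nested Python loops with a single sort and a single deduplication pass, which is usually where the time goes when there are many `(db, id)` pairs.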

0 answers:

No answers