
Pandas: select the maximum value for each instance

Time: 2019-10-23 18:57:42

Tags: python pandas

I have a pandas DataFrame in Python that looks like this:

df = pd.DataFrame({'LATITUDE': [-22.22, -22.43, -22.22, -22.43, -22.35, -22.35, -22.35, -22.21]})
df['Importance'] = df.groupby('LATITUDE').cumcount().add(1)
df

How can I build another DataFrame that contains only the maximum Importance for each distinct LATITUDE value?

Example output:

LATITUDE | Importance
-22.22   | 2
-22.43   | 2
-22.35   | 3
-22.21   | 1

2 answers:

Answer 0 (score: 1)

df.groupby('LATITUDE', as_index=False).max()
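
To make the one-liner above easy to run, here is a minimal sketch that rebuilds the DataFrame from the question first. The `sort=False` argument is an addition that is not in the answer: by default `groupby` sorts the group keys, whereas the example output in the question lists each LATITUDE in order of first appearance.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({'LATITUDE': [-22.22, -22.43, -22.22, -22.43,
                                -22.35, -22.35, -22.35, -22.21]})
df['Importance'] = df.groupby('LATITUDE').cumcount().add(1)

# One row per LATITUDE with the column-wise maximum of the remaining columns.
# sort=False (an assumption about the desired order) keeps groups in order
# of first appearance instead of sorting the keys.
result = df.groupby('LATITUDE', as_index=False, sort=False).max()
print(result)
```

Note that `.max()` is applied to every non-key column, so with more columns than `Importance` each maximum is taken independently.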

Answer 1 (score: 1)

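Is this what you're after? Honestly, your "Importance" column confuses me a little. Is it part of the dataset, or was it an attempt at solving your problem? Either way, I've treated it as part of your dataset...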

import pandas as pd

df = pd.DataFrame({'LATITUDE': [-22.22, -22.43, -22.22, -22.43, 
                                -22.35, -22.35, -22.35, -22.21]})
df['Importance'] = df.groupby('LATITUDE').cumcount().add(1)

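# Keep one row per LATITUDE, holding the maximum Importance in that group.
# The string 'max' is used instead of the builtin max, which newer pandas
# versions warn about when passed to agg.
df2 = df.groupby('LATITUDE', as_index=False).agg({'Importance': 'max'})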

Before:

   LATITUDE  Importance
0    -22.22           1
1    -22.43           1
2    -22.22           2
3    -22.43           2
4    -22.35           1
5    -22.35           2
6    -22.35           3
7    -22.21           1

After:

   LATITUDE  Importance
0    -22.43           2
1    -22.35           3
2    -22.22           2
3    -22.21           1
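
If the Importance column is only a helper built with `cumcount().add(1)`, as in the question's code, its maximum within each group is simply the number of rows in that group. Under that assumption, a rough sketch that skips the helper column and counts rows directly could look like this (the name `counts` is just illustrative):

```python
import pandas as pd

df = pd.DataFrame({'LATITUDE': [-22.22, -22.43, -22.22, -22.43,
                                -22.35, -22.35, -22.35, -22.21]})

# Group size equals the largest 1-based cumulative count in each group.
# sort=False keeps the groups in order of first appearance.
counts = (df.groupby('LATITUDE', sort=False)
            .size()
            .reset_index(name='Importance'))
print(counts)
```

This only matches the `max()` result when Importance really is a running count; if it is independent data, the `groupby(...).max()` approach is the one to use.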

Let me know if you're after something else...