Question

I have a pandas DataFrame, df, with many columns (only 3 are shown below) and 90,000 rows:

        Key        Date     Rating
0      123abc   08/19/2015    A
1      456def   04/23/2013    B-
2      123abc   06/10/2012    C
3      789ghi   01/04/2017    B
.        .           .        .
.        .           .        .
90000  999zzz   12/12/2012    D

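I want to create a separate DataFrame, df_ratings, with two columns: Key and Rating List. In df_ratings, the Key column must be unique, and the Rating List column should contain a list of all the Ratings in df that correspond to that Key.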

The approach I have used so far is:

df

Given the size of my dataset, this takes several hours to run. How can I speed this up or otherwise improve my code?

Answer 1

Try this:

df = df.groupby(by=['Key'], as_index=False).agg({'Rating': list})
print(df)

      Key        Rating
0  123abc  [A, A, A, A]
1  123def           [C]
2  456def          [B-]
3  789ghi           [B]
4  999zzz           [D]
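
If the result should land in a separate df_ratings frame with a column literally named Rating List, as the question describes, a minimal sketch along the same lines might look like this. The sample rows are copied from the question's table, and build_ratings is a hypothetical helper name introduced only for illustration; sort=False is an optional choice that keeps keys in order of first appearance.

```python
import pandas as pd

# Sample rows taken from the question's table (the real frame has ~90,000 rows).
df = pd.DataFrame({
    "Key": ["123abc", "456def", "123abc", "789ghi", "999zzz"],
    "Date": ["08/19/2015", "04/23/2013", "06/10/2012", "01/04/2017", "12/12/2012"],
    "Rating": ["A", "B-", "C", "B", "D"],
})


def build_ratings(frame):
    """Hypothetical helper: one row per Key, with all its Ratings in a list."""
    return (
        frame.groupby("Key", sort=False)["Rating"]  # group ratings by key
        .agg(list)                                   # collect each group into a list
        .reset_index(name="Rating List")             # back to a two-column DataFrame
    )


# The original df is left untouched; the result goes into a new frame.
df_ratings = build_ratings(df)
print(df_ratings)
```

Because the grouping is a single pass handled by pandas rather than a per-row lookup in Python, it should scale to 90,000 rows without trouble, though actual timings depend on the data.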

How can I create a list in a DataFrame column by looking up another DataFrame with pandas?
