Stack the values within pandas groups into new columns

Time: 2019-11-06 16:58:19

Tags: pandas pandas-groupby

I have a dataframe that looks roughly like this:

import pandas as pd

data = [
    {'user_id': 1, 'week': 1, 'score': 1},
    {'user_id': 1, 'week': 2, 'score': 2},
    {'user_id': 1, 'week': 2, 'score': 3},
    {'user_id': 2, 'week': 1, 'score': 1},
    {'user_id': 2, 'week': 1, 'score': 1}]
df = pd.DataFrame(data)
+---------+------+-------+
| user_id | week | score |
+---------+------+-------+
|       1 |    1 |     1 |
|       1 |    2 |     2 |
|       1 |    2 |     3 |
|       2 |    1 |     1 |
|       2 |    1 |     1 |
+---------+------+-------+

I want to group it by user_id and week, but then spread each score within a group into its own new column, so that the resulting dataframe looks like this:

+---------+------+--------+--------+
| user_id | week | score1 | score2 |
+---------+------+--------+--------+
|       1 |    1 |      1 |        |
|       1 |    2 |      2 |      3 |
|       2 |    1 |      1 |      1 |
+---------+------+--------+--------+

The grouping itself is straightforward:

df.groupby(['user_id', 'week'], as_index=False)

but I can't see how to do the reshaping.

2 answers:

Answer 0 (score: 3)

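You can combine groupby.cumcount() with assign(), set_index() and unstack(). cumcount() numbers the rows within each (user_id, week) group, and unstack() turns that number into columns: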

m = (df.assign(k=df.groupby(['user_id', 'week']).cumcount())
       .set_index(['user_id', 'week', 'k'])
       .unstack())
# Flatten the (column, k) MultiIndex into names such as score_0, score_1
m.columns = [f'{a}_{b}' for a, b in m.columns]
print(m.reset_index())

   user_id  week  score_0  score_1
0        1     1      1.0      NaN
1        1     2      2.0      3.0
2        2     1      1.0      1.0
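
If the 1-based names from the question (score1, score2) matter, a small variation on the same idea might look like the following. This is only a sketch: it rebuilds the sample frame from the question, adds 1 to cumcount(), and unstacks just the score column so the result has flat column names. The variable names pos and wide are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'week':    [1, 2, 2, 1, 1],
    'score':   [1, 2, 3, 1, 1],
})

# Position of each row inside its (user_id, week) group, shifted to start at 1
pos = df.groupby(['user_id', 'week']).cumcount() + 1

# Move the position into the index, then pivot it into columns
wide = (df.assign(k=pos)
          .set_index(['user_id', 'week', 'k'])['score']
          .unstack())

# Rename the columns 1, 2, ... to score1, score2, ...
wide.columns = [f'score{k}' for k in wide.columns]
wide = wide.reset_index()
print(wide)
```

The group (user_id=1, week=1) has only one score, so its score2 slot comes out as NaN rather than blank.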

Answer 1 (score: 2)

We can also use groupby + apply(list), followed by apply(pd.Series):

new_df=( df.groupby(['user_id', 'week'])
           .score
           .apply(list)
           .apply(pd.Series)
           .add_prefix('score_')
           .reset_index() )
print(new_df)

   user_id  week  score_0  score_1
0        1     1      1.0      NaN
1        1     2      2.0      3.0
2        2     1      1.0      1.0
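
apply(pd.Series) builds one Series per group, which can become slow when there are many groups. A possible alternative, sketched here on the assumption that the per-group lists fit comfortably in memory, hands all the lists to the DataFrame constructor at once and lets it pad the shorter ones with NaN. The name lists is hypothetical, and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2],
    'week':    [1, 2, 2, 1, 1],
    'score':   [1, 2, 3, 1, 1],
})

# One Python list of scores per (user_id, week) group
lists = df.groupby(['user_id', 'week'])['score'].apply(list)

# Build the wide frame in a single constructor call; shorter lists are padded with NaN
new_df = (pd.DataFrame(lists.tolist(), index=lists.index)
            .add_prefix('score_')
            .reset_index())
print(new_df)
```

The column names match the apply(pd.Series) version (score_0, score_1); only the way the wide frame is assembled differs.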