将数据框的行与相同的组合并,然后将值分配给新列

时间:2019-09-30 16:25:43

标签: python python-3.x pandas pandas-groupby

我想将数据框的行与同一组合并,然后将值分配给新列

之前

     x          y         group
0   0.333333    1.000000    0
1   0.750000    0.137931    0
2   1.000000    0.270115    0
3   0.272727    1.000000    1
4   0.727273    0.124294    1
5   1.000000    0.355932    1
6   0.272727    1.000000    2
7   0.727273    0.096591    2
8   1.000000    0.363636    2
9   0.272727    1.000000    3
10  0.727273    0.105556    3
11  1.000000    0.416667    3
12  0.272727    1.000000    4
13  0.727273    0.181818    4
14  1.000000    0.443182    4

之后

     x1          y1         x2          y2              x3          y3
    0.333333    1.000000    0.750000    0.137931    1.000000    0.270115    
    0.272727    1.000000    0.727273    0.124294    1.000000    0.355932    
    0.272727    1.000000    0.727273    0.096591    1.000000    0.363636    
    0.272727    1.000000    0.727273    0.105556    1.000000    0.416667    
    0.272727    1.000000    0.727273    0.181818    1.000000    0.443182    

3 个答案:

答案 0 :(得分:1)

这是pivot_table的一种方式:

# rank of the row within each group
cats = df.groupby('group').group.rank('first').astype(int)

# use pivot_table to transform data
new_df = pd.pivot_table(df, index='group', columns=cats)

# rename to get desired columns
new_df.columns = [f'{x}{y}' for x,y in new_df.columns]

输出:

             x1        x2   x3   y1        y2        y3
group                                                  
0      0.333333  0.750000  1.0  1.0  0.137931  0.270115
1      0.272727  0.727273  1.0  1.0  0.124294  0.355932
2      0.272727  0.727273  1.0  1.0  0.096591  0.363636
3      0.272727  0.727273  1.0  1.0  0.105556  0.416667
4      0.272727  0.727273  1.0  1.0  0.181818  0.443182

答案 1 :(得分:1)

不使用数据透视表:

groups=df.groupby('group')['group'].apply(lambda x: x.eq(x.shift()).cumsum())
new_df=pd.concat([df.groupby(groups)['x','y'].get_group(key).add_suffix(str(key+1)).reset_index(drop=True) for key in groups.unique()],axis=1)
print(new_df)



         x1   y1        x2        y2   x3        y3
0  0.333333  1.0  0.750000  0.137931  1.0  0.270115
1  0.272727  1.0  0.727273  0.124294  1.0  0.355932
2  0.272727  1.0  0.727273  0.096591  1.0  0.363636
3  0.272727  1.0  0.727273  0.105556  1.0  0.416667
4  0.272727  1.0  0.727273  0.181818  1.0  0.443182

答案 2 :(得分:0)

您可以使用groupby和

之类的数据透视功能来完成此操作
df = pd.DataFrame([('0', '0.333333', '1.000000', '0'), ('1', '0.750000', '0.137931', '0'), ('2', '1.000000', '0.270115', '0'), ('3', '0.272727', '1.000000', '1'), ('4', '0.727273', '0.124294', '1'), ('5', '1.000000', '0.355932', '1'), ('6', '0.272727', '1.000000', '2'), ('7', '0.727273', '0.096591', '2'), ('8', '1.000000', '0.363636', '2'), ('9', '0.272727', '1.000000', '3'), ('10', '0.727273', '0.105556', '3'), ('11', '1.000000', '0.416667', '3'), ('12', '0.272727', '1.000000', '4'), ('13', '0.727273', '0.181818', '4'), ('14', '1.000000', '0.443182', '4')], columns=('id', 'x', 'y', 'group'))

df2 = df.groupby("group") \
    .apply(lambda group: group.assign(ind =np.arange(len(group)))) \
    .reset_index(drop=True)

df2=df2.pivot(index="ind", columns="group", values=["x", "y"])
df2.columns = list(map(lambda x:"".join(x), df2.columns))
df2