我想将数据框的行与同一组合并,然后将值分配给新列
之前
x y group
0 0.333333 1.000000 0
1 0.750000 0.137931 0
2 1.000000 0.270115 0
3 0.272727 1.000000 1
4 0.727273 0.124294 1
5 1.000000 0.355932 1
6 0.272727 1.000000 2
7 0.727273 0.096591 2
8 1.000000 0.363636 2
9 0.272727 1.000000 3
10 0.727273 0.105556 3
11 1.000000 0.416667 3
12 0.272727 1.000000 4
13 0.727273 0.181818 4
14 1.000000 0.443182 4
之后
x1 y1 x2 y2 x3 y3
0.333333 1.000000 0.750000 0.137931 1.000000 0.270115
0.272727 1.000000 0.727273 0.124294 1.000000 0.355932
0.272727 1.000000 0.727273 0.096591 1.000000 0.363636
0.272727 1.000000 0.727273 0.105556 1.000000 0.416667
0.272727 1.000000 0.727273 0.181818 1.000000 0.443182
答案 0 :(得分:1)
这是pivot_table
的一种方式:
# rank of the row within each group
cats = df.groupby('group').group.rank('first').astype(int)
# use pivot_table to transform data
new_df = pd.pivot_table(df, index='group', columns=cats)
# rename to get desired columns
new_df.columns = [f'{x}{y}' for x,y in new_df.columns]
输出:
x1 x2 x3 y1 y2 y3
group
0 0.333333 0.750000 1.0 1.0 0.137931 0.270115
1 0.272727 0.727273 1.0 1.0 0.124294 0.355932
2 0.272727 0.727273 1.0 1.0 0.096591 0.363636
3 0.272727 0.727273 1.0 1.0 0.105556 0.416667
4 0.272727 0.727273 1.0 1.0 0.181818 0.443182
答案 1 :(得分:1)
不使用数据透视表:
groups=df.groupby('group')['group'].apply(lambda x: x.eq(x.shift()).cumsum())
new_df=pd.concat([df.groupby(groups)['x','y'].get_group(key).add_suffix(str(key+1)).reset_index(drop=True) for key in groups.unique()],axis=1)
print(new_df)
x1 y1 x2 y2 x3 y3
0 0.333333 1.0 0.750000 0.137931 1.0 0.270115
1 0.272727 1.0 0.727273 0.124294 1.0 0.355932
2 0.272727 1.0 0.727273 0.096591 1.0 0.363636
3 0.272727 1.0 0.727273 0.105556 1.0 0.416667
4 0.272727 1.0 0.727273 0.181818 1.0 0.443182
答案 2 :(得分:0)
您可以使用groupby和
之类的数据透视功能来完成此操作df = pd.DataFrame([('0', '0.333333', '1.000000', '0'), ('1', '0.750000', '0.137931', '0'), ('2', '1.000000', '0.270115', '0'), ('3', '0.272727', '1.000000', '1'), ('4', '0.727273', '0.124294', '1'), ('5', '1.000000', '0.355932', '1'), ('6', '0.272727', '1.000000', '2'), ('7', '0.727273', '0.096591', '2'), ('8', '1.000000', '0.363636', '2'), ('9', '0.272727', '1.000000', '3'), ('10', '0.727273', '0.105556', '3'), ('11', '1.000000', '0.416667', '3'), ('12', '0.272727', '1.000000', '4'), ('13', '0.727273', '0.181818', '4'), ('14', '1.000000', '0.443182', '4')], columns=('id', 'x', 'y', 'group'))
df2 = df.groupby("group") \
.apply(lambda group: group.assign(ind =np.arange(len(group)))) \
.reset_index(drop=True)
df2=df2.pivot(index="ind", columns="group", values=["x", "y"])
df2.columns = list(map(lambda x:"".join(x), df2.columns))
df2