My DataFrame 1 (DF1) looks like this:
ID group_1 area_1 group_2 area_2 group_3 area_3
1 basketball 250 scoccer 500 swimming 100
2 volleyball 100 np.nan np.nan np.nan np.nan
3 football 10 basketball 1000 np.nan np.nan
I also have another DataFrame, DF2, which looks like this:
ID group_1 area_1 group_2 area_2 group_3 area_3 group_4 area_4
1 scoccer 500 basketball 50 basketball 200 swimming 100
2 volleyball np.nan np.nan np.nan np.nan np.nan np.nan np.nan
3 basketball 1000 basketball np.nan football 10 np.nan np.nan
My desired output should look like this:
ID group_1 area_1 group_2 area_2 group_3 area_3
1 scoccer 500 basketball 250 swimming 100
2 volleyball 100 np.nan np.nan np.nan np.nan
3 basketball 1000 football 10 np.nan np.nan
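I want to arrange DF1 using the structure of DF2. As a first step, I need to identify the unique group values in each row of DF2 (for ID 1: scoccer, basketball, swimming), where the order matters. Then I need to sort DF1 according to that order, while keeping the correct values from area_x.
Edit: With @kait's answer, final_df looks like this: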
ID group_1 area_1 group_2 group_3 area_3 group_4 group_5 area_5 group_6
1 scoccer 500 500 basketball 250 250 swimming 100 100
2 volleyball 100 100 np.nan np.nan np.nan np.nan np.nan np.nan
3 basketball 1000 1000 football 10 10 np.nan np.nan np.nan
Answer 0 (score: 0)
Does this work?
First, reshape df1:
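import pandas as pd

# Turn every (group_n, area_n) pair of df1 into its own row.
new_rows = []
for k, v in df1.iterrows():
    for group in range(1, 4):
        new_rows.append([v['ID'], v[f'group_{group}'], v[f'area_{group}']])
# Drop the pairs that were empty in the wide layout.
new_df = pd.DataFrame(new_rows, columns=['ID', 'group', 'area']).dropna()
display(new_df)  # display() is available in Jupyter/IPython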
ID group area
0 1 basketball 250
1 1 scoccer 500
2 1 swimming 100
3 2 volleyball 100
6 3 football 10
7 3 basketball 1000
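As a possible alternative to the explicit loop, the same wide-to-long reshape can be sketched with `pd.wide_to_long`. This is only a sketch: the construction of `df1` below is an assumption based on the table in the question, and the names `slot` and `long_df` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of DF1 from the question.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "group_1": ["basketball", "volleyball", "football"],
    "area_1": [250, 100, 10],
    "group_2": ["scoccer", np.nan, "basketball"],
    "area_2": [500, np.nan, 1000],
    "group_3": ["swimming", np.nan, np.nan],
    "area_3": [100, np.nan, np.nan],
})

# Columns named <stub>_<number> are stacked into one column per stub;
# the number ends up in the (hypothetical) index level "slot".
long_df = (
    pd.wide_to_long(df1, stubnames=["group", "area"], i="ID", j="slot", sep="_")
    .dropna(subset=["group"])   # drop the empty (group, area) pairs
    .sort_index()               # order by ID, then by original slot
    .reset_index()
)
print(long_df[["ID", "group", "area"]])
```

Unlike the loop, this does not hard-code the number of group/area pairs, but it does rely on the `group_N` / `area_N` naming pattern.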
Next, parse df2:
parsed_rows = []

def parse_df2(row):
    x = {}
    x['ID'] = row['ID']
    # The ID stays at position 0 so that enumerate() below numbers the groups from 1.
    groups = [v for k, v in row.items() if 'group' in k or k == 'ID']
    # Collapse runs of consecutive duplicates (non-adjacent repeats are kept).
    deduped = [groups[i]
               for i in range(len(groups))
               if i == 0 or groups[i] != groups[i - 1]]
    for k, v in enumerate(deduped):
        if k == 0 or pd.isna(v):
            continue
        x[f'group_{k}'] = v
        # Look up the area for this ID and group in the reshaped df1.
        mask = new_df.ID == row['ID']
        mask &= new_df.group == v
        if new_df[mask].empty:
            continue
        x[f'area_{k}'] = new_df[mask]['area'].iloc[0]
    parsed_rows.append(x)

df2.apply(parse_df2, axis=1)
final_df = pd.DataFrame(parsed_rows)
display(final_df)
ID group_1 area_1 group_2 area_2 group_3 area_3
1 scoccer 500 basketball 250.0 swimming 100.0
2 volleyball 100 NaN NaN NaN NaN
3 basketball 1000 football 10.0 NaN NaN
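For comparison, here is a loop-free sketch of the whole procedure: reshape both frames to long form, keep the first occurrence of each group per ID in df2 to get the order, look up the areas from df1, and pivot back to wide form. It rests on several assumptions: the frames are rebuilt from the tables in the question, only the group columns of df2 are used (its area columns are ignored), each ID/group pair occurs at most once in df1, and the helper `to_long` and the column `slot` are hypothetical names. Note that it removes every repeated group within an ID, not just consecutive repeats as the code above does.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "group_1": ["basketball", "volleyball", "football"],
    "area_1": [250, 100, 10],
    "group_2": ["scoccer", np.nan, "basketball"],
    "area_2": [500, np.nan, 1000],
    "group_3": ["swimming", np.nan, np.nan],
    "area_3": [100, np.nan, np.nan],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3],
    "group_1": ["scoccer", "volleyball", "basketball"],
    "group_2": ["basketball", np.nan, "basketball"],
    "group_3": ["basketball", np.nan, "football"],
    "group_4": ["swimming", np.nan, np.nan],
})

def to_long(df, stubs):
    """Stack <stub>_<n> columns into rows, ordered by ID and original position."""
    return (
        pd.wide_to_long(df, stubnames=stubs, i="ID", j="slot", sep="_")
        .dropna(subset=["group"])
        .sort_index()
        .reset_index()
    )

# Areas per (ID, group), taken from df1.
areas = to_long(df1, ["group", "area"])[["ID", "group", "area"]]
areas = areas.drop_duplicates(subset=["ID", "group"])

# Group order per ID, taken from df2: keep first occurrences, then renumber from 1.
order = (
    to_long(df2, ["group"])
    .drop_duplicates(subset=["ID", "group"])
    .assign(slot=lambda d: d.groupby("ID").cumcount() + 1)
)

# Attach the areas and go back to one row per ID.
merged = order.merge(areas, on=["ID", "group"], how="left")
wide = merged.set_index(["ID", "slot"])[["group", "area"]].unstack("slot")
wide.columns = [f"{name}_{slot}" for name, slot in wide.columns]

# Interleave the columns as group_1, area_1, group_2, area_2, ...
slots = sorted(merged["slot"].unique())
final_df = wide[[f"{name}_{s}" for s in slots for name in ("group", "area")]].reset_index()
print(final_df)
```

Groups that appear in df2 but not in df1 keep their group_N entry and get a missing area, which mirrors what the loop-based version does.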