I want to create column names from the values in a column. This is what I have:
part_number source recent_date recent_price
0023496 a1 2017-06-27 55.0
0023496 e1 2017-08-03 315.0
0023084 a1 2017-01-12 255.0
0023084 e1 NaN NaN
This is the output I want:
part_number a1_recent_date a1_recent_price e1_recent_date e1_recent_price
0023496 2017-06-27 55.0 2017-08-03 315.0
0023084 2017-01-12 255.0 NaN NaN
Answer 0 (score: 2)
Use set_index and unstack:
In [520]: dff = df.set_index(['part_number', 'source']).unstack()
In [521]: dff
Out[521]:
recent_date recent_price
source a1 e1 a1 e1
part_number
23084 2017-01-12 NaN 255.0 NaN
23496 2017-06-27 2017-08-03 55.0 315.0
Then set the column names:
In [522]: dff.columns = dff.columns.map(lambda x: '{1}_{0}'.format(*x))
In [523]: dff
Out[523]:
a1_recent_date e1_recent_date a1_recent_price e1_recent_price
part_number
23084 2017-01-12 NaN 255.0 NaN
23496 2017-06-27 2017-08-03 55.0 315.0
Details (the input df):
In [527]: df
Out[527]:
part_number source recent_date recent_price
0 23496 a1 2017-06-27 55.0
1 23496 e1 2017-08-03 315.0
2 23084 a1 2017-01-12 255.0
3 23084 e1 NaN NaN
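In the session above part_number was parsed as an integer, so the leading zeros are gone, and the flattened columns are grouped by field rather than by source. The following is a minimal, self-contained sketch of the same set_index/unstack idea that assumes part_number is stored as strings and reorders the columns to match the requested layout. The variable name wide and the way the sample frame is built are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question; part_number is kept as strings
# (an assumption) so the leading zeros survive.
df = pd.DataFrame({
    "part_number": ["0023496", "0023496", "0023084", "0023084"],
    "source": ["a1", "e1", "a1", "e1"],
    "recent_date": ["2017-06-27", "2017-08-03", "2017-01-12", np.nan],
    "recent_price": [55.0, 315.0, 255.0, np.nan],
})

# Move 'source' into the column index: columns become (field, source) pairs.
wide = df.set_index(["part_number", "source"]).unstack()

# Put the source level first and sort, so each source's columns sit together.
wide = wide.swaplevel(axis=1).sort_index(axis=1)

# Flatten the two-level column index into 'source_field' names.
wide.columns = ["_".join(pair) for pair in wide.columns]

# Turn part_number back into an ordinary column.
wide = wide.reset_index()
print(wide)
```

Note that unstack sorts the row index, so the rows come out ordered by part_number rather than in their original order.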
Answer 1 (score: 1)
This can do it:
pd.concat([agg_df.add_prefix(index+'_').reset_index()
for index,agg_df in df.groupby('source', as_index=False)],
axis=1)
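Step by step:
df.groupby('source', as_index=False) splits df into one sub-frame per source value.
for index,agg_df ... iterates over the (source value, sub-frame) pairs.
agg_df.add_prefix(index+'_').reset_index() prefixes every column name with the source value and moves the old row index into a column named index.
pd.concat([...]) with axis=1 places the prefixed sub-frames side by side.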
Result:
In [46]: pd.concat([agg_df.add_prefix(index+'_').reset_index()
...: for index,agg_df in df.groupby('source', as_index=False)],
...: axis=1)
Out[46]:
index a1_part_number a1_source a1_recent_date a1_recent_price index \
0 0 0023496 a1 2017-06-27 55.0 1
1 2 0023084 a1 2017-01-12 255.0 3
e1_part_number e1_source e1_recent_date e1_recent_price
0 0023496 e1 2017-08-03 315.0
1 0023084 e1 NaN NaN
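The output above still carries the index, a1_part_number, a1_source, e1_part_number and e1_source columns, so it does not quite match the layout asked for in the question. A possible variation on the same groupby/add_prefix/concat idea is sketched below. It assumes each part_number appears at most once per source and is stored as a string; the names pieces and result are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with part_number as strings (an assumption).
df = pd.DataFrame({
    "part_number": ["0023496", "0023496", "0023084", "0023084"],
    "source": ["a1", "e1", "a1", "e1"],
    "recent_date": ["2017-06-27", "2017-08-03", "2017-01-12", np.nan],
    "recent_price": [55.0, 315.0, 255.0, np.nan],
})

# One piece per source: index by part_number, drop the now-redundant
# 'source' column, and prefix the remaining columns with the source value.
pieces = [
    group.set_index("part_number")
         .drop(columns="source")
         .add_prefix(f"{source}_")
    for source, group in df.groupby("source")
]

# Align the pieces on part_number and place them side by side.
result = (
    pd.concat(pieces, axis=1)
      .rename_axis("part_number")  # make sure the index keeps its name
      .reset_index()
)
print(result)
```

Because the pieces are aligned on part_number instead of on row position, a part that is missing for one source would simply get NaN in that source's columns.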