Question

I want to create column names based on the values in a column.

This is what I have:

part_number  source  recent_date  recent_price
0023496      a1      2017-06-27   55.0
0023496      e1      2017-08-03   315.0
0023084      a1      2017-01-12   255.0
0023084      e1      NaN          NaN

This is the output I want:

part_number  a1_recent_date  a1_recent_price  e1_recent_date  e1_recent_price
0023496      2017-06-27      55.0             2017-08-03      315.0
0023084      2017-01-12      255.0            NaN             NaN

Answer 1

Use set_index and unstack:

In [520]: dff = df.set_index(['part_number', 'source']).unstack()

In [521]: dff
Out[521]:
            recent_date             recent_price
source               a1          e1           a1     e1
part_number
23084        2017-01-12         NaN        255.0    NaN
23496        2017-06-27  2017-08-03         55.0  315.0

Then set the column names:

In [522]: dff.columns = dff.columns.map(lambda x: '{1}_{0}'.format(*x))

In [523]: dff
Out[523]:
            a1_recent_date e1_recent_date  a1_recent_price  e1_recent_price
part_number
23084           2017-01-12            NaN            255.0              NaN
23496           2017-06-27     2017-08-03             55.0            315.0
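
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same two steps. The sample frame is rebuilt from the question, and part_number is stored as a string (an assumption on my part) so the leading zeros are kept. The final sort_index and reset_index calls are additions to the steps above, used only to group the columns by source and to turn part_number back into a regular column.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; part_number is a string here
# (assumption) so that the leading zeros survive.
df = pd.DataFrame({
    "part_number": ["0023496", "0023496", "0023084", "0023084"],
    "source": ["a1", "e1", "a1", "e1"],
    "recent_date": ["2017-06-27", "2017-08-03", "2017-01-12", np.nan],
    "recent_price": [55.0, 315.0, 255.0, np.nan],
})

# Move `source` out of the row index into a second column level.
wide = df.set_index(["part_number", "source"]).unstack()

# Flatten the (field, source) column pairs into "<source>_<field>" names.
wide.columns = [f"{source}_{field}" for field, source in wide.columns]

# Sort the columns so each source's fields sit together, then turn
# part_number back into an ordinary column.
wide = wide.sort_index(axis=1).reset_index()
print(wide)
```

Note that unstack raises an error if the same (part_number, source) pair appears more than once, so this assumes each pair is unique.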

Details:

In [527]: df
Out[527]:
   part_number source recent_date  recent_price
0        23496     a1  2017-06-27          55.0
1        23496     e1  2017-08-03         315.0
2        23084     a1  2017-01-12         255.0
3        23084     e1         NaN           NaN

Answer 2

This does the job:

pd.concat([agg_df.add_prefix(index+'_').reset_index() 
           for index,agg_df  in df.groupby('source', as_index=False)],
           axis=1)

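Explanation:

Group by source: df.groupby('source', as_index=False)
Iterate over the groups: for index, agg_df ...
For each group, add the source value as a prefix and reset the index: agg_df.add_prefix(index+'_').reset_index()
Finally, concatenate all the groups back into a single DataFrame: pd.concat([...])

Result: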
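
The same steps written out as an explicit loop may be easier to follow. This is only a sketch: the sample frame is rebuilt from the question (with part_number assumed to be a string), and the names pieces, group and result are mine, not the answer's.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (part_number as string is an assumption).
df = pd.DataFrame({
    "part_number": ["0023496", "0023496", "0023084", "0023084"],
    "source": ["a1", "e1", "a1", "e1"],
    "recent_date": ["2017-06-27", "2017-08-03", "2017-01-12", np.nan],
    "recent_price": [55.0, 315.0, 255.0, np.nan],
})

pieces = []
# Group by `source` and iterate over (key, sub-frame) pairs.
for source, group in df.groupby("source", as_index=False):
    # Prefix every column with the group key; reset_index renumbers the
    # rows from 0 and keeps the old row labels in an `index` column.
    pieces.append(group.add_prefix(source + "_").reset_index())

# Place the pieces side by side; rows are matched on the new 0..n-1 labels.
result = pd.concat(pieces, axis=1)
print(result)
```

Because the rows are matched by position after reset_index, this relies on every source group listing the part numbers in the same order.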

In [46]: pd.concat([agg_df.add_prefix(index+'_').reset_index() 
    ...:            for index,agg_df  in df.groupby('source', as_index=False)],
    ...:            axis=1)  
Out[46]: 
   index a1_part_number a1_source a1_recent_date a1_recent_price  index  \
0      0        0023496        a1     2017-06-27            55.0      1   
1      2        0023084        a1     2017-01-12           255.0      3   

  e1_part_number e1_source e1_recent_date e1_recent_price  
0        0023496        e1     2017-08-03           315.0  
1        0023084        e1            NaN             NaN  
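
This output still repeats part_number and source for every group and carries two index columns, so it is not quite the layout asked for in the question. One possible adjustment, which is my variation and not part of the answer above, is to drop source and index each group by part_number before prefixing, so that concat aligns the groups on part_number. The sample frame is rebuilt from the question, with part_number assumed to be a string.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (part_number as string is an assumption).
df = pd.DataFrame({
    "part_number": ["0023496", "0023496", "0023084", "0023084"],
    "source": ["a1", "e1", "a1", "e1"],
    "recent_date": ["2017-06-27", "2017-08-03", "2017-01-12", np.nan],
    "recent_price": [55.0, 315.0, 255.0, np.nan],
})

pieces = [
    # Drop the grouping column, key each piece by part_number, then prefix
    # the remaining columns with the group key.
    group.drop(columns="source").set_index("part_number").add_prefix(f"{source}_")
    for source, group in df.groupby("source")
]

# Align the pieces on part_number and turn it back into a regular column.
result = pd.concat(pieces, axis=1).reset_index()
print(result)
```

This assumes each part_number appears at most once per source; with duplicates the alignment in concat would no longer be well defined.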

