Group identical values of a df column into a single variable

Time: 2017-07-12 23:25:38

Tags: python dataframe group-by

I merged two dfs built from data scraped online:

merge_data = pd.merge(WikiData,SPData, on='Symbol')
merge_data.set_index('Symbol',inplace=True)
merge_data.head()

and got the following df:

        Sector      Sub-industry    Company     Weight
Symbol              
MMM    Industrials  Conglomerates   MCompany    0.602676
ABT    Health Care  Equipment       Abbott Lab  0.401900
ABBV   Health Care  Pharmaceuticals AbbVie Inc  0.550174
ACN    Info Tech    Consulting      Accenture   0.370650
ATVI   Info Tech    Entertainment   Activision  0.192788

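How can I group identical values of the "Sector" column together? For example, I want all stocks in the "Health Care" sector assigned to an "XLV" variable, and all stocks in "Info Tech" assigned to "XLK".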

1 answer:

Answer 0: (score: 0)

Create a new column by applying a dictionary that maps each sector name to its new label:

labels = { 'Health Care': 'XLV', 'Info Tech': 'XLK' }

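# Series.map looks each sector up in the dictionary; sectors missing from
# `labels` (e.g. 'Industrials') become NaN instead of raising a KeyError
merge_data['new_label'] = merge_data['Sector'].map(labels)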
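
The question also asks for one variable per sector. A minimal sketch of one way to get there is to group on the new label column and collect each group in a dictionary. It assumes a small stand-in for merge_data built from the sample rows in the question; the real frame comes from the asker's scraped data, and the name `groups` is only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for merge_data, using the sample rows from the question
merge_data = pd.DataFrame(
    {
        "Symbol": ["MMM", "ABT", "ABBV", "ACN", "ATVI"],
        "Sector": ["Industrials", "Health Care", "Health Care",
                   "Info Tech", "Info Tech"],
        "Weight": [0.602676, 0.401900, 0.550174, 0.370650, 0.192788],
    }
).set_index("Symbol")

labels = {"Health Care": "XLV", "Info Tech": "XLK"}

# Sectors that are not in `labels` get NaN as their label
merge_data["new_label"] = merge_data["Sector"].map(labels)

# groupby skips NaN keys by default, so unlabeled sectors are left out
groups = {label: frame for label, frame in merge_data.groupby("new_label")}

XLV = groups["XLV"]  # all Health Care rows
XLK = groups["XLK"]  # all Info Tech rows
print(XLV)
```

Keeping the sub-frames in a dictionary avoids creating variables dynamically, and adding another sector only requires a new entry in `labels`.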