Split a DataFrame based on list items in a column

Time: 2020-12-31 16:33:58

Tags: python pandas dataframe

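I want to split a DataFrame into one row per item of the lists stored in a column, and parse each item into separate columns. Input DataFrame: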

               Name     Currency_Pair
0         Currency1     [ccy_UK_GBX=Minor, ccy_UK_USD=Foreign]
1         Currency2     []
2         Currency3     [ccy_UK_GBP=Major]
3         Currency4     []

Desired output DataFrame:

               Name     Country CCY     Denom
0         Currency1     UK      GBX     Minor
1         Currency1     UK      USD     Foreign
2         Currency2     NaN     NaN     NaN
3         Currency3     UK      GBP     Major
4         Currency4     NaN     NaN     NaN

How can I achieve this?

2 answers:

Answer 0 (score: 4)

Consider the following df:

In [238]: df = pd.DataFrame({'Name':['Currency1', 'Currency2', 'Currency3', 'Currency4'], 'Currency_Pair':[['ccy_UK_GBX=Minor', 'ccy_UK_USD=Foreign'], [], ['ccy_UK_GBP=Major'], []]})

In [239]: df
Out[239]: 
        Name                           Currency_Pair
0  Currency1  [ccy_UK_GBX=Minor, ccy_UK_USD=Foreign]
1  Currency2                                      []
2  Currency3                      [ccy_UK_GBP=Major]
3  Currency4                                      []

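Use df.explode together with Series.str.split. explode gives each list item its own row (empty lists become NaN), and the split calls then pull the country, currency code, and denomination out of each item: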

In [242]: df = df.explode('Currency_Pair')

In [244]: df['Country'] = df.Currency_Pair.str.split('_').str[1]

In [245]: df['CCY'] = df.Currency_Pair.str.split('_').str[2].str.split('=').str[0]

In [246]: df['Denom'] = df.Currency_Pair.str.split('=').str[-1]

In [247]: df
Out[247]: 
        Name       Currency_Pair Country  CCY    Denom
0  Currency1    ccy_UK_GBX=Minor      UK  GBX    Minor
0  Currency1  ccy_UK_USD=Foreign      UK  USD  Foreign
1  Currency2                 NaN     NaN  NaN      NaN
2  Currency3    ccy_UK_GBP=Major      UK  GBP    Major
3  Currency4                 NaN     NaN  NaN      NaN
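The three chained splits above can also be written as a single regular expression. The following is only a sketch, not part of the original answer: it assumes every non-empty item has the form ccy_<Country>_<CCY>=<Denom>, and the names pattern, exploded, parts, and result are made up for illustration. It uses Series.str.extract with named groups, so the group names become the column names.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Currency1', 'Currency2', 'Currency3', 'Currency4'],
    'Currency_Pair': [['ccy_UK_GBX=Minor', 'ccy_UK_USD=Foreign'], [],
                      ['ccy_UK_GBP=Major'], []],
})

# One row per list item; empty lists turn into a single NaN row.
exploded = df.explode('Currency_Pair').reset_index(drop=True)

# Assumed item layout: ccy_<Country>_<CCY>=<Denom> (hypothetical pattern).
pattern = r'^ccy_(?P<Country>[^_]+)_(?P<CCY>[^=]+)=(?P<Denom>.+)$'

# Named groups become columns; rows that are NaN or do not match stay NaN.
parts = exploded['Currency_Pair'].str.extract(pattern)

# Both frames share the fresh 0..n-1 index, so concat aligns row by row.
result = pd.concat([exploded[['Name']], parts], axis=1)
print(result)
```

Compared with positional splitting, an item that does not fit the assumed layout yields NaN in all three columns rather than a partially filled row.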

Answer 1 (score: 1)

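Use Series.str.split with the pattern '_|=' and expand=True, which splits on either delimiter and returns the pieces as separate columns: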

df2 = df.explode('Currency_Pair').reset_index(drop=True)
new_df = (df2.join(df2['Currency_Pair'].str.split('_|=', expand=True)
                                       .loc[:, 1:]
                                       .set_axis(['Country', 'CCY', 'Denom'],  
                                                 axis=1))
             .drop('Currency_Pair', axis=1))

Output:

print(new_df)

        Name Country  CCY    Denom
0  Currency1      UK  GBX    Minor
1  Currency1      UK  USD  Foreign
2  Currency2     NaN  NaN      NaN
3  Currency3      UK  GBP    Major
4  Currency4     NaN  NaN      NaN
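A note on the reset_index(drop=True) call in this answer: after explode, rows that came from the same list share an index label, and DataFrame.join aligns on index labels, so joining without resetting would likely pair every duplicate with every other duplicate. The sketch below illustrates this on the same data; the names split_parts, exploded, naive, and clean are hypothetical and used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Currency1', 'Currency2', 'Currency3', 'Currency4'],
    'Currency_Pair': [['ccy_UK_GBX=Minor', 'ccy_UK_USD=Foreign'], [],
                      ['ccy_UK_GBP=Major'], []],
})

def split_parts(frame):
    """Split each item on '_' or '=' and keep the last three pieces."""
    parts = frame['Currency_Pair'].str.split('_|=', expand=True).loc[:, 1:]
    parts.columns = ['Country', 'CCY', 'Denom']
    return parts

# Without reset_index the exploded index is 0, 0, 1, 2, 3.
exploded = df.explode('Currency_Pair')
naive = exploded.join(split_parts(exploded))

# With a fresh unique index every row matches exactly one row of parts.
df2 = exploded.reset_index(drop=True)
clean = df2.join(split_parts(df2)).drop('Currency_Pair', axis=1)

print(len(naive), len(clean))
```

Here the duplicated label 0 matches two rows on each side, so the naive join produces extra rows for Currency1, while the version with a reset index keeps one row per list item.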