I want to split a DataFrame into rows and columns based on the list items in one of its columns:
Name Currency_Pair
0 Currency1 [ccy_UK_GBX=Minor, ccy_UK_USD=Foreign]
1 Currency2 []
2 Currency3 [ccy_UK_GBP=Major]
4 Currency4 []
Expected output DataFrame:
Name Country CCY Denom
0 Currency1 UK GBX Minor
1 Currency1 UK USD Foreign
2 Currency2 NaN NaN NaN
3 Currency3 UK GBP Major
4 Currency4 NaN NaN NaN
How can I achieve this?
Answer 0: (score: 4)
Consider the following df:
In [238]: df = pd.DataFrame({'Name':['Currency1', 'Currency2', 'Currency3', 'Currency4'], 'Currency_Pair':[['ccy_UK_GBX=Minor', 'ccy_UK_USD=Foreign'], [], ['ccy_UK_GBP=Major'], []]})
In [239]: df
Out[239]:
Name Currency_Pair
0 Currency1 [ccy_UK_GBX=Minor, ccy_UK_USD=Foreign]
1 Currency2 []
2 Currency3 [ccy_UK_GBP=Major]
3 Currency4 []
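Use df.explode together with Series.str.split. explode turns each list item into its own row (an empty list becomes a single NaN row), and the string splits then pull out the individual fields: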
In [242]: df = df.explode('Currency_Pair')
In [244]: df['Country'] = df.Currency_Pair.str.split('_').str[1]
In [245]: df['CCY'] = df.Currency_Pair.str.split('_').str[2].str.split('=').str[0]
In [246]: df['Denom'] = df.Currency_Pair.str.split('=').str[-1]
In [247]: df
Out[247]:
Name Currency_Pair Country CCY Denom
0 Currency1 ccy_UK_GBX=Minor UK GBX Minor
0 Currency1 ccy_UK_USD=Foreign UK USD Foreign
1 Currency2 NaN NaN NaN NaN
2 Currency3 ccy_UK_GBP=Major UK GBP Major
3 Currency4 NaN NaN NaN NaN
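As a possible variation on the answer above, the three fields could also be pulled out in a single pass with Series.str.extract and named capture groups instead of three separate splits. This is only a sketch: the regular expression in `pattern` and the variable names `exploded`, `parts`, and `result` are assumptions made for illustration, and the pattern assumes every non-empty item has the shape prefix_COUNTRY_CCY=DENOM shown in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Currency1', 'Currency2', 'Currency3', 'Currency4'],
    'Currency_Pair': [['ccy_UK_GBX=Minor', 'ccy_UK_USD=Foreign'], [],
                      ['ccy_UK_GBP=Major'], []],
})

# One row per list item; empty lists become a single NaN row.
exploded = df.explode('Currency_Pair').reset_index(drop=True)

# Hypothetical pattern: <prefix>_<Country>_<CCY>=<Denom>.
# Named groups become the column names of the extracted frame.
pattern = r'^[^_]+_(?P<Country>[^_]+)_(?P<CCY>[^=]+)=(?P<Denom>.+)$'
parts = exploded['Currency_Pair'].str.extract(pattern)

# Rows that were NaN (or do not match the pattern) stay NaN in every column.
result = pd.concat([exploded[['Name']], parts], axis=1)
print(result)
```

Items that do not match the assumed shape come back as NaN rather than raising an error, so it may be worth checking for unexpected NaN rows on real data.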
Answer 1: (score: 1)
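Use Series.str.split with the pattern '_|=' (split on either an underscore or an equals sign) and expand=True, so that each piece lands in its own column: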
df2 = df.explode('Currency_Pair').reset_index(drop=True)
new_df = (df2.join(df2['Currency_Pair'].str.split('_|=', expand=True)
                                       .loc[:, 1:]
                                       .set_axis(['Country', 'CCY', 'Denom'],
                                                 axis=1))
             .drop('Currency_Pair', axis=1))
Output:
print(new_df)
Name Country CCY Denom
0 Currency1 UK GBX Minor
1 Currency1 UK USD Foreign
2 Currency2 NaN NaN NaN
3 Currency3 UK GBP Major
4 Currency4 NaN NaN NaN
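To see why this approach slices with .loc[:, 1:] before renaming, it may help to look at the intermediate result of the split on its own. The sketch below uses a small standalone Series (`pairs`, a name chosen for illustration only) and relies on pandas treating a multi-character pattern as a regular expression by default.

```python
import pandas as pd

# Standalone sample, including a missing value like the exploded empty lists.
pairs = pd.Series(['ccy_UK_GBX=Minor', None, 'ccy_UK_GBP=Major'])

# Splitting on '_' or '=' yields four pieces per string, in columns 0..3.
parts = pairs.str.split('_|=', expand=True)
print(parts)

# Column 0 holds the constant 'ccy' prefix, so it is dropped
# before the remaining columns are given meaningful names.
named = parts.loc[:, 1:].set_axis(['Country', 'CCY', 'Denom'], axis=1)
print(named)
```

The missing entry stays missing in every column, which is what produces the all-NaN rows for Currency2 and Currency4 in the output above.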