Pandas: convert lists of lists into column names and fill in their values

Time: 2020-08-06 18:12:58

Tags: python pandas dataframe data-cleaning

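I have two columns in a pandas DataFrame. The first column holds the keys and the second holds the values, and each cell in both columns is a list.

Like this: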

import pandas as pd
example = pd.DataFrame({
    'col1': [['key1', 'key2', 'key3'], ['key1', 'key4'], ['key1', 'key3', 'key4', 'key5']],
    'col2': [['value1', 'value2', 'value3'], ['value1', 'value4'], ['value1', 'value3', 'value4', 'value5']],
})
print(example)
    col1    col2
0   [key1, key2, key3]  [value1, value2, value3]
1   [key1, key4]    [value1, value4]
2   [key1, key3, key4, key5]    [value1, value3, value4, value5]

I want to turn every possible key into a column and then place each value under its key. The final result should look like this:

    key1      key2    key3     key4    key5
0   value1    value2  value3   NaN     NaN
1   value1    NaN     NaN      value4  NaN
2   value1    NaN     value3   value4  value5
        

1 answer:

Answer 0 (score: 4)

Try using explode so that each row holds a single key/value pair, then reshape the DataFrame.

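# Explode both list columns in parallel: one key/value pair per row
df_new = example.apply(pd.Series.explode)
# Index by (original row, key), then pivot the keys into columns;
# selecting 'col2' keeps the result's columns flat, as in the output below
df_new.set_index('col1', append=True)['col2'].unstack()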

Output:

col1    key1    key2    key3    key4    key5
0     value1  value2  value3     NaN     NaN
1     value1     NaN     NaN  value4     NaN
2     value1     NaN  value3  value4  value5
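
For comparison, here is a minimal sketch of a different route to the same table that skips explode entirely. It is not part of the answer above. It assumes that the keys within each row are unique and that each key list has the same length as its value list. The names `records` and `result` are illustrative only.

```python
import pandas as pd

example = pd.DataFrame({
    'col1': [['key1', 'key2', 'key3'], ['key1', 'key4'], ['key1', 'key3', 'key4', 'key5']],
    'col2': [['value1', 'value2', 'value3'], ['value1', 'value4'], ['value1', 'value3', 'value4', 'value5']],
})

# Pair each row's keys with its values (assumes keys are unique within a row;
# a repeated key would silently keep only its last value)
records = [dict(zip(keys, values))
           for keys, values in zip(example['col1'], example['col2'])]

# Building a DataFrame from a list of dicts fills absent keys with NaN;
# sorting the columns gives a stable key1..key5 order
result = pd.DataFrame(records, index=example.index).sort_index(axis=1)
print(result)
```

This version does not go through an intermediate long-format frame, which may be easier to follow for small inputs. The explode/unstack approach stays closer to pandas' own reshaping tools.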