Thank you very much for your help. I have a pandas DataFrame that looks like this:
index source timestamp value
1 car 1 ['98']
2 bike 2 ['98', '100']
3 car 3 ['65']
4 bike 4 ['100', '120']
5 plane 5 ['20', '12', '30']
What I need is to turn each value in the "value" pandas Series into a new column, so that the output looks like this:
index source timestamp car bike1 bike2 plane1 plane2 plane3
1 car 1 98 Na Na Na Na Na
2 bike 2 Na 98 100 Na Na Na
3 car 3 65 Na Na Na Na Na
4 bike 4 Na 100 120 Na Na Na
5 plane 5 Na Na Na 20 12 30
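The array size will always be 1 for car, 2 for bike, and 3 for plane, and that determines the number of new columns I need in the new DataFrame. What is the best way to achieve this?
Answer 0 (score: 1)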
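First, convert the values to lists:
import ast
# Needed only if the lists are stored as strings such as "['98', '100']"
df['value'] = df['value'].apply(ast.literal_eval)
Then create a dictionary for each row:
# Keys are the source name plus a 1-based position, e.g. 'bike1', 'bike2'
L = [{f'{i}{x+1}': y for x, y in enumerate(j)} for i, j in zip(df['source'], df['value'])]
print(L)
[{'car1': '98'},
 {'bike1': '98', 'bike2': '100'},
 {'car1': '65'},
 {'bike1': '100', 'bike2': '120'},
 {'plane1': '20', 'plane2': '12', 'plane3': '30'}]
Finally, create a DataFrame from these dictionaries and join it to the original df.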
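The code for the last step (building a DataFrame from the list of dictionaries and joining it back to the original) is not shown in the answer. A minimal end-to-end sketch might look like the following. It assumes that the "value" column holds list-like strings, which is why `ast.literal_eval` is applied. The names `wide` and `result` and the choice to drop the original "value" column are illustrative, not taken from the answer.

```python
import ast

import pandas as pd

# Sample data from the question; storing the lists as strings is an assumption.
df = pd.DataFrame(
    {
        "source": ["car", "bike", "car", "bike", "plane"],
        "timestamp": [1, 2, 3, 4, 5],
        "value": [
            "['98']",
            "['98', '100']",
            "['65']",
            "['100', '120']",
            "['20', '12', '30']",
        ],
    },
    index=[1, 2, 3, 4, 5],
)

# Step 1: parse the string representation into real Python lists.
df["value"] = df["value"].apply(ast.literal_eval)

# Step 2: one dictionary per row, keyed by source name plus 1-based position.
L = [
    {f"{src}{pos + 1}": val for pos, val in enumerate(vals)}
    for src, vals in zip(df["source"], df["value"])
]

# Step 3: build a DataFrame aligned on the original index (keys missing
# from a row become NaN), then join it to the original frame.
wide = pd.DataFrame(L, index=df.index)
result = df.drop(columns="value").join(wide)  # dropping "value" is optional

print(result)
```

Note that this produces a column named `car1` rather than `car`; chaining `.rename(columns={"car1": "car"})` onto `result` would match the header requested in the question. The values stay as strings, so a numeric conversion would be a separate step if needed.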