A DataFrame in the following format needs to be transformed to look like "op_df":
ip_df=pd.DataFrame({'class':['I','II','III'],'details':[[{'sec':'A','assigned_to':'tom'},{'sec':'B','assigned_to':'sam'}],[{'sec':'B','assigned_to':'joe'}],[]]})
ip_df:
   class  details
0  I      [{'sec':'A','assigned_to':'tom'},{'sec':'B','assigned_to':'sam'}]
1  II     [{'sec':'B','assigned_to':'joe'}]
2  III    []
The desired output DataFrame is:
op_df:
   class  sec  assigned_to
0  I      A    tom
1  I      B    sam
2  II     B    joe
3  III    NaN  NaN
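How can I turn each dictionary in the "details" column into a new row, with the dictionary keys as column names and the dictionary values as the corresponding column values?
I tried: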
ip_df.join(ip_df['details'].apply(pd.Series))
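but I could not get the result into the shape of "op_df".
Answer 0: (score: 1)
I'm sure there is a better way, but I had to unpack your details lists and build the DataFrame as follows: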
import pandas as pd

dict_values = {'class':['I','II','III'],'details':[[{'sec':'A','assigned_to':'tom'},{'sec':'B','assigned_to':'sam'}],[{'sec':'B','assigned_to':'joe'}],[]]}
all_values = []
for cl, detail in zip(dict_values['class'], dict_values['details']):
    if detail:
        # One output row per dictionary in the list
        for innerdict in detail:
            row = {'class': cl}
            row.update(innerdict)
            all_values.append(row)
    else:
        # Empty list: keep the class; the other columns become NaN
        all_values.append({'class': cl})
op_df = pd.DataFrame(all_values)
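
If you would rather start from the existing ip_df than from the raw dictionary, the same reshaping can probably be done with DataFrame.explode. This is only a sketch: it assumes pandas 0.25 or later (where explode was introduced), and flatten_details is a hypothetical helper name of mine, not something pandas provides.

```python
import pandas as pd

ip_df = pd.DataFrame({
    'class': ['I', 'II', 'III'],
    'details': [
        [{'sec': 'A', 'assigned_to': 'tom'}, {'sec': 'B', 'assigned_to': 'sam'}],
        [{'sec': 'B', 'assigned_to': 'joe'}],
        [],
    ],
})


def flatten_details(df, col):
    """Hypothetical helper: one row per dict in `col`, dict keys become columns."""
    # explode() emits one row per list element; an empty list becomes a single NaN row
    exploded = df.explode(col).reset_index(drop=True)
    # Replace the NaN placeholders with empty dicts so every record is a dict
    records = [d if isinstance(d, dict) else {} for d in exploded[col]]
    # A list of dicts becomes a frame whose columns are the dict keys
    expanded = pd.DataFrame(records, index=exploded.index)
    return pd.concat([exploded.drop(columns=[col]), expanded], axis=1)


op_df = flatten_details(ip_df, 'details')
print(op_df)
```

The empty list for class III survives as a row because explode turns it into NaN, and the empty dict substituted for it leaves sec and assigned_to missing.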