I want to reformat a nested dictionary before writing it out to CSV. My nested dictionary:
review = {'Q1': {'Question': 'question wording','Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'} ,'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
'Q2': {'Question': 'question wording','Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'} ,'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}}}
So far I have tried:
import pandas as pd

my_df = pd.DataFrame(review)  # one column per question (Q1, Q2)
my_df = my_df.unstack()       # Series indexed by (question, field)
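That gives me the result below, where the nested dictionaries are still not expanded, which is not how I want it to end up: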
Q1  Answer      {'Part 1': 'Answer part one', 'Part 2': 'Answe...
    Proof       {'Part 1': 'The proof part one', 'Part 2': 'Th...
    Question                                     question wording
Q2  Answer      {'Part 1': 'Answer part one', 'Part 2': 'Answe...
    Proof       {'Part 1': 'The proof part one', 'Part 2': 'Th...
    Question                                     question wording
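So I need to melt/unstack/pivot/expand/other_manipulation_word the nested dictionaries in the DataFrame.
I have used this question as guidance, but could not apply it to my own case: Expand pandas dataframe column of dict into dataframe columns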
Answer 0 (score: 2)
Here is one possible solution:
1) Create the initial DataFrame using orient='index', so that each question becomes a row
df = pd.DataFrame.from_dict(review, orient='index')
2)使用Index.repeat
,Series.str.len
和DataFrame.loc
DataFrame
的形状
df_new = df.loc[df.index.repeat(df.Answer.str.len())]
3) Fix the 'Answer' and 'Proof' columns by passing their values to the DataFrame constructor and taking the values of stack
df_new['Answer'] = pd.DataFrame(df.Answer.tolist()).stack().values
df_new['Proof'] = pd.DataFrame(df.Proof.tolist()).stack().values
print(df_new)
            Question           Answer               Proof
Q1  question wording  Answer part one  The proof part one
Q1  question wording    Answer part 2    The proof part 2
Q2  question wording  Answer part one  The proof part one
Q2  question wording    Answer part 2    The proof part 2
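For convenience, the three steps can be put together into one script, ending with the CSV export the question asks about. This is only a sketch under a few assumptions that are not stated in the thread: every question has the same parts in both 'Answer' and 'Proof' (otherwise the stacked values would not line up with the repeated rows), the .copy() call is my addition to avoid assigning into a slice, and to_csv() is called without a path so it returns the CSV text instead of writing a file.

```python
import pandas as pd

review = {
    'Q1': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
    'Q2': {'Question': 'question wording',
           'Answer': {'Part 1': 'Answer part one', 'Part 2': 'Answer part 2'},
           'Proof': {'Part 1': 'The proof part one', 'Part 2': 'The proof part 2'}},
}

# 1) One row per question; 'Answer' and 'Proof' still hold dictionaries.
df = pd.DataFrame.from_dict(review, orient='index')

# 2) Repeat each row once per entry in its 'Answer' dictionary.
#    .copy() is an addition here so the assignments below act on a new frame.
df_new = df.loc[df.index.repeat(df['Answer'].str.len())].copy()

# 3) Expand the dictionaries into columns, then stack them back into one
#    flat sequence of values in row order (assumes equal parts everywhere).
df_new['Answer'] = pd.DataFrame(df['Answer'].tolist()).stack().values
df_new['Proof'] = pd.DataFrame(df['Proof'].tolist()).stack().values

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df_new.to_csv()
print(csv_text)
```

Passing a file name to to_csv would write the same content to disk instead.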