Creating a pandas DataFrame from a nested JSON file with lists

Time: 2018-01-04 09:38:24

Tags: python json pandas dataframe

(A picture of what the data looks like.)

So, I have a JSON file whose data is heavily nested. I only want the words, and I want to create a new DataFrame for each post id. Can anyone help?

1 answer:

Answer 0 (score: 0)

You can use apply with a list comprehension:

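import pandas as pd

# Sample data: each cell holds a list of dicts (possibly empty)
df = pd.DataFrame({'member_info.vocabulary':[[], [{'post_iD':'3913', 'word':'Twisters'},
                                                  {'post_iD':'3911', 'word':'articulate'}]]})

# Collect the 'word' value from every dict in each row's list
df['words'] = df['member_info.vocabulary'].apply(lambda x: [y.get('word') for y in x])
print(df)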

                              member_info.vocabulary                   words
0                                                 []                      []
1  [{'post_iD': '3913', 'word': 'Twisters'}, {'po...  [Twisters, articulate]
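If one row per word is more convenient than a list per row, the nested lists can also be unpacked into a long table. The following is only a sketch on the same sample data; the names `exploded` and `long_df` are made up for illustration, and it assumes pandas 0.25 or later, where `Series.explode` is available.

```python
import pandas as pd

df = pd.DataFrame({'member_info.vocabulary': [[], [{'post_iD': '3913', 'word': 'Twisters'},
                                                   {'post_iD': '3911', 'word': 'articulate'}]]})

# explode() gives every list element its own row; an empty list becomes NaN,
# so dropna() removes the rows that had no vocabulary at all
exploded = df['member_info.vocabulary'].explode().dropna()

# Expand each dict into columns, keeping the index of the original row
long_df = pd.DataFrame(exploded.tolist(), index=exploded.index)
print(long_df)
```

The index of `long_df` still points at the row each word came from, so it can be joined back to the other columns of `df` if needed.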

If each list contains only one element, add str[0] to select the first value of each list:

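# Sample data: each list now holds at most one dict
df = pd.DataFrame({'member_info.vocabulary':[[], [{'post_iD':'3913', 'word':'Twisters'}]]})

# .str[0] takes the first element of each list; an empty list yields NaN
df['words'] = df['member_info.vocabulary'].apply(lambda x: [y.get('word') for y in x]).str[0]
print(df)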

                      member_info.vocabulary     words
0                                         []       NaN
1  [{'post_iD': '3913', 'word': 'Twisters'}]  Twisters
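The question also asks for a separate DataFrame per post id. One possible way to get there, shown here only as a sketch on the sample data from the first example, is to flatten all the dicts into a single table and then split it with `groupby`. The names `records`, `words_df` and `frames_by_post` are hypothetical, and it is assumed that every dict carries both a `post_iD` and a `word` key.

```python
import pandas as pd

df = pd.DataFrame({'member_info.vocabulary': [[], [{'post_iD': '3913', 'word': 'Twisters'},
                                                   {'post_iD': '3911', 'word': 'articulate'}]]})

# Flatten the dicts from all rows into one list of records
records = [entry for row in df['member_info.vocabulary'] for entry in row]
words_df = pd.DataFrame(records, columns=['post_iD', 'word'])

# Build one small DataFrame per post id, keyed by that id
frames_by_post = {post_id: group.reset_index(drop=True)
                  for post_id, group in words_df.groupby('post_iD')}

print(frames_by_post['3913'])
```

Keeping the pieces in a dict avoids creating a separate variable for every post id; in many cases the single `words_df` table is already enough and can simply be filtered by `post_iD`.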