A picture of what the data looks like
So, I have a JSON file whose data is nested. I only want the words, and I want to create a new DataFrame for each post id. Can anyone help?
Answer 0 (score: 0)
You can use apply with a list comprehension:
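import pandas as pd

# Sample data: each row holds a list of vocabulary dicts (possibly empty)
df = pd.DataFrame({'member_info.vocabulary': [[], [{'post_iD': '3913', 'word': 'Twisters'},
                                                   {'post_iD': '3911', 'word': 'articulate'}]]})
# Collect the 'word' value from every dict in each row's list
df['words'] = df['member_info.vocabulary'].apply(lambda x: [y.get('word') for y in x])
print(df)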
                              member_info.vocabulary                   words
0                                                 []                      []
1  [{'post_iD': '3913', 'word': 'Twisters'}, {'po...  [Twisters, articulate]
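If some rows hold a missing value instead of a list, the lambda above would raise a TypeError. That is an assumption about the real file, since the sample only contains lists. A minimal sketch of a guard, where extract_words is a hypothetical helper name and not part of pandas:

```python
import pandas as pd

def extract_words(items):
    """Return the 'word' values from a list of dicts; non-list cells yield []."""
    # Assumption: a missing cell shows up as None or NaN rather than a list
    if not isinstance(items, list):
        return []
    return [item.get('word') for item in items if isinstance(item, dict)]

df = pd.DataFrame({'member_info.vocabulary': [
    [],
    None,
    [{'post_iD': '3913', 'word': 'Twisters'}, {'post_iD': '3911', 'word': 'articulate'}],
]})
df['words'] = df['member_info.vocabulary'].apply(extract_words)
words_as_lists = df['words'].tolist()
print(words_as_lists)
```

Returning an empty list for missing cells keeps the column uniform, so later steps do not need to special-case those rows.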
If each list contains only one element, append str[0] to select the first value of each list (empty lists become NaN):
df = pd.DataFrame({'member_info.vocabulary': [[], [{'post_iD': '3913', 'word': 'Twisters'}]]})
# .str[0] takes the first element of each list; an empty list gives NaN
df['words'] = df['member_info.vocabulary'].apply(lambda x: [y.get('word') for y in x]).str[0]
print(df)
                      member_info.vocabulary     words
0                                         []       NaN
1  [{'post_iD': '3913', 'word': 'Twisters'}]  Twisters
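The question also asks for a separate DataFrame per post id, which the snippets above do not cover. One possible sketch is to flatten every dict into a single table and then split it with groupby. The names records, flat and frames_by_post are illustrative, and the sample data is the same as in the answer:

```python
import pandas as pd

df = pd.DataFrame({'member_info.vocabulary': [
    [],
    [{'post_iD': '3913', 'word': 'Twisters'}, {'post_iD': '3911', 'word': 'articulate'}],
]})

# Flatten the dicts from all rows into one list of records
records = [entry for row in df['member_info.vocabulary'] for entry in row]
flat = pd.DataFrame(records, columns=['post_iD', 'word'])

# Build one small DataFrame per post id, keyed by that id
frames_by_post = {
    post_id: group.reset_index(drop=True)
    for post_id, group in flat.groupby('post_iD')
}
print(frames_by_post['3913'])
```

A dict keyed by post id keeps the per-post frames easy to look up. If the real file has missing cells instead of lists, the flattening step would need a guard first.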