Converting row and column values to a dict and a DataFrame in pandas

Time: 2016-11-18 02:19:44

Tags: python-3.x pandas dictionary dataframe series

Python noob here.

My DataFrame people has two columns, name and text.

  name       text
0 Obama      Obama was the 44th president of the...
1 Trump      Donald J. Trump ran as a republican...

I need to do some exploratory analysis on Obama only.

obama = people[people['name'] == 'Obama'].copy()
obama.text

35817    Obama was the 44th president of the unit...
Name: text, dtype: object

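How can I convert the text into a dict stored in a new column, where the keys are the words and the values are the word counts?
Example: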

   name       text                                  dictionary
0 Obama      Obama was the 44th president of the... {'Obama':1, 'the':2,...}

Once that is done, how can I convert the dictionary into a separate DataFrame? Expected:

   word   count
0  Obama  1
1  the    2

1 Answer:

Answer 0 (score: 0):

You can use the Counter object from the collections module:

import collections

people['dictionary'] = people.text.apply(lambda x: dict(collections.Counter(x.split())))
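As a minimal sketch of how this behaves, the example below runs the same Counter approach on a small made-up frame. The sample rows and the obama_counts name are illustrative assumptions, not data from the question. Note that str.split() splits on whitespace only, so counting is case-sensitive and punctuation stays attached to words.

```python
import collections

import pandas as pd

# Hypothetical sample data standing in for the real `people` frame.
people = pd.DataFrame({
    "name": ["Obama", "Trump"],
    "text": [
        "Obama was the 44th president of the United States",
        "Donald J. Trump ran as a republican",
    ],
})

# Split each text on whitespace and count how often every token occurs.
people["dictionary"] = people["text"].apply(
    lambda x: dict(collections.Counter(x.split()))
)

# Pull out the counts for the Obama row (obama_counts is an illustrative name).
obama_counts = people.loc[people["name"] == "Obama", "dictionary"].iloc[0]
print(obama_counts)
```

If "The" and "the" should count as the same word, lowercasing the text before splitting (for example x.lower().split()) is one option.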

To convert one of the dictionaries into a DataFrame:

import pandas as pd
dictionary = people['dictionary'].iloc[0]  # first row by position, regardless of index labels
pd.DataFrame({'word': list(dictionary.keys()), 'count': list(dictionary.values())})
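An alternative sketch builds the frame from the dictionary's (word, count) pairs and sorts it so the most frequent words come first. The sample dictionary and the word_counts name are assumptions for illustration; the sorting step is an addition the question did not ask for.

```python
import pandas as pd

# Hypothetical word counts, shaped like one entry of the 'dictionary' column.
dictionary = {"Obama": 1, "was": 1, "the": 2}

# Each (key, value) pair becomes one row with the given column names.
word_counts = pd.DataFrame(list(dictionary.items()), columns=["word", "count"])

# Most frequent words first, with a fresh 0-based index.
word_counts = word_counts.sort_values("count", ascending=False).reset_index(drop=True)
print(word_counts)
```

Building from items() keeps each word and its count paired in one step instead of relying on keys() and values() lining up.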