Python Pandas: count the words in a DataFrame

Time: 2016-07-11 01:26:28

Tags: pandas dataframe python-3.5

I have a large DataFrame named dataframe1. A few sample rows are shown in the table below.

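I want to count the words in the text belonging to each name (for example, all of screen1's text in dataframe1) using the function noun_count(str). The function noun_count(str) is already implemented.

I want to extract all the text that shares the same name in the DataFrame and compute the noun count. Please don't focus on the function noun_count(str) itself; I have already written it.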

Sample data:

 date   text                                     name
 1      I like you hair, do you like it          screen1
 2      beautiful sun and wind                   screen2
 3      today is happy, I want to got school     screen3
 4      good movie                               screen4
 5      thanks god                               screen1
 6      you are my son and I love you            screen2
 7      the company  is good                     screen1
 8      no one can help me, only you             screen2
 9      the book is good and I read it everyday  screen3
 10     water is the source of love              screen4
 11     I like you hair, do you like it          screen1
 12     my love man is leaving                   screen2

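I found that my attempt is wrong and I don't know how to fix it. For example, how do I extract all of the text for one name as a single string and pass it to the function noun_count(str)? Any help would be appreciated, thanks!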

1 Answer:

Answer 0: (score: 1)

I solved it by using the apply() function to compute the counts.

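import pandas as pd

# Stand-in for the poster's own noun_count; it must be defined before the loop.
# Here it counts whitespace-separated words (the original len(str) counted characters).
def noun_count(text):
    words_len = len(text.split())
    return words_len

# Assumes dataframe1 already exists with 'text' and 'name' columns.
for name in dataframe1['name'].unique():
    # Select only the rows that belong to the current name
    dataframe_text = dataframe1[dataframe1['name'] == name]
    # apply() returns one count per row; sum them to get a total per name
    counts = dataframe_text['text'].apply(noun_count)
    print(name, counts.sum())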
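
The question also asked how to pass all of one name's text to the function as a single string. One possible way to do that, sketched here under assumptions, is to group by name and join the texts before calling the function. The noun_count below is a hypothetical stand-in that only counts whitespace-separated words (the poster's real function is not shown), and the DataFrame is rebuilt from the first six sample rows of the question's table.

```python
import pandas as pd

# Sample rows copied from the question's table (first six rows only).
dataframe1 = pd.DataFrame({
    "date": [1, 2, 3, 4, 5, 6],
    "text": [
        "I like you hair, do you like it",
        "beautiful sun and wind",
        "today is happy, I want to got school",
        "good movie",
        "thanks god",
        "you are my son and I love you",
    ],
    "name": ["screen1", "screen2", "screen3", "screen4", "screen1", "screen2"],
})


def noun_count(text):
    # Hypothetical stand-in: counts whitespace-separated words, not nouns.
    return len(text.split())


# Join every text that shares a name into one string per name.
joined = dataframe1.groupby("name")["text"].agg(" ".join)

# Call the counting function once per name on the joined string.
counts_per_name = joined.apply(noun_count)
print(counts_per_name)
```

If per-row counts are enough, dataframe1["text"].apply(noun_count).groupby(dataframe1["name"]).sum() avoids building the joined strings. With this word-counting stand-in it gives the same totals, though a real noun counter might behave differently at the boundaries between joined texts.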