Applying NLTK Rake to each row in a DataFrame

Time: 2019-07-01 13:21:28

Tags: python pandas nltk

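I want to apply the Rake function (link) to each row in a DataFrame.

I can apply the function to individual rows, but I can't attach the results to the DataFrame.

Here is what I have so far: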

c6

It just runs indefinitely. All I want is to add a new column named "keywords" to the DataFrame "test", containing the list of keywords from r.get_ranked_phrases().

1 Answer:

Answer 0: (score: 0)

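r.extract_keywords_from_text(x) returns None. It stores the extracted keywords on the Rake object, so call r.get_ranked_phrases() afterwards to retrieve them: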

import pandas as pd
from rake_nltk import Rake

# A single Rake instance is reused for every row.
r = Rake()

df = pd.DataFrame(
    data=['machine learning and fraud detection are a must learn',
          'monte carlo method is great and so is hmm,pca, svm and neural net',
          'clustering and cloud',
          'logistical regression and data management and fraud detection'],
    columns=['Comments'])


def rake_implement(x, r):
    # extract_keywords_from_text() returns None and keeps its results on r,
    # so the ranked phrases are fetched with a second call.
    r.extract_keywords_from_text(x)
    return r.get_ranked_phrases()


df['new_col'] = df['Comments'].apply(lambda x: rake_implement(x, r))
print(df['new_col'])
# Output:
# 0      [must learn, machine learning, fraud detection]
# 1    [monte carlo method, neural net, svm, pca, hmm...
# 2                                  [clustering, cloud]
# 3    [logistical regression, fraud detection, data ...
# Name: new_col, dtype: object
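
To see why the wrapper function matters, here is a minimal sketch that needs neither rake_nltk nor pandas. StubRake is a hypothetical stand-in written only for this illustration: it mimics the two method names used above and splits on " and " instead of running the real RAKE algorithm. The point is only that mapping a None-returning method over the rows gives a list of None, while the wrapper gives the phrases.

```python
class StubRake:
    """Hypothetical stand-in for Rake; NOT the real rake_nltk implementation."""

    def __init__(self):
        self._phrases = []

    def extract_keywords_from_text(self, text):
        # Like the real method, this stores results on the instance
        # and implicitly returns None.
        self._phrases = [p.strip() for p in text.split(" and ")]

    def get_ranked_phrases(self):
        return list(self._phrases)


def rake_implement(text, r):
    # Extract first, then return the stored phrases.
    r.extract_keywords_from_text(text)
    return r.get_ranked_phrases()


comments = ["clustering and cloud", "fraud detection and data management"]
r = StubRake()

# Mapping the None-returning method directly yields only None values.
direct = [r.extract_keywords_from_text(c) for c in comments]

# Mapping the wrapper yields one list of phrases per row.
wrapped = [rake_implement(c, r) for c in comments]

print(direct)   # [None, None]
print(wrapped)  # [['clustering', 'cloud'], ['fraud detection', 'data management']]
```

The same reasoning should carry over to df['Comments'].apply(...): pass a function that returns the phrases, not the extraction method itself.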