Error when using nltk.classify.apply_features

Time: 2015-01-21 21:41:37

Tags: python nltk sentiment-analysis

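I am new to the NLTK Python library and am using it to write my first program, which analyzes tweets. The get_features function below returns the features (a dictionary). Using these features, I have to create a training set. But when I pass the get_features function and the tweets (a list of tuples) to build the training set, it fails with the error "'dict' object is not callable". Something is wrong with how I create the training set.

Am I passing the right types of values to nltk.classify.apply_features()?

I also tried nltk.classify.util.apply_features(), but it did not help. Can someone tell me where I am going wrong?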

Error:
Traceback (most recent call last):
  File "C:/Users/sanjiv/PycharmProjects/NLP/twitterAnalysis.py", line 98, in <module>
    ta.train_set(store_features,ta.twitter())
  File "C:/Users/sanjiv/PycharmProjects/NLP/twitterAnalysis.py", line 88, in train_set
    print train_set
  File "C:\Python27\lib\site-packages\nltk\compat.py", line 487, in wrapper
    return method(self).encode('ascii', 'backslashreplace')
  File "C:\Python27\lib\site-packages\nltk\compat.py", line 475, in wrapper
    return transliterate(method(self))
  File "C:\Python27\lib\site-packages\nltk\compat.py", line 487, in wrapper
    return method(self).encode('ascii', 'backslashreplace')
  File "C:\Python27\lib\site-packages\nltk\util.py", line 664, in __repr__
    for elt in self:
  File "C:\Python27\lib\site-packages\nltk\util.py", line 845, in iterate_from
    try: yield self._func(self._lists[0][index])
  File "C:\Python27\lib\site-packages\nltk\classify\util.py", line 65, in lazy_func
    return (feature_func(labeled_token[0]), labeled_token[1])
TypeError: 'dict' object is not callable

The code is:

import nltk
import operator
from nltk.classify.util import apply_features

class TwitterAnalysis:
    def __init__(self):
        pass

    @staticmethod
    def twitter():
        tweets = []
        tweets_word = []
        new_words = []
        # Positive tweets
        pos_tweets = [('I love this movie', 'positive'), ('This view is amazing', 'positive'),
                      ('Loving this morning', 'positive'),
                      ('he is my best friend', 'positive')]

        #Negative tweets
        neg_tweets = [('I do not like this car', 'negative'), ('This view is horrible', 'negative'),
                      ('I am feeling lazy this morning', 'negative'), ('He is my enemy', 'negative')]

        #Combining both types of tweets into a single list and dropping words shorter than 3 characters

        combined_lst = [pos_tweets + neg_tweets]
        #list comprehension

        for (words, sentiment) in pos_tweets + neg_tweets:
            new_words = []

            for each in words.split():
                if len(each) >= 3:
                    new_words.append(each.lower())
            #print new_words
            tweets.append((new_words, sentiment))
        #contains tweet with sentiment
        return tweets


    def get_features(self, document, all_words,tweets):
        document_features = set(document)
        features = {}
        for each in all_words:
            features['contains(%s)' % each] = (each in document_features)
        #feature set
        return features

    def train_set(self,get_features,tweets):
        train_set = nltk.classify.util.apply_features(get_features,tweets)
        print train_set

1 answer:

Answer 0 (score: 0)

This is from the source code of nltk.classify.util.apply_features (reading the source usually helps!):

def apply_features(feature_func, toks, labeled=None):
    """
    Use the ``LazyMap`` class to construct a lazy list-like
    object that is analogous to ``map(feature_func, toks)``.  In
    particular, if ``labeled=False``, then the returned list-like
    object's values are equal to::

        [feature_func(tok) for tok in toks]

    If ``labeled=True``, then the returned list-like object's values
    are equal to::

        [(feature_func(tok), label) for (tok, label) in toks]"""
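
To make the labeled=True case concrete, here is a small sketch that mimics the list comprehension from the docstring in plain Python, without calling NLTK. The names labeled_map and word_count are made up for illustration, and store_features is assumed to be an already-built dictionary (its definition is not shown in the question; only the traceback mentions it). Under that assumption, passing it where the feature function is expected reproduces the same TypeError.

```python
def labeled_map(feature_func, toks):
    # Eager stand-in for apply_features(feature_func, toks, labeled=True),
    # following the list comprehension quoted in the docstring.
    return [(feature_func(tok), label) for (tok, label) in toks]


def word_count(words):
    # Hypothetical feature function: takes ONE token (a list of words).
    return {'n_words': len(words)}


toks = [(['love', 'this', 'movie'], 'positive')]

# A callable works: it is applied to the first item of each pair.
ok = labeled_map(word_count, toks)

# Assumption: store_features is a dict of features, not a function.
store_features = {'contains(love)': True}
try:
    labeled_map(store_features, toks)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

The point is that the first argument has to be something that can be called on a single token; the result of calling a feature function is not a substitute for the function itself.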

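Notice that the get_features function you implemented requires 3 inputs, whereas apply_features itself takes only 2 (feature_func and toks, plus the optional labeled=True/None) and calls feature_func with a single token.

I don't know exactly what result you are trying to extract, but you may want to give get_features some default arguments or have it call methods on self.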
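
One possible way to follow that suggestion is to bind the extra argument ahead of time with functools.partial, so that the resulting callable needs only the token. This is a sketch under assumptions, not the asker's final code: get_features is rewritten here as a plain two-argument function, the sample tweets are shortened versions of the ones in the question, and the training set is built with the eager list comprehension from the docstring so the snippet runs without NLTK.

```python
from functools import partial


def get_features(document, all_words):
    # Same idea as the question's get_features, minus the unused arguments.
    document_words = set(document)
    return {'contains(%s)' % w: (w in document_words) for w in all_words}


# Already tokenized and lowercased, like the output of twitter().
tweets = [
    (['love', 'this', 'movie'], 'positive'),
    (['this', 'view', 'horrible'], 'negative'),
]

# Vocabulary over all tweets; sorted only to make the order deterministic.
all_words = sorted({w for words, _ in tweets for w in words})

# Bind all_words so the callable takes a single token, as apply_features expects.
feature_func = partial(get_features, all_words=all_words)

# Eager equivalent of the labeled=True case described in the docstring.
train_set = [(feature_func(tok), label) for (tok, label) in tweets]

print(train_set[0])
```

With NLTK installed, nltk.classify.util.apply_features(feature_func, tweets) should produce the same pairs lazily, going by the docstring above.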