Twitter data cannot be parsed from a text file

Time: 2016-04-11 03:36:58

Tags: python pandas twitter

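I collect Twitter data into a txt file via streaming, and I use this file in an IPython notebook for filtering and various queries. I have found that sometimes, when the data file is large, the command gets stuck around 'text', one of the fields in the Twitter data. I need a way to process the data so that I don't get stuck. Here is what happens: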

tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))

This is the output:

AttributeError                            Traceback (most recent call last)
<ipython-input-34-444b712d99dc> in <module>()
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))

/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
   2167             values = lib.map_infer(values, lib.Timestamp)
   2168
-> 2169         mapped = lib.map_infer(values, f, convert=convert_dtype)
   2170         if len(mapped) and isinstance(mapped[0], Series):
   2171             from pandas.core.frame import DataFrame

pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()

<ipython-input-34-444b712d99dc> in <lambda>(tweet)
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))

<ipython-input-33-0ee00dabf341> in word_in_text(word, text)
      1 def word_in_text(word, text):
      2     word = word.lower()
----> 3     text = text.lower()
      4     match = re.search(word, text)
      5     if match:

/usr/lib64/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
   2358                 return self[name]
   2359             raise AttributeError("'%s' object has no attribute '%s'" %
-> 2360                                  (type(self).__name__, name))
   2361
   2362     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'lower'

I defined word_in_text as follows:

    import re

    def word_in_text(word, text):
        # Compare case-insensitively by lowercasing both strings.
        word = word.lower()
        text = text.lower()
        match = re.search(word, text)
        if match:
            return True
        return False

1 answer:

Answer 0: (score: 0)

The function is defined as follows:

    import re

    def word_in_text(word, text):
        # Compare case-insensitively by lowercasing both strings.
        word = word.lower()
        text = text.lower()
        match = re.search(word, text)
        if match:
            return True
        return False
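
The traceback shows that text.lower() was called on a DataFrame. That points at the lambda: it passes tweets (the whole DataFrame) to word_in_text instead of the per-row value tweet. Below is a minimal sketch of one possible fix, not a definitive one. It assumes Python 3 and pandas; the sample DataFrame is hypothetical and only stands in for the collected tweets. The isinstance check and re.escape are additions that are not in the original function: the first guards against rows whose text is missing, and the second treats the keyword as a literal string rather than a regular expression.

```python
import re

import pandas as pd


def word_in_text(word, text):
    """Return True if `word` occurs in `text`, ignoring case."""
    # Assumption: rows without a 'text' value arrive as None or NaN (a float),
    # so anything that is not a string is treated as "no match".
    if not isinstance(text, str):
        return False
    # re.escape makes the keyword match literally instead of as a regex pattern.
    return re.search(re.escape(word), text, flags=re.IGNORECASE) is not None


# Hypothetical sample data standing in for the collected tweets.
tweets = pd.DataFrame(
    {"text": ["ISIS in the news", "nothing here", None, "about isis"]}
)

# Pass the per-row value `tweet`, not the whole DataFrame `tweets`.
tweets_ISIS = tweets["text"].apply(lambda tweet: word_in_text("ISIS", tweet))

# Vectorized alternative: na=False maps missing text to False.
tweets_ISIS_vectorized = tweets["text"].str.contains(
    "ISIS", case=False, na=False, regex=False
)
```

Both results are boolean Series aligned with tweets, so either can be used as a mask, for example tweets[tweets_ISIS]. The str.contains form avoids calling a Python function per row, which may help on large files.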