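Error when mapping a keyword onto a DataFrame in IPython

Time: 2016-04-16 00:54:30

Tags: dictionary twitter lambda dataframe ipython

I have collected tweets and want to add a column named taliban that records (True/False) whether the word (Taliban/taliban) appears in each tweet. Please see the code and the errors below.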

import pandas as pd
from pandas.io.json import json_normalize

# tweet_data holds the collected tweets (defined elsewhere)
tweets = pd.DataFrame()
tweets = json_normalize(tweet_data)[["text", "lang", "place.country", "created_at",
                                     "coordinates", "user.location"]]

The function that checks for the keyword is defined as follows:

#definition for collecting keyword.
def word_in_text(word, text):
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False

Then I run this statement:

tweets['Taliban'] = tweets['text'].apply(lambda tweet: word_in_text('Taliban', tweet))

I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-16-17bcf7fbf866> in <module>()
----> 1 tweets['Taliban'] = tweets['text'].apply(lambda tweet: word_in_text('Taliban', tweet))

/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
   2167             values = lib.map_infer(values, lib.Timestamp)
   2168 
-> 2169         mapped = lib.map_infer(values, f, convert=convert_dtype)
   2170         if len(mapped) and isinstance(mapped[0], Series):
   2171             from pandas.core.frame import DataFrame

pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()

<ipython-input-16-17bcf7fbf866> in <lambda>(tweet)
----> 1 tweets['Taliban'] = tweets['text'].apply(lambda tweet:  word_in_text('Taliban', tweet))

<ipython-input-15-56228996ad5c> in word_in_text(word, text)
      2 def word_in_text(word, text):
      3     word = word.lower()
----> 4     text = text.lower()
      5     match = re.search(word, text)
      6     if match:

AttributeError: 'float' object has no attribute 'lower'
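
A likely cause, assuming the text field is missing for some entries in tweet_data: pandas stores a missing value as the float NaN, and a float has no lower method. The sketch below reproduces the same message without pandas, under Python 3; the list `texts` and the list `errors` are hypothetical names used only for illustration.

```python
# Hypothetical sample: one tweet text and one missing value (NaN is a float).
texts = ["The Taliban said ...", float("nan")]

errors = []
for text in texts:
    try:
        text.lower()  # works for str, fails for float
    except AttributeError as exc:
        # Reached only for the NaN entry.
        errors.append(str(exc))

print(errors)
```

If this is the cause, counting the non-string entries in tweets['text'] should show at least one.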


I tried this modification:

#definition for collecting keyword.
def word_in_text(word, text):
    word = word.lower()
    if type(text) is str:
        return text.lower()

        match = re.search(word, text)
    if match:
        return True
    return False

Used with:

tweets['taliban'] = tweets['text'].apply(lambda tweet: word_in_text('taliban', tweet))

But this gives the error:

UnboundLocalError: local variable 'match' referenced before assignment
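
A minimal sketch of one way to avoid both errors, assuming Python 3 and assuming that a non-string value (such as NaN for a missing text) should simply count as no match. In the modified version, whenever text is not a str the body of the first if is skipped, so match is never assigned before `if match:` runs, which would explain the UnboundLocalError; and when text is a str the function returns the lowercased text instead of True/False. The use of `re.escape` is an extra assumption here, namely that the keyword is meant literally and not as a regular expression.

```python
import re

def word_in_text(word, text):
    """Return True if word occurs in text (case-insensitive), else False."""
    # Missing tweets may arrive as float NaN, so treat any non-string as no match.
    if not isinstance(text, str):
        return False
    # re.escape matches the keyword literally; re.IGNORECASE replaces the lower() calls.
    return re.search(re.escape(word), text, re.IGNORECASE) is not None
```

It can be applied as before, with tweets['text'].apply(lambda tweet: word_in_text('taliban', tweet)). On Python 2, which the traceback above indicates, tweet text decoded from JSON is typically unicode rather than str, so the check would need basestring in place of str.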

0 answers:

No answers yet