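I have collected tweets and want to add a column named taliban that maps (True/False) whether the word (Taliban/taliban) appears in each tweet. Please see the code and its error below.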
import pandas as pd
from pandas.io.json import json_normalize

tweets = pd.DataFrame()
tweets = json_normalize(tweet_data)[["text", "lang", "place.country", "created_at",
                                     "coordinates", "user.location"]]
The function that checks for the word is defined as follows:
#definition for collecting keyword.
def word_in_text(word, text):
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False
Then I run this statement:
tweets['Taliban'] = tweets['text'].apply(lambda tweet: word_in_text('Taliban', tweet))
I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-16-17bcf7fbf866> in <module>()
----> 1 tweets['Taliban'] = tweets['text'].apply(lambda tweet: word_in_text('Taliban', tweet))
/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
2167 values = lib.map_infer(values, lib.Timestamp)
2168
-> 2169 mapped = lib.map_infer(values, f, convert=convert_dtype)
2170 if len(mapped) and isinstance(mapped[0], Series):
2171 from pandas.core.frame import DataFrame
pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()
<ipython-input-16-17bcf7fbf866> in <lambda>(tweet)
----> 1 tweets['Taliban'] = tweets['text'].apply(lambda tweet: word_in_text('Taliban', tweet))
<ipython-input-15-56228996ad5c> in word_in_text(word, text)
2 def word_in_text(word, text):
3 word = word.lower()
----> 4 text = text.lower()
5 match = re.search(word, text)
6 if match:
AttributeError: 'float' object has no attribute 'lower'
I tried this modification:
#definition for collecting keyword.
def word_in_text(word, text):
    word = word.lower()
    if type(text) is str:
        return text.lower()
        match = re.search(word, text)
    if match:
        return True
    return False
applied with:
tweets['taliban'] = tweets['text'].apply(lambda tweet: word_in_text('taliban', tweet))
but it gives this error:
UnboundLocalError: local variable 'match' referenced before assignment
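
For reference, here is a minimal sketch of what the function seems intended to do. It rests on two assumptions that are not confirmed above: that it runs on Python 3 (the traceback shows Python 2.7, where the type check would also need to accept `unicode`), and that the float values in `text` are NaN placeholders for tweets with no text. The `samples` list is hypothetical data standing in for `tweets['text']`.

```python
import re

def word_in_text(word, text):
    """Return True if `word` occurs in `text`, ignoring case.

    Non-string values (for example a float NaN for a missing tweet text)
    count as "no match" instead of raising AttributeError.
    """
    if not isinstance(text, str):
        return False
    # re.escape matches the keyword literally; re.IGNORECASE replaces the .lower() calls.
    return re.search(re.escape(word), text, re.IGNORECASE) is not None

# Hypothetical sample values standing in for tweets['text']
samples = ["The Taliban issued a statement", "nothing relevant", float("nan"), None]
flags = [word_in_text("taliban", s) for s in samples]
print(flags)  # [True, False, False, False]
```

A function like this can be passed to the same `tweets['text'].apply(...)` call as before. The early `return False` also sidesteps the UnboundLocalError, because `match` is never read on a path where it was not assigned. If a custom function is not required, `tweets['text'].str.contains('taliban', case=False, na=False)` is a vectorized alternative in pandas that likewise treats missing values as False.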