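I collect Twitter data from the streaming API into a txt file, and I use this file for filtering and running various queries in an IPython notebook. I have found that sometimes, when the data file is large, the command gets stuck on 'text', one of the fields in the Twitter data. I need a way to process the data so that it does not get stuck. I am pasting what happens below.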
tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
This is the output:
AttributeError Traceback (most recent call last)
<ipython-input-34-444b712d99dc> in <module>()
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
/usr/lib64/python2.7/site-packages/pandas/core/series.pyc in apply(self, func, convert_dtype, args, **kwds)
2167 values = lib.map_infer(values, lib.Timestamp)
2168
-> 2169 mapped = lib.map_infer(values, f, convert=convert_dtype)
2170 if len(mapped) and isinstance(mapped[0], Series):
2171 from pandas.core.frame import DataFrame
pandas/src/inference.pyx in pandas.lib.map_infer (pandas/lib.c:62578)()
<ipython-input-34-444b712d99dc> in <lambda>(tweet)
----> 1 tweets_ISIS = tweets['text'].apply(lambda tweet: word_in_text('ISIS', tweets))
<ipython-input-33-0ee00dabf341> in word_in_text(word, text)
1 def word_in_text(word, text):
2 word = word.lower()
----> 3 text = text.lower()
4 match = re.search(word, text)
5 if match:
/usr/lib64/python2.7/site-packages/pandas/core/generic.pyc in __getattr__(self, name)
2358 return self[name]
2359 raise AttributeError("'%s' object has no attribute '%s'" %
-> 2360 (type(self).__name__, name))
2361
2362 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'lower'
I defined word_in_text as follows:
import re

def word_in_text(word, text):
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False
Answer 0 (score: 0)
I defined it as follows:
import re

def word_in_text(word, text):
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False
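
Judging from the traceback, the definition above is probably not the problem. The lambda names its argument tweet but passes tweets (the whole DataFrame) to word_in_text, and a DataFrame has no lower method, which matches the AttributeError. Below is a minimal sketch of a corrected call. The sample frame is a hypothetical stand-in for the streamed data, the isinstance guard is my own addition for rows that have no text (they would otherwise fail on .lower() as well), and the code assumes Python 3 (on Python 2.7 the check would need basestring instead of str).

```python
import re

import pandas as pd


def word_in_text(word, text):
    """Return True if `word` occurs in `text`, ignoring case."""
    # Assumption: rows without a 'text' field arrive as None/NaN, not as strings.
    if not isinstance(text, str):
        return False
    return re.search(word.lower(), text.lower()) is not None


# Hypothetical sample data; the real frame would be built from the txt file.
tweets = pd.DataFrame(
    {"text": ["ISIS in the news", "nothing relevant", None, "more about isis"]}
)

# Pass the per-row value `tweet`, not the whole DataFrame `tweets`.
tweets_ISIS = tweets["text"].apply(lambda tweet: word_in_text("ISIS", tweet))

# Alternative: pandas' own string matching, which treats missing values as False here.
tweets_ISIS_vec = tweets["text"].str.contains("ISIS", case=False, na=False)

print(tweets_ISIS.tolist())
print(tweets_ISIS_vec.tolist())
```

The str.contains form avoids calling a Python function per row, which may help on large files, though I have not measured it on your data.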