Error when a word occurs zero times

Time: 2016-07-15 22:14:58

Tags: python pandas twitter tweepy

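First of all, sorry for my poor English.

I am using this code to count how many times the words "LeBron" or "Curry" appear in tweets. The problem is that if no tweet contains the word "LeBron" or "Curry", the program crashes. When the words are present, the program runs perfectly.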

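import json
import re

import pandas as pd

tweets_data_path = '/Users/HCruz/NetBeansProjects/elections3/data/fetched_tweets.txt'

# Parse one JSON-encoded tweet per line, skipping lines that are not valid JSON.
tweets_data = []
with open(tweets_data_path, "r") as tweets_file:
    for line in tweets_file:
        try:
            tweets_data.append(json.loads(line))
        except ValueError:
            continue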

tweets = pd.DataFrame()

# A list comprehension behaves the same on Python 2 and 3 (map() is lazy on Python 3).
tweets['text'] = [tweet['text'] for tweet in tweets_data]

def word_in_text(word, text):
    # Case-insensitive check; note that `word` is interpreted as a regular expression.
    word = word.lower()
    text = text.lower()
    match = re.search(word, text)
    if match:
        return True
    return False

tweets['LeBron'] = tweets['text'].apply(lambda tweet: word_in_text('LeBron', tweet))
tweets['Curry'] = tweets['text'].apply(lambda tweet: word_in_text('Curry', tweet))

LeBron = tweets['LeBron'].value_counts()[True]
Curry = tweets['Curry'].value_counts()[True]

print("LeBron %s" % LeBron)
print("Curry %s" % Curry)

When "Curry" and "LeBron" each appear at least once, I get this:

Processing...
LeBron 1
Curry 34

That's fine.

But if I remove "LeBron" from the tweets so that it never appears, the program crashes.

Hectors-iMac:src HCruz$ python process_tweets.py
Processing...
Traceback (most recent call last):
  File "process_tweets.py", line 80, in <module>
    s.run()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/sched.py", line 117, in run
    action(*argument)
  File "process_tweets.py", line 54, in processing
    process_tweets()
  File "process_tweets.py", line 44, in process_tweets
    LeBron = tweets['LeBron'].value_counts()[True]
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/series.py", line 491, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 1038, in get_value
    return tslib.get_value_box(s, key)
  File "tslib.pyx", line 454, in pandas.tslib.get_value_box (pandas/tslib.c:9561)
  File "tslib.pyx", line 469, in pandas.tslib.get_value_box (pandas/tslib.c:9408)
IndexError: index out of bounds

1 answer:

Answer 0: (score: 2)

Use exception handling by wrapping the code on line 44 of process_tweets.py in a try/except block:
try:
    LeBron = tweets['LeBron'].value_counts()[True]
except IndexError:
    LeBron = 0
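
The crash happens because value_counts() only lists values that actually occur, so when no tweet matches there is no True entry to look up. As a possible alternative to catching the exception, the count can be taken without any lookup by summing the boolean column, since True counts as 1 and False as 0. The following is only a minimal sketch: the helper name count_true and the three sample tweets are made up for illustration, and str.contains is swapped in for the word_in_text function from the question.

```python
import pandas as pd


def count_true(flags):
    """Count the True entries in a boolean Series; returns 0 when there are none."""
    # Booleans sum as 0/1, so a missing True label can never cause a lookup error.
    return int(flags.sum())


# Hypothetical sample data standing in for the fetched tweet texts.
tweets = pd.DataFrame({'text': ['Curry scores again',
                                'What a game',
                                'curry for three']})

# Case-insensitive substring match, similar in spirit to word_in_text().
tweets['LeBron'] = tweets['text'].str.contains('lebron', case=False)
tweets['Curry'] = tweets['text'].str.contains('curry', case=False)

print("LeBron %s" % count_true(tweets['LeBron']))
print("Curry %s" % count_true(tweets['Curry']))
```

With this sample data the script prints "LeBron 0" and "Curry 2" instead of raising an error when a word never appears.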