How to detect consecutive Twitter emoticons in Python

Asked: 2016-02-26 20:30:55

Tags: python twitter nltk

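I am using Python 2.7 on Windows with the NLTK package.

I want to detect Twitter emoticons (emoji) in a tweet. The code below detects a given emoticon, but only when it is not immediately followed by another emoticon.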

from nltk.tokenize import word_tokenize

def negationDetection(tweet):
    # Split the tweet into tokens and show them for inspection
    words = word_tokenize(tweet)
    print(words)
    emoticonList = ['❤']
    for word in words:
        # Exact match only: the token must equal one listed emoticon
        if word in emoticonList:
            print("detected")

a = "Congratulations to my cousin James Daniel Brown for graduating from Texas Tech University ❤ go red raiders"
negationDetection(a)

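The output is: detected

But if the tweet is a = "Congratulations to my cousin James Daniel Brown for graduating from Texas Tech University ❤❤❤ go red raiders"

the emoticon in the tweet is not detected. How should I handle this case? Is there a package or tokenizer that can solve my problem?

Thanks in advance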

1 Answer:

Answer 0 (score: 0)

Here is the solution I found:

def negationDetection(tweet):
    emoticonList = ['❤']
    for emoticon in emoticonList:
        # Substring search on the raw tweet, so no tokenizer is involved
        if emoticon in tweet:
            print("Found " + emoticon)


a = "It's a good thing I didn't apply anywhere else.. American University 2020 ❤️❤️ URL"
negationDetection(a)

It now works, including when several emoticons appear in a row.
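
A substring check only says that an emoticon is present, not how many appear in a row. If the run length matters, something along these lines might help. This is only a minimal sketch, assuming Python 3 and the standard library `re` module rather than NLTK; the function name `find_emoticon_runs` is made up for illustration. It also allows for the invisible variation selector (U+FE0F) that some clients append to symbols such as the heart, which is one reason an exact token comparison can fail.

```python
import re


def find_emoticon_runs(tweet, emoticons):
    """Return (emoticon, run_length) pairs for every run found in the tweet.

    Hypothetical helper for illustration, not part of NLTK.
    """
    runs = []
    for emoticon in emoticons:
        # One or more copies in a row, each optionally followed by the
        # variation selector U+FE0F.
        pattern = re.compile("(?:%s\ufe0f?)+" % re.escape(emoticon))
        for match in pattern.finditer(tweet):
            # Count how many copies of the emoticon the run contains.
            runs.append((emoticon, match.group(0).count(emoticon)))
    return runs


tweet = "graduating from Texas Tech University ❤❤❤ go red raiders"
print(find_emoticon_runs(tweet, ["❤"]))
```

Runs are grouped per emoticon, so a tweet that mixes different emoticons would need each of them listed in `emoticons`.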