I'm doing sentiment analysis on several languages. My code runs successfully, but it is very slow (10m for only 11K records). Here is my code:
# Spanish classifier - from https://github.com/aylliote/senti-py
# (import path as shown in the senti-py README)
from classifier import SentimentClassifier
clf = SentimentClassifier()
# Italian classifier - also used for Russian
from polyglot.text import Text as T
# German classifier
from textblob_de import TextBlobDE as TextBlob_d
# English
from textblob import TextBlob
# French
from textblob_fr import PatternTagger, PatternAnalyzer

def Flag(row):
    # Pick the sentiment backend based on the row's language code
    try:
        if row['lang'] == 'es':
            txt = clf.predict(row['rev'])
            return txt
        elif row['lang'] == 'it':
            txt = T(row['rev'])
            return txt.polarity
        elif row['lang'] == 'de':
            txt = TextBlob_d(row['rev'])
            return txt.sentiment
        elif row['lang'] == 'en':
            txt = TextBlob(row['rev'])
            return txt.sentiment.polarity
        elif row['lang'] == 'fr':
            txt = TextBlob(row['rev'], pos_tagger=PatternTagger(),
                           analyzer=PatternAnalyzer())
            return txt.sentiment[0]
        elif row['lang'] == 'ru':
            txt = T(row['rev'])
            return txt.polarity
        else:
            return ""
    except Exception:
        # Any failure in a backend yields an empty result for that row
        return ""

df['sent'] = df.apply(Flag, axis=1)
I have checked other posts about `from textblob.sentiments import NaiveBayesAnalyzer` being very slow, but I don't think that is the same situation I'm running into here?
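In case it helps narrow things down, here is a rough sketch of how the time spent per language could be measured. `time_by_language` and `dummy_scorer` are hypothetical names I made up for illustration (they are not part of any of the libraries above), and the stand-in scorer is only there so the snippet runs without the NLP packages installed:

```python
import time
from collections import defaultdict

def time_by_language(rows, scorer, clock=time.perf_counter):
    """Return {lang: (row_count, total_seconds)} spent inside scorer(row)."""
    totals = defaultdict(lambda: [0, 0.0])
    for row in rows:
        start = clock()
        scorer(row)                      # result is ignored, only timing matters
        elapsed = clock() - start
        entry = totals[row['lang']]
        entry[0] += 1                    # number of rows seen for this language
        entry[1] += elapsed              # accumulated seconds for this language
    return {lang: (count, seconds) for lang, (count, seconds) in totals.items()}

# Hypothetical stand-in for Flag, so the sketch is self-contained
def dummy_scorer(row):
    return len(row['rev'])

sample = [
    {'lang': 'en', 'rev': 'great product'},
    {'lang': 'es', 'rev': 'muy bueno'},
    {'lang': 'en', 'rev': 'not good'},
]
timings = time_by_language(sample, dummy_scorer)
for lang, (count, seconds) in sorted(timings.items()):
    print(lang, count, f"{seconds:.6f}s")
```

With the real data, I assume this could be called as `time_by_language(df.to_dict('records'), Flag)` to see whether one language accounts for most of the run time.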
Thanks