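I am trying to build a sentiment analysis project using Python and the IBM Watson Tone Analyzer.
I started out with TextBlob, but now I have to use IBM Watson in the same code.
The code runs fine with TextBlob, but with IBM Watson I run into problems, and because I am new to machine learning I have no idea what is going wrong.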
#Import library
import tweepy
import pandas as pd
import numpy as np
from watson_developer_cloud import ToneAnalyzerV3
from watson_developer_cloud.tone_analyzer_v3 import ToneInput
#For plotting and visualization
from IPython.display import display
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
#Twitter app access keys for @user
#Consumer
CONSUMER_KEY = 'xxxxxxxxxxxxxxxxxxx'
CONSUMER_SECRET = 'xxxxxxxxxxxxxxxxxxx'
#Access
ACCESS_TOKEN = 'xxxxxxxxxxxxxxxxxxxxxxx'
ACCESS_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxx'
# We import our access keys:
from credentials import * # This will allow us to use the keys as variables
# API's setup:
def twitter_setup():
    """
    Utility function to set up the Twitter API
    with our access keys provided.
    """
    # Authentication and access using keys:
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
    # Return API with authentication:
    api = tweepy.API(auth)
    return api
#Create an extractor object:
extractor = twitter_setup()
#We create tweet list as follows:
searchword = input("Enter the name to search:")
tweets = extractor.user_timeline(screen_name = searchword, count=200)
print("Number of tweets extracted : {}\n".format(len(tweets)))
#most recent tweet
print("3 most recent tweets:\n")
for tweet in tweets[:3]:
    print(tweet.text)
    print()
#Create a pandas DataFrame
data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])
#Add relevant data:
data['ID'] = np.array([tweet.id for tweet in tweets])
data['Date'] = np.array([tweet.created_at for tweet in tweets])
data['Likes'] = np.array([tweet.favorite_count for tweet in tweets])
data['Source'] = np.array([tweet.source for tweet in tweets])
#Sentiment analysis using Watson
service = ToneAnalyzerV3(
    url="https://gateway.watsonplatform.net/tone-analyzer/api",
    username="9adf63ab-8b6f-457b-a56a-33c25e0997c6",
    password="qbms20FWWUNB")
#Sentiment Analysis on tweets:
#from textblob import TextBlob
#import re
#Clean tweets:
def clean_tweet(tweet):
    return ' '.join(re.sub("(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())
tone_chat = service.tone_chat(clean_tweet(tweet))
def analize_sentiment(tweet):
    tone_chat = service.tone_chat(clean_tweet(tweet))
    # analysis = TextBlob(clean_tweet(tweet))
    # if analysis.sentiment.polarity > 0:
    #     return 1
    # elif analysis.sentiment.polarity == 0:
    #     return 0
    # else:
    #     return -1
#Create a new column and display the new result:
data['SA'] = np.array(analize_sentiment(tweet) for tweet in data['Tweets']])
display(data.head(10))
#Construct classified list:
positive_tweets = [tweet for index, tweet in enumerate(data['Tweets']) if data['SA'][index] > 0]
neutral_tweets = [tweet for index, tweet in enumerate(data['Tweets']) if data['SA'][index] == 0]
negative_tweets = [tweet for index, tweet in enumerate(data['Tweets']) if data['SA'][index] < 0]
#We print percentages:
print("Percentage of positive tweets: {}%".format(len(positive_tweets)*100/len(data['Tweets'])))
print("Percentage of negative tweets: {}%".format(len(negative_tweets)*100/len(data['Tweets'])))
print("Percentage of neutral tweets: {}%".format(len(neutral_tweets)*100/len(data['Tweets'])))
# Pie chart:
ptweets = len(positive_tweets)*100/len(data['Tweets'])
neutweets = len(neutral_tweets)*100/len(data['Tweets'])
negtweets = len(negative_tweets)*100/len(data['Tweets'])
labels = 'negative','neutral','positive'
sizes = [negtweets,neutweets,ptweets]
cols = ['c','m','r']
plt.pie(sizes,labels=labels,colors=cols,startangle=90,shadow=True,autopct='%1.1f%%')
plt.legend()
plt.title("Overall sentiment of the people from analyzing 200 tweets")
plt.show()
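One detail in the listing above is worth a note: `clean_tweet` calls `re.sub`, but the `import re` line is commented out together with the TextBlob import, so the function would most likely raise a `NameError` once the script gets past parsing. A minimal sketch of the cleaning step with the import restored follows; the sample tweet is made up for illustration only.

```python
import re

def clean_tweet(tweet):
    """Strip @mentions, URLs and special characters from a tweet's text."""
    # Same pattern as in the listing, written as a raw string.
    pattern = r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)"
    # Replace every match with a space, then collapse repeated whitespace.
    return ' '.join(re.sub(pattern, " ", tweet).split())

# Made-up sample text, only to show the effect of the pattern.
sample = "@user Loving the new release!! https://example.com/x #happy"
cleaned = clean_tweet(sample)
print(cleaned)
```

Note that the function expects a plain string, so it should be given `tweet.text` (or a value from the `Tweets` column), not a tweepy status object.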
I have commented out the old TextBlob code, so please ignore it. When I run the script in Spyder, I get the following error:
File "<ipython-input-12-4e56d84102b8>", line 93
data['SA'] = np.array(analize_sentiment(tweet) for tweet in data['Tweets']])
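The line in the traceback has a closing `]` with no matching `[`, which on its own is enough for Python to refuse to parse the file. The sketch below checks this with the built-in `compile` and shows the balanced form. To keep it runnable without Watson credentials or NumPy, `score_stub` is a hypothetical stand-in for `analize_sentiment` and `list(...)` stands in for `np.array(...)`.

```python
def score_stub(text):
    # Hypothetical placeholder for analize_sentiment: returns 1, 0 or -1.
    if "good" in text:
        return 1
    if "bad" in text:
        return -1
    return 0

tweets = ["good day", "bad day", "a day"]

# Same bracket structure as the failing line: ']' is never opened.
broken_line = "scores = list(score_stub(t) for t in tweets])"
# Balanced version: the '[' right after 'list(' opens the comprehension.
fixed_line = "scores = list([score_stub(t) for t in tweets])"

def parses(source):
    """Return True if Python can compile the given source line."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(parses(broken_line), parses(fixed_line))

# The balanced comprehension, evaluated directly.
scores = [score_stub(t) for t in tweets]
print(scores)
```

Applied to the original line, the balanced form would be `np.array([analize_sentiment(tweet) for tweet in data['Tweets']])`.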
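Please help me resolve this error, and if there are any other mistakes, please explain them in detail, since I may not understand them otherwise.
You can run this code directly in Spyder to reproduce the error and work out a solution. Please help.
Thanks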
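Beyond the syntax error, `analize_sentiment` as posted does not return anything, so the later comparisons against 0 would have nothing to work with. The sketch below shows one possible way to turn a tone result into the 1 / 0 / -1 labels that the TextBlob version produced. It assumes the response is a dict shaped like `{'document_tone': {'tones': [{'tone_id': ..., 'score': ...}]}}`, which appears to be what the general-purpose tone endpoint returns for version `2017-09-21`. The grouping of tones into positive and negative is an assumption made for this example, not something Watson defines, and the sample responses are hand-written, not real service output.

```python
# Assumed grouping of tone ids into polarity buckets (not part of the Watson API).
POSITIVE_TONES = {"joy"}
NEGATIVE_TONES = {"anger", "fear", "sadness"}

def tone_to_polarity(response):
    """Map a tone response dict to 1 (positive), -1 (negative) or 0 (neutral)."""
    tones = response.get("document_tone", {}).get("tones", [])
    # Keep only the tones that fall into one of the two buckets.
    relevant = [t for t in tones
                if t["tone_id"] in POSITIVE_TONES | NEGATIVE_TONES]
    if not relevant:
        return 0
    # Let the highest-scoring relevant tone decide the label.
    strongest = max(relevant, key=lambda t: t["score"])
    return 1 if strongest["tone_id"] in POSITIVE_TONES else -1

# Hand-written sample responses, for illustration only.
happy = {"document_tone": {"tones": [{"tone_id": "joy", "score": 0.8}]}}
angry = {"document_tone": {"tones": [{"tone_id": "anger", "score": 0.7},
                                     {"tone_id": "joy", "score": 0.5}]}}
flat = {"document_tone": {"tones": [{"tone_id": "analytical", "score": 0.9}]}}

labels = [tone_to_polarity(r) for r in (happy, angry, flat)]
print(labels)
```

In the real script the dict would come from the service call. Two things may need checking against the SDK documentation, since they are not verified here: the `ToneAnalyzerV3` constructor seems to require a `version` argument such as `'2017-09-21'`, and `tone_chat` seems to expect a list of utterance dicts rather than a single string, so the plain `tone` method may be the better fit for scoring one tweet at a time.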