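I am currently working on a natural language processing project, and as a first step I need to collect tweets in a specific language.
I tried using the tweepy library with Python, but this code prints nothing to the console.
What am I doing wrong?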
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import json
# authentication data- get this info from twitter after you create your application
ckey="k6Lqgu45T6ReNFO7OlnKc9zeY"
csecret="hkB5xEApV8fzdhlRGGw35VYqj1AereBriZZgHlf9r0V23NOqY8"
atoken="74417799-Rv6hCRyr1lyv14FCrgIac97AlLy0eSpd0s4hqFx23"
asecret="D0j5HNB1ec4POzxZZemjJ4CvZ0WMcLAK4D0e46r7DaPzF"
# define listener class
class listener(StreamListener):
    def on_data(self, data):
        try:
            print(data)  # write the whole tweet to terminal
            return True
        except BaseException as e:
            print('failed on data, ', str(e))  # if there is an error, show what it is
            time.sleep(5)  # one error could be that you're rate-limited; this will cause the script to pause for 5 seconds
    def on_error(self, status):
        print(status)
# authenticate yourself
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(languages=['tr']) # track what you want to search for!
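Answer 0 (score: 0)
I ran your code and got a 406 error, which according to the API means the query is not an acceptable request. After adding a track terms parameter to the filter method, it worked fine. I believe this is a limitation of the API itself. See also "406 error in Streaming API when filtering on language".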
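To make the fix concrete, here is a minimal sketch of a guard that refuses a language-only filter before it reaches the streaming endpoint. The helper name `build_filter_kwargs` is hypothetical (it is not part of tweepy), and the sketch assumes the streaming filter endpoint needs at least one of `track`, `follow`, or `locations`, with `languages` only narrowing those results.

```python
def build_filter_kwargs(languages, track=None, follow=None, locations=None):
    """Build keyword arguments for Stream.filter (hypothetical helper, not tweepy API).

    Assumption: a filter request needs at least one predicate (track, follow,
    or locations); `languages` on its own is rejected with a 406.
    """
    if not (track or follow or locations):
        raise ValueError(
            "languages alone is not an acceptable filter; "
            "add track, follow, or locations"
        )
    kwargs = {"languages": list(languages)}
    if track:
        kwargs["track"] = list(track)
    if follow:
        kwargs["follow"] = list(follow)
    if locations:
        kwargs["locations"] = list(locations)
    return kwargs
```

With the question's objects this would be used as `twitterStream.filter(**build_filter_kwargs(['tr'], track=['bir', 've']))`, where the track terms are only placeholder Turkish words picked for illustration.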
Answer 1 (score: 0)
For example, suppose I want to search for 10,000 tweets containing "#tennis" and print each tweet's text and author:
import time
import tweepy

# assumes `auth` is the OAuthHandler built in the question
api = tweepy.API(auth)
TestTweet = tweepy.Cursor(api.search, q="#tennis").items(10000)
while True:
    try:
        tweet = TestTweet.next()
        print(str(tweet.author.screen_name))
        print(tweet.text)
    except tweepy.error.TweepError:
        # most likely the rate limit: wait out the 15-minute window, then retry
        print("Twitter rate limit, need to wait 15 min")
        time.sleep(60 * 16)
        continue
    except StopIteration:
        break
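The same wait-and-retry loop can be pulled out into a small reusable generator. This is only a sketch: `limit_handled` and its parameters are names chosen here for illustration, and it assumes the wrapped iterator can be resumed after raising (true for a class-based iterator, not for a plain generator function).

```python
import time


def limit_handled(iterator, retry_on, wait_seconds=16 * 60, sleep=time.sleep):
    """Yield items from `iterator`, pausing and retrying when `retry_on` is raised.

    `retry_on` is the exception type treated as "rate limited" (an assumption
    supplied by the caller); `sleep` is injectable so the wait can be faked.
    """
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return  # the underlying iterator is exhausted
        except retry_on:
            sleep(wait_seconds)  # wait out the rate-limit window, then try again
            continue
        yield item
```

With the code above it would be called as `for tweet in limit_handled(TestTweet, tweepy.error.TweepError): print(tweet.text)`.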
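If you want to fetch tweets by user name (get_status looks up a single tweet by its ID, so user_timeline is the call that takes a screen name):
for tweet in api.user_timeline(screen_name=user_name, count=10):
    test_text = tweet.text
    test_user = tweet.user.screen_name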