Question

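I am currently working on a natural language processing project, and to begin with I need to collect tweets in a specific language.

I tried using the tweepy library with Python, but this code prints nothing to the console.

What am I doing wrong?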

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import json

# authentication data- get this info from twitter after you create your application
ckey="k6Lqgu45T6ReNFO7OlnKc9zeY"
csecret="hkB5xEApV8fzdhlRGGw35VYqj1AereBriZZgHlf9r0V23NOqY8"
atoken="74417799-Rv6hCRyr1lyv14FCrgIac97AlLy0eSpd0s4hqFx23"
asecret="D0j5HNB1ec4POzxZZemjJ4CvZ0WMcLAK4D0e46r7DaPzF"

# define listener class
class listener(StreamListener):

    def on_data(self, data):
        try:
            print (data)   # write the whole tweet to terminal
            return True
        except BaseException as e:
            print('failed on data, ', str(e)) # if there is an error, show what it is
            time.sleep(5)  # one error could be that you're rate-limited; this will cause the script to pause for 5 seconds

    def on_error(self, status):
        print (status)

# authenticate yourself
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(languages=['tr'])  # track what you want to search for!

Answer 1

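I ran your code and got a 406 error, which according to the API means the query is not an acceptable request. After adding a track terms parameter to the filter method, it works as expected. I believe this is a limitation of the API itself. See also 406 error in Streaming API when filtering on language.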
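As a minimal sketch of that fix, the helper below builds the keyword arguments for the filter call and refuses a language-only filter, on the assumption that the streaming endpoint needs at least one of track, follow or locations. The name build_filter_kwargs and the Turkish track words are hypothetical, chosen only for illustration.

```python
def build_filter_kwargs(track=None, follow=None, locations=None, languages=None):
    """Return keyword arguments for a stream filter call.

    Assumption: the endpoint rejects requests that carry only a language,
    so at least one of track, follow or locations must be given.
    """
    predicates = {"track": track, "follow": follow, "locations": locations}
    kwargs = {name: value for name, value in predicates.items() if value}
    if not kwargs:
        raise ValueError("need at least one of track, follow or locations")
    if languages:
        kwargs["languages"] = languages
    return kwargs


# Common Turkish words used as track terms (an illustrative choice only).
kwargs = build_filter_kwargs(track=["bir", "ve", "bu"], languages=["tr"])
print(kwargs)
```

With the objects from the question, the result would be passed on as twitterStream.filter(**kwargs).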

Answer 2

For example, say I want to search for 10000 tweets containing "#tennis" and print each tweet's text and author:

import time
import tweepy

# auth is the OAuthHandler built in the question
api = tweepy.API(auth)
TestTweet = tweepy.Cursor(api.search, q="#tennis").items(10000)

while True:
    try:
        tweet = TestTweet.next()
        print(tweet.author.screen_name)
        print(tweet.text)
    except tweepy.error.TweepError:
        # most likely the rate limit; wait out the 15-minute window and retry
        print("Twitter rate limit, need to wait 15 min")
        time.sleep(60 * 16)
        continue
    except StopIteration:
        # no more results
        break
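The wait-and-retry loop above can be factored into a small generator. This is only a sketch of the pattern, not part of tweepy: iterate_with_backoff and FlakySource are hypothetical names, and FlakySource merely stands in for a cursor that fails once with a rate-limit error.

```python
import time


def iterate_with_backoff(iterator, retry_on, wait_seconds, sleep=time.sleep):
    """Yield items from iterator, pausing and retrying when retry_on is raised."""
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return  # the source is exhausted
        except retry_on:
            sleep(wait_seconds)  # wait, then ask the same iterator again
            continue
        yield item


class FlakySource:
    """Stand-in for a cursor: raises once mid-way, then keeps going."""

    def __init__(self, items):
        self._items = list(items)
        self._failed = False

    def __iter__(self):
        return self

    def __next__(self):
        if len(self._items) == 2 and not self._failed:
            self._failed = True
            raise RuntimeError("simulated rate limit")
        if not self._items:
            raise StopIteration
        return self._items.pop(0)


waits = []  # record the requested pauses instead of really sleeping
collected = list(
    iterate_with_backoff(FlakySource(["a", "b", "c"]), RuntimeError, 16 * 60,
                         sleep=waits.append)
)
print(collected, waits)
```

In real use the iterator would be the cursor's items and retry_on the library's rate-limit exception. tweepy.API also accepts wait_on_rate_limit=True, which may make a manual loop unnecessary.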

If you want to search by user name:

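# get_status looks up a single tweet by its numeric ID, so use user_timeline for a user name
for tweet in api.user_timeline(screen_name=user_name):
    test_text = tweet.text
    test_user = tweet.user.screen_name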

Python Twitter filtering and collecting tweets
