Python Twitter: filtering and collecting tweets

Time: 2015-05-22 16:40:21

Tags: python api twitter tweepy

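I am working on a natural language processing project, but I am stuck at the very first step: collecting tweets in a specific language.

I tried the tweepy library with Python, but this code prints nothing to the console.

What am I doing wrong?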

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import time
import json

# authentication data- get this info from twitter after you create your application
ckey="k6Lqgu45T6ReNFO7OlnKc9zeY"
csecret="hkB5xEApV8fzdhlRGGw35VYqj1AereBriZZgHlf9r0V23NOqY8"
atoken="74417799-Rv6hCRyr1lyv14FCrgIac97AlLy0eSpd0s4hqFx23"
asecret="D0j5HNB1ec4POzxZZemjJ4CvZ0WMcLAK4D0e46r7DaPzF"

# define listener class
class listener(StreamListener):

    def on_data(self, data):
        try:
            print (data)   # write the whole tweet to terminal
            return True
        except BaseException as e:
            print('failed on data, ', str(e)) # if there is an error, show what it is
            time.sleep(5)  # one error could be that you're rate-limited; this will cause the script to pause for 5 seconds

    def on_error(self, status):
        print (status)

# authenticate yourself
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(languages=['tr'])  # track what you want to search for!

2 answers:

Answer 0: (score: 0)

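I ran your code and got a 406 error, which according to the API documentation means the query is not an acceptable request. After I added a track terms parameter to the filter method, it worked fine. I believe this is a limitation of the API itself. See also: 406 error in Streaming API when filtering on language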
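The fix described in this answer can be sketched as follows. This is a minimal sketch, not tweepy's own code: `build_filter_params` is a hypothetical helper introduced only for illustration, and it assumes the streaming endpoint rejects a request unless at least one of `track`, `follow`, or `locations` accompanies `languages`. The track terms shown are placeholders.

```python
def build_filter_params(track=None, follow=None, locations=None, languages=None):
    """Collect keyword arguments for Stream.filter (hypothetical helper).

    Assumption: a language filter on its own is rejected (HTTP 406), so at
    least one of track, follow, or locations has to be present.
    """
    params = {}
    if track:
        params["track"] = list(track)
    if follow:
        params["follow"] = list(follow)
    if locations:
        params["locations"] = list(locations)
    if not params:
        raise ValueError("need at least one of track, follow, or locations")
    if languages:
        params["languages"] = list(languages)
    return params


# Placeholder track terms; replace them with words you actually want to match.
params = build_filter_params(track=["ve", "bir"], languages=["tr"])
print(params)
# With the Stream object from the question, this would be passed on as:
# twitterStream.filter(**params)
```

Checking the arguments before opening the stream turns the silent failure from the question into an explicit error on the client side.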

Answer 1: (score: 0)

For example, suppose I want to search for 10000 tweets containing "#tennis" and print each tweet's text and author:

import time
import tweepy

# `auth` is the OAuthHandler created in the question's code
api = tweepy.API(auth)
TestTweet = tweepy.Cursor(api.search, q="#tennis").items(10000)

while True:
    try:
        tweet = TestTweet.next()
        print(str(tweet.author.screen_name))
        print(tweet.text)
    except tweepy.error.TweepError:
        # assume the failure is a rate limit and wait out the window
        print("Twitter rate limit, need to wait 15 min")
        time.sleep(60 * 16)
        continue
    except StopIteration:
        break
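The wait-and-retry loop in this answer could also be factored into a small generator. This is a sketch under stated assumptions: `iterate_with_backoff` and `FlakySource` are hypothetical names, not part of tweepy, and the exception class and sleep function are passed in as parameters so the logic can be tried without network access.

```python
import time


def iterate_with_backoff(items, retry_on, wait_seconds=16 * 60, sleep=time.sleep):
    """Yield from `items`, pausing and retrying when `retry_on` is raised.

    Hypothetical helper: `items` must be an iterator whose next() call can
    be retried after a failure (as the loop in the answer assumes).
    """
    iterator = iter(items)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return  # no more items
        except retry_on:
            sleep(wait_seconds)  # wait out the rate-limit window, then retry
            continue
        yield item


class FlakySource:
    """Stand-in for a rate-limited cursor: fails once, then continues."""

    def __init__(self, values):
        self._values = list(values)
        self._failed = False

    def __iter__(self):
        return self

    def __next__(self):
        if len(self._values) == 2 and not self._failed:
            self._failed = True
            raise RuntimeError("simulated rate limit")
        if not self._values:
            raise StopIteration
        return self._values.pop(0)


waits = []  # records each requested pause instead of really sleeping
result = list(iterate_with_backoff(FlakySource(["a", "b", "c"]), RuntimeError,
                                   wait_seconds=1, sleep=waits.append))
print(result, waits)
```

With the names used in this answer, the call would presumably look like `iterate_with_backoff(TestTweet, tweepy.error.TweepError)`, which keeps the printing logic separate from the rate-limit handling.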

If you want to search by user name:

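# get_status() expects a numeric tweet ID, so user_timeline() is used here to look up by screen name
tweets = api.user_timeline(screen_name=user_name, count=1)
tweet = tweets[0]  # most recent tweet by that user (raises IndexError if there is none)
test_text = tweet.text
test_user = tweet.user.screen_name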