I have this code, and it works perfectly.
# encoding=utf8
# Import the necessary methods from the tweepy library
import sys
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy.streaming import StreamListener
reload(sys)
sys.setdefaultencoding('utf8')

# Variables that contain the user credentials to access the Twitter API
access_token = ""
access_token_secret = ""
consumer_key = ""
consumer_secret = ""

# This is a basic listener that appends received tweets to a file.
class StdOutListener(StreamListener):
    def on_data(self, data):
        # save data
        with open('debate_data.txt', 'a') as tf:
            tf.write((data).decode('unicode-escape').encode('utf-8'))
        return True

    def on_error(self, status):
        print status

if __name__ == '__main__':
    # This handles Twitter authentication and the connection to the Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)
    # This line filters Twitter Streams to capture data by the keywords listed below
    stream.filter(track=['Bernier', 'Rosselló', 'Rossello', 'Bernabe', 'Lúgaro', 'Lugaro', 'María de Lourdes', 'Maria de Lourdes', 'Cidre'])
However, when I run this other piece of code, I get the wrong result.
import json
import io

# read the saved tweets from this path
tweets_data_path = 'debate_data.txt'
tweets_data = []
with io.open(tweets_data_path, 'r') as tweets_file:
    for line in tweets_file:
        try:
            tweet = json.loads(line)
            tweets_data.append(tweet)
        except:
            continue
print len(tweets_data)
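The file contains 42,188 tweets, but when I run the code I only get 291. I think it is an encoding/decoding problem, but I can't figure out what exactly. Any help would be greatly appreciated.
When I run this example without any encoding/decoding, it works fine.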
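Answer 0 (score: 2)
The reason you only get 291 is that json.loads() raises an error on the other lines, and the bare except silently continues past each one.
I suggest printing the error, for example: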
        except Exception as err:
            print err
            continue
Then you will know the cause of the error and can fix it.
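A slightly fuller version of the same idea, as a sketch written for Python 3 (the code in the question is Python 2): instead of printing inside the loop, it records the line number and message of every failure so they can be inspected afterwards. The helper name `load_json_lines` and the sample lines are made up for illustration.

```python
import json

def load_json_lines(lines):
    """Parse each line as JSON and return (parsed_objects, errors).

    errors is a list of (line_number, message) tuples, so failures are
    reported instead of being silently skipped.
    """
    parsed, errors = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped without counting as errors
        try:
            parsed.append(json.loads(line))
        except ValueError as err:  # json.JSONDecodeError subclasses ValueError
            errors.append((number, str(err)))
    return parsed, errors

# Made-up sample input: one valid record, one blank line, one truncated record.
sample = ['{"id": 1, "text": "ok"}', '', '{"id": 2, "text": "cut off']
tweets, problems = load_json_lines(sample)
print(len(tweets), "parsed;", len(problems), "failed")
for number, message in problems:
    print("line", number, "->", message)
```

With the real file, the same function could be called on the open file object, since iterating over a text file yields its lines.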
Are you sure the data inside debate_data.txt is valid JSON?
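One way the file could stop being valid line-delimited JSON, offered as a possibility rather than a confirmed diagnosis: the `decode('unicode-escape')` call in the listener turns escape sequences such as `\n` and `\"` inside a tweet's text into literal characters, so a single record can be split across several lines or have its quoting broken. The sketch below (Python 3, with a made-up sample record and a hypothetical helper `count_parsable`) reproduces that effect.

```python
import json

# A tweet-like record whose text contains a newline and quotes (made-up sample data).
record = {"text": 'line one\nline "two"'}

# json.dumps keeps the record on one line: the newline is stored as the
# two characters backslash + n, and the inner quotes are escaped.
raw = json.dumps(record)

# Python 3 equivalent of the question's data.decode('unicode-escape') step.
decoded = raw.encode("ascii").decode("unicode_escape")

def count_parsable(text):
    """Count how many lines of text json.loads accepts."""
    ok = 0
    for line in text.splitlines():
        try:
            json.loads(line)
            ok += 1
        except ValueError:
            pass
    return ok

raw_ok = count_parsable(raw)          # the untouched record parses as one line
decoded_ok = count_parsable(decoded)  # the decoded text is now two broken lines
print(raw_ok, decoded_ok)
```

If this is what happened, writing `data` to the file unchanged would keep each record on one line, which is consistent with the observation in the question that the example works without the encoding/decoding step.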
Answer 1 (score: 2)
As agnewee said, I also suggest: