Twitter streaming API: JSONDecodeError("Expecting value", s, err.value) from None

Time: 2016-12-26 16:51:56

Tags: python json twitter stream tweepy

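I'm using Twitter's streaming API (via tweepy) to collect tweets that match certain criteria, but when I parse the resulting jsonl file with json.loads(), I get the following error: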

File "twitter_time_series.py", line 19, in <module> 
    tweet = json.loads(line)

File "C:\Program Files\Anaconda3\lib\json\__init__.py", line 319, in loads 
    return _default_decoder.decode(s)

File "C:\Program Files\Anaconda3\lib\json\decoder.py", line 339, in decode
   obj, end = self.raw_decode(s, idx=_w(s, 0).end())

File "C:\Program Files\Anaconda3\lib\json\decoder.py", line 357, in raw_decode 
    raise JSONDecodeError("Expecting value", s, err.value) from None

json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)

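I can't work out why, because this doesn't happen with other kinds of jsonl files (for example, ones produced by fetching a user timeline). Can anyone help? Thanks!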

I'm streaming tweets with a basic setup:

auth = get_twitter_auth()
twitter_stream = Stream(auth, CustomListener(query_fname))
twitter_stream.filter(track=['curiosity'], async=True)

And I load the JSON like this:

import json
import sys

fname = sys.argv[1]
with open(fname, "r") as f:
    for line in f:
        # Each line of the jsonl file should hold exactly one tweet
        tweet = json.loads(line)

I'm using Python 3.5 and tweepy 3.3.0. Here is one line of the JSON file:

{"created_at":"Mon Dec 26 16:03:06 +0000 2016",

"id":813414846033170432,

"id_str":"813414846033170432",

"text":"Battle proper and at greater risk https:\/\/t.co\/W6U9Irgst9","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e",

"truncated":false,

"in_reply_to_status_id":null,

"in_reply_to_status_id_str":null,

"in_reply_to_user_id":null,

"in_reply_to_user_id_str":null,

"in_reply_to_screen_name":null,

"user":{"id":544570972,"id_str":"544570972","name":"Fay Moore","screen_name":"MooreFay","location":"Maryland","url":"http:\/\/faymoore.wordpress.com","description":"Author, writer for hire, crazy woman in a crazy world","protected":false,"verified":false,"followers_count":482,"friends_count":322,"listed_count":22,"favourites_count":92,
"statuses_count":17326,"created_at":"Tue Apr 03 20:30:46 +0000 2012","utc_offset":-18000,"time_zone":"Eastern Time (US & Canada)","geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/3252060596\/e06263f9a226ca0e3f83915812c331e6_normal.jpeg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/3252060596\/e06263f9a226ca0e3f83915812c331e6_normal.jpeg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/544570972\/1360842914","default_profile":true,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},

"geo":null,

"coordinates":null,

"place":null,

"contributors":null,

"is_quote_status":false,

"retweet_count":0,

"favorite_count":0,

"entities":{"hashtags":[],"urls":[{"url":"https:\/\/t.co\/W6U9Irgst9","expanded_url":"http:\/\/a.msn.com\/01\/en-us\/BBxzSWF?ocid=st","display_url":"a.msn.com\/01\/en-us\/BBxzS\u2026","indices":[81,104]}],"user_mentions":[],"symbols"[]},

"favorited":false,

"retweeted":false,

"possibly_sensitive":false,

"filter_level":"low",

"lang":"en",

"timestamp_ms":"1482768186468"}

3 answers:

Answer 0 (score: 0)

Inside the for loop, you can try:

  tweets = json.load(f)

Or remove the for loop and try:

[code snippet missing from the source]
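For comparison, here is a minimal sketch of a line-by-line loader that skips blank lines and reports which line failed. The helper name load_jsonl and the sample data are made up for illustration; they are not part of the question's code. Note that calling json.load(f) on a file holding several JSON objects, one per line, generally fails with an "Extra data" error, which is why this sketch keeps the per-line approach.

```python
import json


def load_jsonl(lines):
    """Parse an iterable of JSONL lines, skipping lines that are blank."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            # A blank line (e.g. a stray "\n") is not a JSON value; skip it
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Re-raise with the file line number to make debugging easier
            raise ValueError("line {}: {}".format(lineno, exc)) from exc
    return records


# Hypothetical sample: two records separated by a blank line
sample = ['{"id": 1}\n', '\n', '{"id": 2}\n']
tweets = load_jsonl(sample)
print(len(tweets))
```

With a real file, the same helper could be used as `with open(fname) as f: tweets = load_jsonl(f)`, since a file object iterates over its lines.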

Answer 1 (score: 0)

When writing to the file, in your Listener class, you can try:

def on_data(self, data):
    with open('filename.json', 'a', newline='\n') as f:
        f.write(data)

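I ran into this problem on a Windows machine as well. It happens because Windows uses CR LF line endings, whereas Linux uses LF.

If you read just a single tweet (which you don't show here), I don't think you would get this error: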

json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1)  

This is most likely a problem in Python's json package rather than in the Twitter API.

See: The modern way: use newline=''

More on line endings: on Wikipedia, as always.
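The line-ending explanation can be reproduced without Twitter. The sketch below assumes that each streamed payload already ends with "\r\n" (an assumption about the data handed to on_data, not something shown in the question). It passes newline="\r\n" to open() to imitate the Windows text-mode default on any platform, so every "\n" written becomes "\r\n" and the file ends up with "\r\r\n", which reads back as a record followed by a blank line. The helper name write_and_read and the payload are hypothetical.

```python
import json
import os
import tempfile

# Assumption: a streamed payload arrives with a trailing "\r\n"
payload = '{"id": 1, "text": "hello"}\r\n'


def write_and_read(newline):
    """Append two payloads using the given newline mode, then read the lines back."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    os.close(fd)
    try:
        with open(path, "a", newline=newline) as f:
            f.write(payload)
            f.write(payload)
        with open(path, "r") as f:
            return list(f)
    finally:
        os.remove(path)


# newline="\r\n" imitates Windows text mode: "\n" is written as "\r\n"
windows_like = write_and_read("\r\n")
# newline="\n" writes the payload unchanged, as the answer suggests
fixed = write_and_read("\n")

blank_lines = [line for line in windows_like if not line.strip()]
try:
    json.loads(blank_lines[0])
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

print(len(windows_like), len(fixed))
print(error_message)
```

Under these assumptions the Windows-like file yields extra blank lines, and parsing one of them raises the same "Expecting value" error as in the question, while the file written with newline='\n' contains only the records.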

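Answer 2 (score: 0)

I ran into the same problem just now.

I tested the JSON string with a JSON parser and found that it was malformed.

For example, False should be "False" and None should be "None".

What I mean is that you should first check whether the JSON string is well-formed. One way is to test it on a website dedicated to validating JSON.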
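The same check can be done locally with the json module instead of a website. The sketch below uses a made-up record and a hypothetical helper, is_valid_json. It also shows one plausible way that False and None end up in a file: writing a Python dict with str() instead of json.dumps(). Standard JSON spells these literals false and null, so serializing with json.dumps() is usually a safer fix than quoting the values as strings.

```python
import json

# Hypothetical record containing a boolean and a null value
record = {"truncated": False, "geo": None}

python_repr = str(record)       # Python syntax: single quotes, False, None
json_text = json.dumps(record)  # JSON syntax: double quotes, false, null


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(python_repr))
print(is_valid_json(json_text))
```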

I hope this answer helps.