I'm from Korea, so my wording may be unclear.
I have a question about loading a JSON file with Python.
with open("D:/twitter/salathe-us-twitter/20110312/SB_DATA_SB/tweets.txt.2011-03-12_01", encoding='UTF8') as f:
    for line in f:
        temp = line.partition('|')
        date.append(temp[0])  # date
        tweets_data.append(temp[2])
This is my Python code. I split each line because there were some errors.
temp looks like this:
('20110302141002236', '|', '{"user":{"following":null,"profile_background_image_url":"http:\\/\\/a3.twimg.com\\/profile_background_images\\/141128439\\/2010-07-01_15.33.10.jpg","favourites_count":1,"verified":false,"time_zone":"Pacific Time (US & Canada)","profile_text_color":"333333","follow_request_sent":null,"profile_sidebar_fill_color":"DDEEF6","id_str":"173736821","profile_background_tile":false,"followers_count":19,"created_at":"Mon Aug 02 06:37:45 +0000 2010","description":"Attend CWU and just tryna do me.","is_translator":false,"show_all_inline_media":false,"geo_enabled":true,"profile_link_color":"0084B4","location":"Tacoma, WA","listed_count":1,"profile_sidebar_border_color":"C0DEED","protected":false,"profile_image_url":"http:\\/\\/a3.twimg.com\\/profile_images\\/1208687030\\/Twitter_normal.jpg","lang":"en","name":"Quintin Brown","contributors_enabled":false,"statuses_count":340,"notifications":null,"profile_use_background_image":true,"screen_name":"QBrown15","id":173736821,"utc_offset":-28800,"friends_count":48,"profile_background_color":"C0DEED","url":"http:\\/\\/www.facebook.com\\/#!\\/profile.php?id=1195837597"},"in_reply_to_screen_name":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"contributors":null,"coordinates":null,"retweeted":false,"text":"\\"RT @barr253 I love fat asses. 
#honesttweet\\" <<< Naw, that\'s an #ObviousTweet","in_reply_to_user_id_str":null,"retweet_count":0,"in_reply_to_status_id":null,"id_str":"43025281954480130","source":"web","created_at":"Wed Mar 02 19:10:01 +0000 2011","truncated":false,"entities":{"user_mentions":[{"indices":[4,12],"id_str":"204626247","name":"John Barr","screen_name":"barr253","id":204626247}],"urls":[],"hashtags":[{"indices":[31,43],"text":"honesttweet"},{"indices":[73,86],"text":"ObviousTweet"}]},"geo":null,"place":{"bounding_box":{"type":"Polygon","coordinates":[[[-120.597461,46.966947],[-120.518162,46.966947],[-120.518162,47.029281],[-120.597461,47.029281]]]},"place_type":"city","name":"Ellensburg","country":"United States","attributes":{},"id":"c95cdb2a983262e5","full_name":"Ellensburg, WA","country_code":"US","url":"http:\\/\\/api.twitter.com\\/1\\/geo\\/id\\/c95cdb2a983262e5.json"},"favorited":false,"id":43025281954480130}\n')
('\n', '', '')
You can see ('\n', '', ''). That is why I partition the lines.
So I tried passing temp[2] as the argument to json.loads().
But it says:
C:\Python34\python.exe D:/Twitter_project/TEST.py
Traceback (most recent call last):
  File "D:/Twitter_project/TEST.py", line 5, in <module>
    a = json.loads(temp[2])
  File "C:\Python34\lib\json\__init__.py", line 318, in loads
    return _default_decoder.decode(s)
  File "C:\Python34\lib\json\decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python34\lib\json\decoder.py", line 361, in raw_decode
    raise ValueError(errmsg("Expecting value", s, err.value)) from None
ValueError: Expecting value: line 1 column 1 (char 0)
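What is wrong with my code? Is the data not in JSON format?
So for now, I just use try/except to read the data.
It works, but I want to know why this happens and how to fix it.
Here is my try/except code: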
import json

date = []
tweets_data = []
with open("D:/twitter/salathe-us-twitter/20110302/SB_DATA_SB/tweets.txt.2011-03-02_14-05", encoding='UTF8') as f:
    for i, line in enumerate(f):
        try:
            temp = line.partition('|')
            date.append(temp[0])
            tweet = json.loads(temp[2])
            tweets_data.append(tweet)
        except:
            continue
Answer 0 (score: 0)
First, I checked your data with this validator, and it is valid JSON.
Then I tried the same code, and the following runs without any exception for me.
import json
temp = ('20110302141002236', '|', '{"user":{"following":null,"profile_background_image_url":"http:\\/\\/a3.twimg.com\\/profile_background_images\\/141128439\\/2010-07-01_15.33.10.jpg","favourites_count":1,"verified":false,"time_zone":"Pacific Time (US & Canada)","profile_text_color":"333333","follow_request_sent":null,"profile_sidebar_fill_color":"DDEEF6","id_str":"173736821","profile_background_tile":false,"followers_count":19,"created_at":"Mon Aug 02 06:37:45 +0000 2010","description":"Attend CWU and just tryna do me.","is_translator":false,"show_all_inline_media":false,"geo_enabled":true,"profile_link_color":"0084B4","location":"Tacoma, WA","listed_count":1,"profile_sidebar_border_color":"C0DEED","protected":false,"profile_image_url":"http:\\/\\/a3.twimg.com\\/profile_images\\/1208687030\\/Twitter_normal.jpg","lang":"en","name":"Quintin Brown","contributors_enabled":false,"statuses_count":340,"notifications":null,"profile_use_background_image":true,"screen_name":"QBrown15","id":173736821,"utc_offset":-28800,"friends_count":48,"profile_background_color":"C0DEED","url":"http:\\/\\/www.facebook.com\\/#!\\/profile.php?id=1195837597"},"in_reply_to_screen_name":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"contributors":null,"coordinates":null,"retweeted":false,"text":"\\"RT @barr253 I love fat asses. 
#honesttweet\\" <<< Naw, that\'s an #ObviousTweet","in_reply_to_user_id_str":null,"retweet_count":0,"in_reply_to_status_id":null,"id_str":"43025281954480130","source":"web","created_at":"Wed Mar 02 19:10:01 +0000 2011","truncated":false,"entities":{"user_mentions":[{"indices":[4,12],"id_str":"204626247","name":"John Barr","screen_name":"barr253","id":204626247}],"urls":[],"hashtags":[{"indices":[31,43],"text":"honesttweet"},{"indices":[73,86],"text":"ObviousTweet"}]},"geo":null,"place":{"bounding_box":{"type":"Polygon","coordinates":[[[-120.597461,46.966947],[-120.518162,46.966947],[-120.518162,47.029281],[-120.597461,47.029281]]]},"place_type":"city","name":"Ellensburg","country":"United States","attributes":{},"id":"c95cdb2a983262e5","full_name":"Ellensburg, WA","country_code":"US","url":"http:\\/\\/api.twitter.com\\/1\\/geo\\/id\\/c95cdb2a983262e5.json"},"favorited":false,"id":43025281954480130}\n')
('\n', '', '')
a = json.loads(temp[2])
Tested on Python 3.5 and 3.3.
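One likely explanation, going by the ('\n', '', '') tuple shown in the question: for a blank line, partition('|') finds no separator, so temp[2] is an empty string, and json.loads('') raises exactly "Expecting value: line 1 column 1 (char 0)". Below is a minimal sketch of skipping such lines explicitly instead of relying on a bare except. The function name parse_lines and the sample data are made up for illustration; they are not from your code or your file.

```python
import json


def parse_lines(lines):
    """Split each 'timestamp|json' line and parse the JSON part.

    Lines without a '|' separator (for example blank lines) are skipped,
    because partition() leaves an empty string in position 2 for them.
    """
    dates, tweets = [], []
    for line in lines:
        stamp, sep, payload = line.partition('|')
        if not sep or not payload.strip():
            continue  # blank line, or nothing after the separator
        try:
            tweet = json.loads(payload)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            continue  # malformed JSON: skip this line
        # Append both values only after parsing succeeded,
        # so the two lists stay the same length.
        dates.append(stamp)
        tweets.append(tweet)
    return dates, tweets


# Reproduce the error from the question with a blank line.
try:
    json.loads('\n'.partition('|')[2])  # temp[2] is '' here
except ValueError as exc:
    print(exc)  # Expecting value: line 1 column 1 (char 0)

# Hypothetical sample: one valid record followed by a blank line.
sample = ['20110302141002236|{"id": 1, "text": "hello"}\n', '\n']
dates, tweets = parse_lines(sample)
print(dates, tweets)
```

With a real file you could pass the open file object straight in, e.g. parse_lines(f) inside your with block. Catching only ValueError (rather than a bare except) keeps unrelated problems such as a KeyboardInterrupt or a typo in a variable name from being silently swallowed.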