When I try to read my JSON files:
for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        t_id = json_text["id"]
        created_at = json_text["created_at"]
        text = json_text["text"]
        user_name = json_text["user"]["name"]
        location = json_text["user"]["location"]
        jsons_data.loc[index] = [t_id, created_at, text, user_name, location]
I get this error:
TypeError: string indices must be integers
This is what is in my JSON file:
"{\"created_at\":\"Wed Nov 07 06:01:26 +0000 2018\",\"id\":1060049570195853312,\"id_str\":\"1060049570195853312\",\"text\":\"RT @maulinaantika: Tempe Khot News:\\nDiduga pertemuan kontrak politik antara Polri & timses jokowi tahun 2014\\n\\nDalam foto tersebut terlihat\\u2026\",\"source\":\"\\u003ca href=\\\"https:\\/\\/mobile.twitter.com\\\" rel=\\\"nofollow\\\"\\u003eTwitter Lite\\u003c\\/a\\u003e\",\"truncated\"
When I try it like this:
with open('tm.json', 'r') as f:
    for line in f:
        text = line.encode("utf-8")
        json_text = json.loads(text)
        print(json_text)
I get this result:
{"created_at":"Sat Dec 08 12:58:14 +0000 2018","id":1071388484609413120,...
Can someone guide me on how to solve this?
Answer 0 (score: 0)
Given your code, the simplest explanation for this error is that:
json_text = json.load(json_file)
is giving you a string, not a dictionary. You then try to use it like a dictionary:
t_id = json_text["id"]
created_at = json_text["created_at"]
text = json_text["text"]
user_name = json_text["user"]["name"]
location = json_text["user"]["location"]
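The failure can be reproduced in isolation. The sketch below assumes the file holds a JSON object that was serialized twice, which is what the leading quote and escaped quotes in the sample from the question suggest; the sample data here is hypothetical.

```python
import json

# Hypothetical sample: a JSON object serialized twice, mimicking the
# escaped content shown in the question.
inner = json.dumps({"id": 1, "text": "hello"})
double_encoded = json.dumps(inner)

# A single decode only removes the outer layer, so the result is a str.
decoded = json.loads(double_encoded)
print(type(decoded))

error_message = None
try:
    decoded["id"]  # indexing a str with a str key
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Under that assumption, the lookup fails because a string can only be indexed by integer positions or slices, not by keys.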
You can wrap the lookups in a try: ... except ... block
to catch this and report the name of the culprit file. Then you can fix the JSON data:
for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        try:
            t_id = json_text["id"]
            created_at = json_text["created_at"]
            text = json_text["text"]
            user_name = json_text["user"]["name"]
            location = json_text["user"]["location"]
            jsons_data.loc[index] = [t_id, created_at, text, user_name, location]
        except TypeError as te:
            print("Bad json - not a dict: ", os.path.join(path_to_json, js))
            print("Json was deserialized into a : ", type(json_text))
            break  # exit the for loop, fix your data, repeat until it works
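If the data cannot be fixed at the source, one possible workaround is to decode a second time whenever the first pass yields a string. This is only a sketch, assuming the files are double-encoded as the sample in the question suggests; `load_tweet` and the sample record are hypothetical names introduced for illustration.

```python
import json


def load_tweet(raw_text):
    """Decode JSON text, decoding once more if the first pass yields a str."""
    data = json.loads(raw_text)
    if isinstance(data, str):
        # The payload was itself a JSON-encoded string: decode it again.
        data = json.loads(data)
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object, got {type(data).__name__}")
    return data


# Hypothetical double-encoded record with the fields used in the question.
sample = json.dumps(json.dumps({
    "id": 1,
    "created_at": "Wed Nov 07 06:01:26 +0000 2018",
    "text": "hi",
    "user": {"name": "a", "location": None},
}))

tweet = load_tweet(sample)
row = [
    tweet["id"],
    tweet["created_at"],
    tweet["text"],
    tweet["user"]["name"],
    tweet["user"]["location"],
]
print(row)
```

In the loop above, this would stand in for the `json.load(json_file)` call by passing `json_file.read()` to the helper. Files that are already plain JSON objects pass through unchanged.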
See: