Python "string indices must be integers" when trying to parse JSON

Time: 2018-12-16 08:44:14

Tags: python twitter

When I try to read my JSON files

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        t_id = json_text["id"]
        created_at = json_text["created_at"]
        text = json_text["text"]
        user_name = json_text["user"]["name"]
        location = json_text["user"]["location"]
        jsons_data.loc[index] = [t_id,created_at,text,user_name,location]

I get this error

TypeError: string indices must be integers

This is what is in my JSON file

"{\"created_at\":\"Wed Nov 07 06:01:26 +0000 2018\",\"id\":1060049570195853312,\"id_str\":\"1060049570195853312\",\"text\":\"RT @maulinaantika: Tempe Khot News:\\nDiduga pertemuan kontrak politik antara Polri & timses jokowi tahun 2014\\n\\nDalam foto tersebut terlihat\\u2026\",\"source\":\"\\u003ca href=\\\"https:\\/\\/mobile.twitter.com\\\" rel=\\\"nofollow\\\"\\u003eTwitter Lite\\u003c\\/a\\u003e\",\"truncated\"

When I try it like this

with open('tm.json', 'r') as f:
    for line in f:
        text = line.encode("utf-8")
        json_text = json.loads(text)

print(json_text)

I get this result

{"created_at":"Sat Dec 08 12:58:14 +0000 2018","id":1071388484609413120,...

Can someone guide me on how to fix this?

1 Answer:

Answer 0: (score: 0)

Given your code, the simplest explanation for this error is that:

json_text = json.load(json_file)

is giving you a string, not a dictionary. You then try to use it like a dictionary:

 t_id = json_text["id"]
 created_at = json_text["created_at"]
 text = json_text["text"]
 user_name = json_text["user"]["name"]
 location = json_text["user"]["location"] 
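The sample in the question begins with a quote and escapes every inner quote, which suggests the file holds a JSON string whose content is itself JSON (that is, it was encoded twice). A minimal sketch of that situation, using a made-up tweet rather than the asker's data:

```python
import json

# Hypothetical tweet; dumping it twice mimics a double-encoded file.
tweet = {"id": 1, "text": "hello", "user": {"name": "alice", "location": None}}
double_encoded = json.dumps(json.dumps(tweet))

# The first decode only removes the outer layer, so the result is still a str.
first_pass = json.loads(double_encoded)

try:
    first_pass["id"]          # indexing a str with a str key
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # the "string indices must be integers" error

# A second decode yields the dictionary the original code expects.
second_pass = json.loads(first_pass)
tweet_id = second_pass["id"]
```

If this is indeed what happened, the cleaner long-term fix is to stop encoding the data twice when the files are written.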

You can wrap the lookups in try: ... except Exception as e: ... to catch this and find out which file is the culprit. Then you can fix the JSON data:

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = json.load(json_file)
        try:
            t_id = json_text["id"]
            created_at = json_text["created_at"]
            text = json_text["text"]
            user_name = json_text["user"]["name"]
            location = json_text["user"]["location"]
            jsons_data.loc[index] = [t_id,created_at,text,user_name,location]
        except TypeError as te:
            print("Bad json - not a dict: ", os.path.join(path_to_json, js))
            print("Json was deserialized into a : ", type(json_text) )
            break  # exit the for loop, fix your data, repeat until it works
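If the files cannot be regenerated, one possible workaround is to keep decoding while the result is still a string. The helper name `loads_unwrapped` and the `max_depth` limit below are illustrative assumptions, not part of the `json` module:

```python
import json

def loads_unwrapped(raw, max_depth=3):
    """Decode JSON text, decoding again while the result is still a str.

    Hypothetical helper: max_depth bounds how many extra layers are unwrapped.
    A string that is not valid JSON raises json.JSONDecodeError.
    """
    data = json.loads(raw)
    for _ in range(max_depth):
        if not isinstance(data, str):
            break
        data = json.loads(data)
    return data

# Made-up example: a dict that was serialized twice comes back as a dict.
sample = json.dumps(json.dumps({"id": 1, "user": {"name": "alice"}}))
decoded = loads_unwrapped(sample)
```

In the loop above, this would mean replacing `json.load(json_file)` with `loads_unwrapped(json_file.read())`.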

See also: