I have a JSON string that looks like this:
b'[{"status_verify":"0","dejatime_firstpaint":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime_fullpage":"0","dt_status":"2018-11-25 00:00:21","notified":"0","resptime_connect":"0.08799999952316284","http_resp_length":"0","resptime_firstbyte":"0.6819999814033508","obj_location":"31","max_fullpage_status":"-1","resptime_dns":"0","dejatime_pageload":"0","status":"0","resptime_redirect":"0","capture_exists":"0","resptime_content":"0.08799999952316284","rs_has_dejatime":"0","obj_cust":"117396","obj_device":"470630","childnodes":"0","deja_branched":"0","http_status":"HTTP 200 OK","info_msg":null,"device_descrip":"Get Public Datasets","dejatime_domload":"0","user_experience":"0","location_descrip":"San Francisco, California","dejatime_afttime":"0","resptime":"0.8579999804496765","obj_devlog":"152050515","test_cnt":"0","status_warning":"0"},{"status_verify":"0"
etc, etc, etc.
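I'm trying to pick out a few fields and load them into a DataFrame, or simply load everything into a DataFrame. The problem is that the records are nested inside an outer list, and I don't know how to get the actual field names out of this huge string.
Here is what I tried: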
loaded_json = json.loads(json_data)
for x in loaded_json:
    print("%s: %d" % (x, loaded_json[x]))
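This raises the following error: TypeError: list indices must be integers or slices, not dict
I assume this should be simple, but I'm not sure how to proceed, and even after searching Google for a while I couldn't find a solution.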
Answer 0 (score: 1)
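This happens because the top-level element of your JSON is a list, not a dict. Iterating over it gives you each record as a dict, and a dict cannot be used as a list index. Index into the list by position first, then read each field from the resulting dict by key: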
import json
x = b'[{"status_verify":"0","dejatime_firstpaint":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime_fullpage":"0","dt_status":"2018-11-25 00:00:21","notified":"0","resptime_connect":"0.08799999952316284","http_resp_length":"0","resptime_firstbyte":"0.6819999814033508","obj_location":"31","max_fullpage_status":"-1","resptime_dns":"0","dejatime_pageload":"0","status":"0","resptime_redirect":"0","capture_exists":"0","resptime_content":"0.08799999952316284","rs_has_dejatime":"0","obj_cust":"117396","obj_device":"470630","childnodes":"0","deja_branched":"0","http_status":"HTTP 200 OK","info_msg":null,"device_descrip":"Get Public Datasets","dejatime_domload":"0","user_experience":"0","location_descrip":"San Francisco, California","dejatime_afttime":"0","resptime":"0.8579999804496765","obj_devlog":"152050515","test_cnt":"0","status_warning":"0"}]'
y = json.loads(x)
print(y[0]['status_verify'])
# Output:
# 0
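To get the field names and walk through every record, something along these lines may help. It is a minimal sketch: `raw` is a shortened copy of one record from the question, and the variable names `records` and `field_names` are only illustrative. Note that the values are strings (or null), so `%s` is used here instead of the `%d` from the question.

```python
import json

# Shortened sample in the same shape as the question's data:
# a JSON array whose elements are objects.
raw = b'[{"status_verify":"0","fullip":"104.25.229.34","resptime":"0.8579999804496765","info_msg":null}]'

records = json.loads(raw)  # a list of dicts, one dict per record

# The field names are the keys of any one record.
field_names = list(records[0].keys())
print(field_names)

# Iterating over the list yields dicts, so read each field by key.
for record in records:
    for key, value in record.items():
        print("%s: %s" % (key, value))
```

JSON `null` comes back as Python `None`, which `%s` prints without error.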
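Answer 1 (score: 1)
A one-liner that loads the data and puts only some of the fields into a DataFrame (assuming json and pandas as pd are imported, and x is the byte string from the previous answer) would be: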
df = pd.DataFrame(json.loads(x), columns=['status_verify', 'fullip', 'ctl_devlog'])
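As a fuller sketch of the same idea, the snippet below builds one DataFrame with every field and one with a chosen subset. `raw` is a shortened copy of one record from the question, and the names `df_all` and `df_some` are just placeholders. Because every value in this feed is a string, the last step converts one column to a number with `pd.to_numeric`, which is an optional extra rather than part of the original answer.

```python
import json

import pandas as pd

# Shortened sample taken from the question's data.
raw = b'[{"status_verify":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime":"0.8579999804496765"}]'

records = json.loads(raw)  # list of dicts

# Every field becomes a column, one row per record.
df_all = pd.DataFrame(records)

# Only the listed fields are kept, in the given order.
df_some = pd.DataFrame(records, columns=["status_verify", "fullip", "resptime"])

# The JSON stores numbers as strings, so convert where needed.
df_some["resptime"] = pd.to_numeric(df_some["resptime"])

print(df_all.columns.tolist())
print(df_some.dtypes)
```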
Good luck with your project!