Need to load a JSON string into a DataFrame

Time: 2018-11-27 22:44:56

Tags: python json python-3.x

I have a JSON string that looks like this:

b'[{"status_verify":"0","dejatime_firstpaint":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime_fullpage":"0","dt_status":"2018-11-25 00:00:21","notified":"0","resptime_connect":"0.08799999952316284","http_resp_length":"0","resptime_firstbyte":"0.6819999814033508","obj_location":"31","max_fullpage_status":"-1","resptime_dns":"0","dejatime_pageload":"0","status":"0","resptime_redirect":"0","capture_exists":"0","resptime_content":"0.08799999952316284","rs_has_dejatime":"0","obj_cust":"117396","obj_device":"470630","childnodes":"0","deja_branched":"0","http_status":"HTTP 200 OK","info_msg":null,"device_descrip":"Get Public Datasets","dejatime_domload":"0","user_experience":"0","location_descrip":"San Francisco, California","dejatime_afttime":"0","resptime":"0.8579999804496765","obj_devlog":"152050515","test_cnt":"0","status_warning":"0"},{"status_verify":"0"

etc, etc, etc.

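I'm trying to pick out a few fields and load them into a DataFrame, or just load everything into a DataFrame. The problem is that these all look nested, and I can't figure out how to get the actual field names out of this huge string.

I tried: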

loaded_json = json.loads(json_data)
for x in loaded_json:
    print("%s: %d" % (x, loaded_json[x]))

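This gives: TypeError: list indices must be integers or slices, not dict. I assume this should be simple, but I'm not sure how to proceed, and I couldn't find a solution even after Googling for a while.

2 answers:

Answer 0: (score: 1)

This happens because your top-level element is a list, so you need to index into the list first rather than treating it as a dict: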

import json

x = b'[{"status_verify":"0","dejatime_firstpaint":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime_fullpage":"0","dt_status":"2018-11-25 00:00:21","notified":"0","resptime_connect":"0.08799999952316284","http_resp_length":"0","resptime_firstbyte":"0.6819999814033508","obj_location":"31","max_fullpage_status":"-1","resptime_dns":"0","dejatime_pageload":"0","status":"0","resptime_redirect":"0","capture_exists":"0","resptime_content":"0.08799999952316284","rs_has_dejatime":"0","obj_cust":"117396","obj_device":"470630","childnodes":"0","deja_branched":"0","http_status":"HTTP 200 OK","info_msg":null,"device_descrip":"Get Public Datasets","dejatime_domload":"0","user_experience":"0","location_descrip":"San Francisco, California","dejatime_afttime":"0","resptime":"0.8579999804496765","obj_devlog":"152050515","test_cnt":"0","status_warning":"0"}]'


y = json.loads(x)
print(y[0]['status_verify'])

# output:
# 0
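The loop from the question can be fixed along the same lines: iterate over the list to get each record (a dict), then over that record's key/value pairs. A minimal sketch, assuming a single record trimmed to three fields from the sample in the question (the shortened sample is for illustration only):

```python
import json

# Shortened sample: one record, trimmed to three fields from the question's data.
json_data = b'[{"status_verify":"0","fullip":"104.25.229.34","info_msg":null}]'

# json.loads accepts bytes directly on Python 3.6+.
loaded_json = json.loads(json_data)

# The top-level object is a list; each element is a dict, so the field
# names are simply the keys of any one record.
field_names = list(loaded_json[0].keys())
print(field_names)

for record in loaded_json:
    for key, value in record.items():
        # %s rather than %d: the values are strings (or None), not integers.
        print("%s: %s" % (key, value))
```

Note that the data is a flat list of dicts rather than deeply nested, so no recursive traversal is needed.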

Answer 1: (score: 1)

A one-liner that loads the data and puts only some of the fields into a DataFrame would be:

import json
import pandas as pd

# x is the bytes JSON string shown in the other answer
df = pd.DataFrame(json.loads(x), columns=['status_verify', 'fullip', 'ctl_devlog'])
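To load everything instead of a subset, the columns argument can simply be left out. A minimal sketch, assuming a sample record trimmed to five fields from the question's data; the numeric conversion at the end is an optional extra step, not part of the original answer:

```python
import json
import pandas as pd

# Shortened sample: one record, trimmed to five fields from the question's data.
x = b'[{"status_verify":"0","fullip":"104.25.229.34","ctl_devlog":"69-131041194","resptime":"0.8579999804496765","info_msg":null}]'

records = json.loads(x)  # a list of dicts, one dict per row

# Without `columns`, every key becomes a column.
df_all = pd.DataFrame(records)

# With `columns`, only the named fields are kept.
df_some = pd.DataFrame(records, columns=["status_verify", "fullip", "ctl_devlog"])

# All values arrive as strings; convert a column when numbers are needed.
df_all["resptime"] = pd.to_numeric(df_all["resptime"])

print(df_all.columns.tolist())
print(df_some)
```

Because the JSON stores numbers as quoted strings, columns such as resptime stay as text unless they are converted explicitly.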

Good luck with your project!