I have a series of values like this in a file.
{
"canceled": false,
"complete_time": "2017-06-08T15:55:45.616942",
"create_time": "2017-06-08T15:55:44.370344",
"entity_list": [
{
"entity_type": 2,
"uuid": "xxxxx"
},
{
"entity_name": "",
"uuid": "xxxx"
}
],
"last_updated_time": "2017-06-08T15:55:45.616942",
"progress_status": 3,
"request": {
"arg": {
"parent_task_uuid": "xxx",
"task_uuid": "xxxx",
"transition": 2,
"vm_uuid": "xxx"
},
"method_name": """""
},
"response": {
"error_code": 0,
"error_detail": "",
"ret": {}
},
"start_time": "2017-06-08T15:55:44.452703",
"uuid": "xxxxx"
}
{
"canceled": false,
"complete_time": "2017-06-08T15:55:45.616942",
"create_time": "2017-06-08T15:55:44.370344",
"entity_list": [
{
"entity_type": 2,
"uuid": "xxxxx"
},
{
"entity_name": "",
"uuid": "xxxx"
}
],
"last_updated_time": "2017-06-08T15:55:45.616942",
"progress_status": 3,
"request": {
"arg": {
"parent_task_uuid": "xxx",
"task_uuid": "xxxx",
"transition": 2,
"vm_uuid": "xxx"
},
"method_name": """""
},
"response": {
"error_code": 0,
"error_detail": "",
"ret": {}
},
"start_time": "2017-06-08T15:55:44.452703",
"uuid": "xxxxx"
}
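I want to sort these individual {} blocks by their 'last_updated_time' field. If the file were valid JSON, the Python code I have written would handle it, but since this is not valid JSON, how can I make it work?

import json
import sys

lines = []
while True:
    line = sys.stdin.readline()
    if not line: break
    line = line.strip()
    json_obj = json.loads(line)
    lines.append(json_obj)
lines = sorted(lines, key=lambda k: k['last_updated_time'], reverse=True)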
Answer 0 (score: 0)
You can try accumulating lines until they form a valid JSON object. Put all of these objects in a list, then sort it the way you already know.
Something like:

import sys

all_jsons = []
current_lines = []
counter = 0  # current brace nesting depth
while True:
    line = sys.stdin.readline()
    if not line:
        break
    line = line.strip()
    if not line:
        continue  # skip blank lines between blocks
    current_lines.append(line)
    nb_inc = line.count("{")
    nb_dec = line.count("}")
    counter += nb_inc
    counter -= nb_dec
    if counter == 0:
        # As many opening braces as closing ones: this is a complete block
        all_jsons.append("\n".join(current_lines))
        current_lines = []
# sort your JSON blocks
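Here I accumulate lines until I have seen as many closing braces as opening ones. At that point I join all the accumulated lines into a single JSON-formatted string.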
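The loop above stops at a list of strings, so here is a minimal sketch of the remaining steps: parsing each block and sorting. The function names `split_blocks` and `parse_and_sort` are made up for illustration. It assumes the only malformed token in the file is the five-quote value `"""""`, and that no string value contains a brace (which would throw off the count).

```python
import json


def split_blocks(text):
    """Split text into top-level {...} blocks by counting braces line by line."""
    blocks, current, depth = [], [], 0
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines between blocks
        current.append(line)
        depth += line.count("{") - line.count("}")
        if depth == 0:
            blocks.append("\n".join(current))
            current = []
    return blocks


def parse_and_sort(text):
    # Assumption: the only invalid token is the five-quote value """"",
    # rewritten here as a JSON string holding one escaped quote.
    fixed = text.replace('"""""', r'"\""')
    objs = [json.loads(block) for block in split_blocks(fixed)]
    return sorted(objs, key=lambda o: o["last_updated_time"], reverse=True)
```

With input on standard input, this could be called as `parse_and_sort(sys.stdin.read())`. The ISO-formatted timestamps sort correctly as plain strings, so no date parsing is needed.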
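Answer 1 (score: 0)
So your problem isn't really sorting these objects; your last line should work fine. The problem is how to read a slightly malformed JSON file. Here is a hacky way to ingest the file. You could also search for a JSON package that is more tolerant of formatting deviations.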
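import json

FILE_PATH = 'tasks.txt'  # placeholder: path to your input file

chunks = list()
temp_lines = list()
with open(FILE_PATH) as fp:
    for line in fp:
        # Turn the malformed """"" into a valid JSON string holding one escaped quote
        line = line.replace(r'"""""', r'"\""')
        temp_lines.append(line)
        # Assumes only a top-level closing brace starts at column 0
        if line.startswith('}'):
            chunks.append(json.loads(''.join(temp_lines)))
            temp_lines.clear()
chunks = sorted(chunks, key=lambda x: x['last_updated_time'], reverse=True)

To restore the original horrible formatting afterwards, two more lines are needed to undo the replacement when writing the chunks back out.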
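The lines for restoring the original formatting are not shown in this answer, so the following is only one possible sketch of that step, not the answerer's code. The helper name `dump_chunks`, the indent width, and the sorted key order are all assumptions. It also assumes that a string consisting of a single quote only ever comes from the earlier `"""""` substitution.

```python
import json


def dump_chunks(chunks):
    """Serialize parsed chunks back into the concatenated, non-standard layout."""
    parts = []
    for chunk in chunks:
        # indent and sort_keys are guesses at the original file's layout
        text = json.dumps(chunk, indent=2, sort_keys=True)
        # Undo the earlier substitution: "\"" goes back to the five-quote form.
        parts.append(text.replace(r'"\""', '"""""'))
    return "\n".join(parts) + "\n"
```

The result is again not valid JSON, so it should only be written out if whatever consumes the file expects that format.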