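I have been trying to find a solution here; otherwise I would post more of what I have tried, but......
I am trying to use Python to parse JSON tables that contain an event time and an event name. Each time a particular "name" appears, I want its counter for that date to increase by 1, so that I can count how many times the event happened per day (the minutes do not matter). There are also several JSON files in a single folder that I want to iterate over (the ones ending in .json).
An example of the JSON is: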
{
"logs" : [ {
"other" : "xVKNXCVNsk",
"time" : "2017-06-15T01:31:50.412Z",
"other2" : "xVKxXCbNsk",
"name" : "Alpha Beta: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2017-06-15T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2017-06-15T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}]
}
So for this example, the result would be:
"Alpha Beta: Bingo": 1 for 2017-06-15
"Terra Zaba: Bingo": 2 for 2017-06-15
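Any help is much appreciated!
Answer 0 (score: 0)
First, you need to reformat the JSON data; what is provided here contains some errors.
To answer the question, I formatted the JSON file as follows.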
{
"logs" : [ {
"other" : "xVKNXCVNsk",
"time" : "2017-06-15T01:31:50.412Z",
"other2" : "xVKxXCbNsk",
"name" : "Alpha Beta: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2017-06-15T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2017-06-15T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}]
}
With the JSON file formatted, I can attempt an answer.
import json

PATH = r"E:\temp\temp.json"


def parse_func(f):
    """
    Parse the JSON file.
    :param f: path of the file to parse
    :return: the result dict
    """
    parse_Dict = dict()
    parse_List = list()
    num_Dict = dict()
    # load the JSON data into `data`
    with open(f) as data_file:
        data = json.load(data_file)
    # collect each "name" followed by the date part of its "time"
    for i in range(0, len(data["logs"])):
        parse_List.append(data["logs"][i]["name"])
        parse_List.append(data["logs"][i]["time"].split("T")[0])
    print(parse_List)
    # walk the flat list two items at a time: name, then date
    for ii in range(0, len(parse_List), 2):
        # map "name" -> date (a later date overwrites an earlier one)
        parse_Dict[parse_List[ii]] = parse_List[ii + 1]
        # map "name" -> number of occurrences
        if parse_List[ii] not in num_Dict:
            num_Dict[parse_List[ii]] = 1
        else:
            num_Dict[parse_List[ii]] = num_Dict[parse_List[ii]] + 1
    print(parse_Dict)
    print(num_Dict)
    # format the result dict as "<count> for <date>"
    result_Dict = dict()
    for k in parse_Dict.keys():
        result_Dict[k] = "%d for %s" % (num_Dict[k], parse_Dict[k])
    print(result_Dict)
    return result_Dict


parse_func(PATH)
The code outputs:
['Alpha Beta: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15']
{'Alpha Beta: Bingo': '2017-06-15', 'Terra Zaba: Bingo': '2017-06-15'}
{'Alpha Beta: Bingo': 1, 'Terra Zaba: Bingo': 2}
{'Alpha Beta: Bingo': '1 for 2017-06-15', 'Terra Zaba: Bingo': '2 for 2017-06-15'}
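As a side note, the per-name tally that the second loop builds by hand could also be written with `collections.Counter` from the standard library. This is only a minimal sketch: the `logs` list below is an in-memory copy of the sample data (with the unused fields left out), not something read from the file, and `name_counts` is a name chosen here for illustration.

```python
from collections import Counter

# In-memory copy of the sample "logs" list (assumption: only the
# "time" and "name" fields matter for counting).
logs = [
    {"time": "2017-06-15T01:31:50.412Z", "name": "Alpha Beta: Bingo"},
    {"time": "2017-06-15T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
    {"time": "2017-06-15T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
]

# Counter tallies how many times each name occurs.
name_counts = Counter(entry["name"] for entry in logs)
print(dict(name_counts))
```

This should give the same counts as the third line of the output above, without the manual "if key not in dict" bookkeeping.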
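While writing the code I ran into an issue that you will have to decide on yourself.
Because the keys of a dict() must be unique, it is unclear how to record entries that share the same name but have a different time.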
For example, suppose I change the JSON file:
{
"logs" : [ {
"other" : "xVKNXCVNsk",
"time" : "2017-06-15T01:31:50.412Z",
"other2" : "xVKxXCbNsk",
"name" : "Alpha Beta: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2017-06-15T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}, {
"other" : "xVKxXCbNsk",
"time" : "2016-06-1T01:31:37.229Z",
"other2" : "xVKxXCbNsk",
"name" : "Terra Zaba: Bingo"
}]
}
Note that the line "time" : "2016-06-1T01:31:37.229Z", has changed.
The output becomes:
['Alpha Beta: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2016-06-1']
{'Terra Zaba: Bingo': '2016-06-1', 'Alpha Beta: Bingo': '2017-06-15'}
{'Terra Zaba: Bingo': 2, 'Alpha Beta: Bingo': 1}
{'Terra Zaba: Bingo': '2 for 2016-06-1', 'Alpha Beta: Bingo': '1 for 2017-06-15'}
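Note the difference from the earlier output: both "Terra Zaba: Bingo" entries are now reported under 2016-06-1, because the later date overwrites the earlier one in the dict while the count is still kept per name only.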
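One possible way around the unique-key problem is to key the counter on the (date, name) pair instead of on the name alone. The following is a minimal sketch under that assumption, not the only way to do it. The helper names `count_events` and `count_folder` are made up for illustration, and `count_folder` assumes every `*.json` file in the folder has the same top-level "logs" list as the sample.

```python
import json
from collections import Counter
from pathlib import Path


def count_events(logs):
    """Count occurrences per (date, name) pair (hypothetical helper)."""
    counts = Counter()
    for entry in logs:
        # Keep only the date part, as in the code above.
        day = entry["time"].split("T")[0]
        counts[(day, entry["name"])] += 1
    return counts


def count_folder(folder):
    """Sum the per-day counts over every *.json file in `folder`.

    Assumption: each file holds a top-level "logs" list like the sample.
    """
    total = Counter()
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as data_file:
            data = json.load(data_file)
        total.update(count_events(data["logs"]))
    return total


# In-memory copy of the modified sample (unused fields left out).
sample = [
    {"time": "2017-06-15T01:31:50.412Z", "name": "Alpha Beta: Bingo"},
    {"time": "2017-06-15T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
    {"time": "2016-06-1T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
]

for (day, name), n in sorted(count_events(sample).items()):
    print('"%s": %d for %s' % (name, n, day))
```

With the modified sample, each of the three (date, name) pairs is counted once, so the two "Terra Zaba: Bingo" entries are no longer merged under a single date. Calling `count_folder` with the path of the folder would add up the same counts across all of its `.json` files.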