Python: parse a JSON table to look up values and increment counters

Time: 2017-06-15 02:03:10

Tags: python json

I have looked for a solution to this here without success; otherwise I would have posted more of my own attempt, but here is the premise...

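I am trying to use Python to parse a JSON table containing event times and event names. Each time a particular "name" occurs, I want its counter for that date to increase by 1, so that I can count how many times the event happened each day (the minutes don't matter). There are also several JSON files, ending in .json, in a single folder that I would like to iterate over.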

An example of the JSON is:

{
  "logs" : [ {
    "other" : "xVKNXCVNsk",
    "time" : "2017-06-15T01:31:50.412Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Alpha Beta: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2017-06-15T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2017-06-15T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }]
}

So for this example, the result would be:

"Alpha Beta: Bingo": 1 for 2017-06-15
"Terra Zaba: Bingo": 2 for 2017-06-15

Any help is greatly appreciated!

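1 Answer:

Answer 0: (score: 0)

First, you need to reformat the JSON data; what was provided here contains some errors.

To answer the question, I formatted the JSON file as follows.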

{
  "logs" : [ {
    "other" : "xVKNXCVNsk",
    "time" : "2017-06-15T01:31:50.412Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Alpha Beta: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2017-06-15T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2017-06-15T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }]
}

With the JSON file formatted, I can try to answer the question.

import json

PATH = r"E:\temp\temp.json"

def parse_func(f):
    """
    Parse the JSON file.
    :param f: path of the JSON file to parse
    :return: the result dict, mapping each name to "<count> for <date>"
    """
    parse_Dict = dict()
    parse_List = list()
    num_Dict = dict()

    # load the JSON data into `data`
    with open(f) as data_file:
        data = json.load(data_file)

    # collect each entry's "name" followed by the date part of its "time"
    for entry in data["logs"]:
        parse_List.append(entry["name"])
        parse_List.append(entry["time"].split("T")[0])

    print(parse_List)

    # walk the flat list in (name, date) pairs
    for ii in range(0, len(parse_List), 2):
        # map the "name" to its date (a later date overwrites an earlier one)
        parse_Dict[parse_List[ii]] = parse_List[ii + 1]

        # count how many times each "name" occurs
        if parse_List[ii] not in num_Dict:
            num_Dict[parse_List[ii]] = 1
        else:
            num_Dict[parse_List[ii]] = num_Dict[parse_List[ii]] + 1
    print(parse_Dict)
    print(num_Dict)

    # format the result_Dict
    result_Dict = dict()
    for k in parse_Dict.keys():
        result_Dict[k] = "%d for %s" % (num_Dict[k], parse_Dict[k])

    print(result_Dict)
    return result_Dict

parse_func(PATH)

The output of the code is:

['Alpha Beta: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15']
{'Alpha Beta: Bingo': '2017-06-15', 'Terra Zaba: Bingo': '2017-06-15'}
{'Alpha Beta: Bingo': 1, 'Terra Zaba: Bingo': 2}
{'Alpha Beta: Bingo': '1 for 2017-06-15', 'Terra Zaba: Bingo': '2 for 2017-06-15'}

While writing the code I found a problem that you will need to decide on yourself.

Because the keys in a dict() must be unique, it is unclear how to record an element that has the same name but a different time.
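One possible way around the unique-key limitation, offered as a sketch rather than as part of the original answer, is to key the counter on the (name, date) pair instead of on the name alone. The helper name `count_events` is hypothetical, and the sketch assumes every entry has "name" and "time" fields shaped like the ones in this post. The sample entries reuse the name and time values from the modified JSON in this answer.

```python
from collections import Counter


def count_events(logs):
    """Count occurrences per (name, date) pair.

    `logs` is assumed to be the list stored under the "logs" key.
    """
    counts = Counter()
    for entry in logs:
        # keep only the date part, dropping everything from "T" onward
        date = entry["time"].split("T")[0]
        counts[(entry["name"], date)] += 1
    return counts


# same name on two different dates, plus one other name
logs = [
    {"time": "2017-06-15T01:31:50.412Z", "name": "Alpha Beta: Bingo"},
    {"time": "2017-06-15T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
    {"time": "2016-06-1T01:31:37.229Z", "name": "Terra Zaba: Bingo"},
]
for (name, date), n in count_events(logs).items():
    print('"%s": %d for %s' % (name, n, date))
```

With a tuple key, the same name on two different dates produces two separate counters, so neither date overwrites the other.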

For example, suppose I change the JSON file.

{
  "logs" : [ {
    "other" : "xVKNXCVNsk",
    "time" : "2017-06-15T01:31:50.412Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Alpha Beta: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2017-06-15T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }, {
    "other" : "xVKxXCbNsk",
    "time" : "2016-06-1T01:31:37.229Z",
    "other2" : "xVKxXCbNsk",
    "name" : "Terra Zaba: Bingo"
  }]
}

Note that the line "time" : "2016-06-1T01:31:37.229Z", has changed.

The output becomes:

['Alpha Beta: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2017-06-15', 'Terra Zaba: Bingo', '2016-06-1']
{'Terra Zaba: Bingo': '2016-06-1', 'Alpha Beta: Bingo': '2017-06-15'}
{'Terra Zaba: Bingo': 2, 'Alpha Beta: Bingo': 1}
{'Terra Zaba: Bingo': '2 for 2016-06-1', 'Alpha Beta: Bingo': '1 for 2017-06-15'}

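Note the difference from the earlier output: "Terra Zaba: Bingo" is now reported as "2 for 2016-06-1", because the later date overwrote the earlier one while the count still includes both entries.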
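The question also mentions several .json files in a single folder, which the code above does not cover because it reads one file at a time. The following is a minimal sketch of how that could look, not a definitive implementation. The function name `count_events_in_folder` is hypothetical, and the sketch assumes that each file has a top-level "logs" list, that the files are UTF-8 encoded, and that counting per (name, date) pair is the desired behaviour.

```python
import json
from collections import Counter
from pathlib import Path


def count_events_in_folder(folder):
    """Count (name, date) pairs across every *.json file in `folder`."""
    totals = Counter()
    # sorted() only makes the processing order predictable
    for path in sorted(Path(folder).glob("*.json")):
        # assumption: the files are UTF-8 encoded
        with open(path, encoding="utf-8") as data_file:
            data = json.load(data_file)
        for entry in data["logs"]:
            # keep only the date part of the timestamp
            date = entry["time"].split("T")[0]
            totals[(entry["name"], date)] += 1
    return totals
```

Calling `count_events_in_folder(r"E:\temp")` would return a single Counter covering all the files in that folder; files without a .json extension are skipped by the glob pattern.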