Question

I have a list of test failures, like this:

all_failures = [
            'test1/path/to/test1/log/failure_reason1',
            'test1/path/to/test1/log/failure_reason2',
            'test2/path/to/test2/log/failure_reason1',
            'test2/path/to/test2/log/failure_reason2',
            'test3/path/to/test3/log/failure_reason1',
            'test4/path/to/test4/log/failure_reason1',
        ]

I am trying to build a JSON-like object by parsing each failure in the list. So far I have written the following code:

failure_details_dict = {}
test_details_dict = {}

for failure in all_failures:
    data = failure.split('/',1)
    test = data[0]
    failure_details_dict[test] = []

    data = '/' + data[1]
    data = data.rsplit('/', 1)

    test_details_dict['path'] = data[0] + '/'
    test_details_dict['reason'] = data[1]

    failure_details_dict[test].append(test_details_dict)

    test_details_dict = {}  

for key,value in failure_details_dict.items():
    print(key)
    print(value)
    print()

The output I get is:

test4
[{'reason': 'failure_reason1', 'path': '/path/to/test4/log/'}]

test3
[{'reason': 'failure_reason1', 'path': '/path/to/test3/log/'}]

test1
[{'reason': 'failure_reason2', 'path': '/path/to/test1/log/'}]

test2
[{'reason': 'failure_reason2', 'path': '/path/to/test2/log/'}]

The expected output is:

{
    "test1": [
                {
                    "path": "/path/to/test1/log/",
                    "reason": "failure_reason1" 
                },
                {
                    "path": "/path/to/test1/log/",
                    "reason": "failure_reason2"     
                }

            ],
    "test2": [
                {
                    "path": "/path/to/test2/log/",
                    "reason": "failure_reason1" 
                },
                {
                    "path": "/path/to/test2/log/",
                    "reason": "failure_reason2"     
                }
            ],
    "test3": [
                {
                    "path": "/path/to/test3/log/",
                    "reason": "failure_reason1" 
                },
            ],
    "test4": [
                {
                    "path": "/path/to/test4/log/",
                    "reason": "failure_reason1"
                },
            ]
}

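As you can see, I am unable to add the second path and failure reason to the same key. For example, test1 and test2 each have two failure reasons.

Can someone help me understand what I am missing? Thanks!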

Answer 1

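Cause

On every loop iteration you overwrite failure_details_dict[test] with a new empty list, so only the last failure for each test survives.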
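
A minimal sketch of the effect, using a shortened list of hypothetical (test, reason) pairs in place of all_failures. The names entries and results are made up for this illustration:

```python
# Illustrative only: `entries` and `results` are hypothetical names,
# not part of the original code.
entries = [("test1", "failure_reason1"), ("test1", "failure_reason2")]

results = {}
for test, reason in entries:
    results[test] = []            # resets the list on every iteration
    results[test].append(reason)

print(results)  # {'test1': ['failure_reason2']}
```

The first reason is lost because the list it was appended to is replaced on the second iteration.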

Solution

You should create the list only once per key.
There are several ways to do that.

The non-Pythonic way (not recommended):

if test not in failure_details_dict:
    failure_details_dict[test] = []

Replace the assignment with a dict.setdefault call. This approach does not affect other interactions with failure_details_dict.

failure_details_dict.setdefault(test, [])  # instead of failure_details_dict[test] = []
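
As a quick illustration (the variable names below are made up for this sketch), dict.setdefault inserts and returns the default only when the key is missing, and returns the existing value otherwise:

```python
groups = {}

first = groups.setdefault("test1", [])    # key missing: inserts [] and returns it
first.append("failure_reason1")

second = groups.setdefault("test1", [])   # key present: returns the same list
second.append("failure_reason2")

print(groups)           # {'test1': ['failure_reason1', 'failure_reason2']}
print(first is second)  # True
```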

Use collections.defaultdict instead of dict. This approach does affect other interactions with failure_details_dict.

from collections import defaultdict

failure_details_dict = defaultdict(list)  # instead of {}
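
The sketch below shows one way a defaultdict can affect other interactions, assuming hypothetical lookups of test names that never failed (test9 and test10 are invented for this example). Reading a missing key with [] creates it, while .get() and `in` do not:

```python
from collections import defaultdict

failure_details_dict = defaultdict(list)
failure_details_dict["test1"].append(
    {"path": "/path/to/test1/log/", "reason": "failure_reason1"}
)

# Reading a missing key with [] silently inserts an empty list.
print(failure_details_dict["test9"])       # []
print("test9" in failure_details_dict)     # True

# .get() and `in` do not trigger the default factory.
print(failure_details_dict.get("test10"))  # None
print("test10" in failure_details_dict)    # False
```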

Example

I have refactored your code:

all_failures = [
    'test1/path/to/test1/log/failure_reason1',
    'test1/path/to/test1/log/failure_reason2',
    'test2/path/to/test2/log/failure_reason1',
    'test2/path/to/test2/log/failure_reason2',
    'test3/path/to/test3/log/failure_reason1',
    'test4/path/to/test4/log/failure_reason1',
]

failure_details_dict = {}

for failure in all_failures:
    key, *paths, reason = failure.split('/')
    failure_details_dict.setdefault(key, []).append({
        'path': f"/{'/'.join(paths)}/",
        'reason': reason,
    })

for key, value in failure_details_dict.items():
    print(key)
    print(value)
    print()

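Conclusion

If you want a simple change, use the dict.setdefault method.
If you access failure_details_dict in several places and want every access to fall back to a default value, use the collections.defaultdict class.

Extra

How can we modify the code so that the "path" key is stored only once, and multiple entries are created only for the "reason" key? In general, what is the best way to store the data in JSON format?

You can restructure the JSON like this: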

{
  "test1": {
    "path": "/path/to/test1/log/",
    "reason": [
      "failure_reason1",
      "failure_reason2"
    ]
  },
  "test2": {
    "path": "/path/to/test2/log/",
    "reason": [
      "failure_reason1",
      "failure_reason2"
    ]
  },
  "test3": {
    "path": "/path/to/test3/log/",
    "reason": [
      "failure_reason1"
    ]
  },
  "test4": {
    "path": "/path/to/test4/log/",
    "reason": [
      "failure_reason1"
    ]
  }
}

using this code:

all_failures = [
    'test1/path/to/test1/log/failure_reason1',
    'test1/path/to/test1/log/failure_reason2',
    'test2/path/to/test2/log/failure_reason1',
    'test2/path/to/test2/log/failure_reason2',
    'test3/path/to/test3/log/failure_reason1',
    'test4/path/to/test4/log/failure_reason1',
]

failure_details_dict = {}

for failure in all_failures:
    key, *paths, reason = failure.split('/')
    failure_details_dict.setdefault(key, {
        'path': f"/{'/'.join(paths)}/",
        'reason': [],
    })['reason'].append(reason)

for key, value in failure_details_dict.items():
    print(key)
    print(value)
    print()
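
To actually store the result as JSON, one option is the standard json module. The sketch below assumes a small hand-written dictionary, here called grouped (a hypothetical name), in the same shape as the structure above, and round-trips it through a string:

```python
import json

# Hypothetical sample in the same shape as the restructured data.
grouped = {
    "test1": {
        "path": "/path/to/test1/log/",
        "reason": ["failure_reason1", "failure_reason2"],
    },
}

text = json.dumps(grouped, indent=2)   # serialize to a JSON string
print(text)

restored = json.loads(text)            # parse it back into Python objects
print(restored == grouped)             # True
```

Note that this layout stores a single path per test, so it only fits data where all failures of a test share the same log directory.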

Answer 2

You can use a regular expression to extract the information from the failure log file names. This can be done simply as follows:

import re
import json

all_failures = [
            'test1/path/to/test1/log/failure_reason1',
            'test1/path/to/test1/log/failure_reason2',
            'test2/path/to/test2/log/failure_reason1',
            'test2/path/to/test2/log/failure_reason2',
            'test3/path/to/test3/log/failure_reason1',
            'test4/path/to/test4/log/failure_reason1',
        ]


info = dict()
for failure in all_failures:
    match = re.search(r"^(.*?)(/.*/)(.*)$", failure)

    details = dict()
    details["path"] = match.group(2)
    details["reason"] = match.group(3)

    if match.group(1) in info:
        info[match.group(1)].append(details)
    else:
        info[match.group(1)] = []
        info[match.group(1)].append(details)

print(json.dumps(info, indent=4))

Output:

{
    "test1": [
        {
            "path": "/path/to/test1/log/",
            "reason": "failure_reason1"
        },
        {
            "path": "/path/to/test1/log/",
            "reason": "failure_reason2"
        }
    ],
    "test2": [
        {
            "path": "/path/to/test2/log/",
            "reason": "failure_reason1"
        },
        {
            "path": "/path/to/test2/log/",
            "reason": "failure_reason2"
        }
    ],
    "test3": [
        {
            "path": "/path/to/test3/log/",
            "reason": "failure_reason1"
        }
    ],
    "test4": [
        {
            "path": "/path/to/test4/log/",
            "reason": "failure_reason1"
        }
    ]
}
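
One caveat with this approach: re.search returns None when a string does not match the pattern, so calling match.group on a malformed entry would raise AttributeError. A hedged variant that skips such entries might look like the following. The helper name parse_failures and the malformed sample entry are assumptions made for this sketch:

```python
import re

PATTERN = re.compile(r"^(.*?)(/.*/)(.*)$")

def parse_failures(failures):
    """Group failures by test name, skipping entries that do not match."""
    info = {}
    for failure in failures:
        match = PATTERN.search(failure)
        if match is None:      # e.g. an entry with fewer than two slashes
            continue
        test, path, reason = match.groups()
        info.setdefault(test, []).append({"path": path, "reason": reason})
    return info

sample = [
    "test1/path/to/test1/log/failure_reason1",
    "test1/path/to/test1/log/failure_reason2",
    "not_a_valid_entry",       # hypothetical malformed input
]
print(parse_failures(sample))
```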

How do I append new values to the same key in a dictionary of lists?
