Question

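I'm working with JSON-structured data and storing it in a Python dict named output. I know I can normally use .get('value') to look up a value. What I'm unsure about is how to use .get() on a part of the data (a list) that is not always populated.

My output: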

{
    "entities": [
        {
            "end": 3,
            "entity": "pet",
            "extractor": "ner_crf",
            "processors": [
                "ner_synonyms"
            ],
            "start": 0,
            "value": "Pet"
        },
        {
            "end": 8,
            "entity": "aquatic_facility",
            "extractor": "ner_crf",
            "start": 4,
            "value": "pool"
        },
        {
            "end": 14,
            "entity": "toiletries",
            "extractor": "ner_crf",
            "start": 9,
            "value": "razor"
        }
    ],
    "intent": {
        "confidence": 0.9765,
        "name": "test_intent"
    }
}

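I'm trying to write a statement that collects every value (Pet, pool, and razor) into a single object. It is also possible that entities is not populated and only intent is present.

In that case, the output might just be: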

{
    "entities": [],
    "intent": {
        "confidence": 0.9765,
        "name": "test_intent"
    }
}

What is the best way to do this?

Answer 1

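If I understand correctly, you want to extract all the values from that dictionary into a single object. That is as simple as a list comprehension, for example: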

# `output` is the dict from the question
obj = [v["value"] for v in output.get("entities", [])]
print(obj)

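If the "entities" key does not exist in the dictionary, the line above returns an empty list. For the sample data in the question, you'll get: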

['Pet', 'pool', 'razor']
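
To illustrate the fallback behaviour described above, here is a minimal sketch. The helper name entity_values and the two sample dicts are assumptions made for illustration; they do not appear in the original answer.

```python
def entity_values(data):
    """Collect each entity's "value" (hypothetical helper, for illustration only)."""
    # .get("entities", []) falls back to an empty list when the key is absent,
    # so the comprehension simply iterates over nothing.
    return [v["value"] for v in data.get("entities", [])]


# Hypothetical inputs: an empty "entities" list, and no "entities" key at all.
empty_entities = {"entities": [], "intent": {"confidence": 0.9765, "name": "test_intent"}}
missing_entities = {"intent": {"confidence": 0.9765, "name": "test_intent"}}

print(entity_values(empty_entities))    # []
print(entity_values(missing_entities))  # []
```

Note that the default only applies when the key is absent. If "entities" is present but set to None, .get() returns None and the comprehension raises a TypeError.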

Answer 2

If each entity dict is not guaranteed to contain a value, you can use the following.

output = {
    "entities": [
        {
            "end": 3,
            "entity": "pet",
            "extractor": "ner_crf",
            "processors": [
                "ner_synonyms"
            ],
            "start": 0,
            "value": "Pet"
        },
        {
            "end": 8,
            "entity": "aquatic_facility",
            "extractor": "ner_crf",
            "start": 4,
            "value": "pool"
        },
        {
            "end": 14,
            "entity": "toiletries",
            "extractor": "ner_crf",
            "start": 9,
            "value": "razor"
        },
        {
            "end": 14,
            "entity": "toiletries",
            "extractor": "ner_crf",
            "start": 9,
        }
    ],
    "intent": {
        "confidence": 0.9765,
        "name": "test_intent"
    }
}


values = [a.get('value') for a in output.get('entities', []) if 'value' in a]

print(values)
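
For the sample data above, the filter skips the fourth entity, which has no "value" key. The sketch below uses a trimmed-down stand-in for that dict (an assumption, to keep the example short) and also shows one possible variant that keeps a None placeholder instead of skipping the entry; the name values_with_gaps is hypothetical.

```python
# Trimmed-down stand-in for the `output` dict above; the last entity has no "value".
output = {
    "entities": [
        {"entity": "pet", "value": "Pet"},
        {"entity": "aquatic_facility", "value": "pool"},
        {"entity": "toiletries", "value": "razor"},
        {"entity": "toiletries"},
    ],
    "intent": {"confidence": 0.9765, "name": "test_intent"},
}

# Skip entities that lack a "value" key (same filter as the comprehension above).
values = [a["value"] for a in output.get("entities", []) if "value" in a]
print(values)  # ['Pet', 'pool', 'razor']

# Variant: keep one slot per entity, with None marking a missing value.
values_with_gaps = [a.get("value") for a in output.get("entities", [])]
print(values_with_gaps)  # ['Pet', 'pool', 'razor', None]
```

Which form is preferable probably depends on whether the positions in the result need to line up with the original entity list.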

List comprehension with a Python dict
