Grabbing values from a nested JSON dictionary

Time: 2019-11-13 17:38:23

Tags: json regex python-3.x dictionary

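I have data like the example below. It is a dict whose keys map to lists, and those lists in turn contain dicts. I want to create a list of the "Id" values pulled from the data. I have been writing loops to extract the values and then filtering by key and value. I was wondering whether there is a simpler way, such as using regex pattern matching to capture everything that matches the pattern "u'Id': u'integer'". Does anyone see a simpler approach, or can anyone suggest code to grab the "Id" values from the nested dictionary below?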

Data:

{u'distinct': [{u'__class__': u'tuple',
   u'__value__': [{u'Id': u'9624',
     u'classification': u'i',
     u'storeid': u'86'},
    {u'Id': u'41822',
     u'classification': u's/i',
     u'storeid': u'86'}]}],
 u'match': [{u'__class__': u'tuple',
   u'__value__': [{u'Id': u'38916',
     u'classification': u'c',
     u'storeid': u'125'},
    {u'Id': u'49462',
     u'classification': u'n/a',
     u'storeid': u'124'}]},
    {u'Id': u'46525',
     u'classification': u'h',
     u'storeid': u'158'}]}]}

1 answer:

Answer 0 (score: 0)

I think you have added two extra ]} to your dictionary. If that is the case, you can loop through it and see where the Ids you need are:

d = {
    u'distinct': [{u'__class__': u'tuple', u'__value__': [
        {u'Id': u'9624', u'classification': u'i', u'storeid': u'86'},
        {u'Id': u'41822', u'classification': u's/i', u'storeid': u'86'}]}],
    u'match': [{u'__class__': u'tuple', u'__value__': [
        {u'Id': u'38916', u'classification': u'c', u'storeid': u'125'},
        {u'Id': u'49462', u'classification': u'n/a', u'storeid': u'124'}]},
        {u'Id': u'46525', u'classification': u'h', u'storeid': u'158'}]}
output = []
# Only the first element of each top-level list is inspected, so Id 46525 is not collected.
for groups in d.values():
    for record in groups[0]['__value__']:
        output.append(int(record['Id']))

print(output)

Output

[9624, 41822, 38916, 49462]
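
Note that the loop above reads only the first element of each top-level list, so the record with Id 46525 is left out. If every "Id" is wanted regardless of nesting depth, a recursive walk is one option. The following is a minimal sketch, not a definitive implementation: the helper name `iter_ids` is hypothetical, and the data is assumed to be the corrected dictionary from this answer. The last lines also try the regex idea from the question, on the assumption that working from the text form of the data is acceptable.

```python
import re


def iter_ids(node, key="Id"):
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            else:
                yield from iter_ids(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_ids(item, key)
    # Any other type (str, int, ...) is a leaf and yields nothing.


# Assumed data: the corrected dictionary used in the answer above.
d = {
    'distinct': [{'__class__': 'tuple', '__value__': [
        {'Id': '9624', 'classification': 'i', 'storeid': '86'},
        {'Id': '41822', 'classification': 's/i', 'storeid': '86'}]}],
    'match': [{'__class__': 'tuple', '__value__': [
        {'Id': '38916', 'classification': 'c', 'storeid': '125'},
        {'Id': '49462', 'classification': 'n/a', 'storeid': '124'}]},
        {'Id': '46525', 'classification': 'h', 'storeid': '158'}]}

ids = [int(v) for v in iter_ids(d)]
print(ids)  # [9624, 41822, 38916, 49462, 46525]

# Regex variant: match 'Id': '<digits>' in the text form, with an optional u prefix.
ids_from_text = [int(m) for m in re.findall(r"u?'Id':\s*u?'(\d+)'", repr(d))]
print(ids_from_text)  # same list as above
```

The recursive version does not depend on how the data happens to be printed, so it is probably the safer of the two. The regex version is shorter but will break if the text layout changes, for example if the data is serialized as JSON with double quotes.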