Extract fields and their paths from a JSON object using Python

Time: 2019-06-14 17:31:01

Tags: python json elasticsearch indexing

I have a JSON object obtained from an Elasticsearch index mapping. I want to group the index fields by their type, based on that JSON object.

https://gist.github.com/akthodu/47404880d2e5b6480a881214d41feb58

Long fields:


Text fields:

act.sdeactLongDescription.properties.id.type
act.properties.actstate.type

When I iterate over the output of json.loads, I get string objects. Is there a JSON library that can extract inner elements, the way bs4 does?
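A likely explanation for the string objects, assuming the loop iterates directly over the parsed dict: iterating a Python dict yields its keys, which are strings. The small mapping below is a made-up stand-in for the gist's content, not the real mapping.

```python
import json

# Hypothetical, shortened mapping; the real one lives in the gist above.
raw = '{"act": {"properties": {"actstate": {"type": "text"}}}}'
mapping = json.loads(raw)

# Iterating a dict directly yields only its keys (strings).
keys = [k for k in mapping]

# .items() yields (key, value) pairs, so nested dicts stay reachable.
pairs = [(k, type(v).__name__) for k, v in mapping.items()]

print(keys)
print(pairs)
```

No extra library is needed for this; the standard dict methods are enough to walk the nested structure.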

1 Answer:

Answer 0: (score: 0)

You can write a recursive function that looks for the end of each chain of keys, like this:

import json

# Use a context manager so the file is closed after loading
with open('test.json') as f:
    test = json.load(f, strict=False)


# You could add the chain of keys to a list
def recursive_items(dictionary):
    for key, value in dictionary.items():
        if isinstance(value, dict):
            # Value becomes a sub-dictionary
            yield from recursive_items(value)
        else:
            # End of the chain
            yield (key, value)


if __name__ == "__main__":
    for key, value in recursive_items(test):
        print(key, value)

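# A regex might work too (this pattern could probably be improved a lot)

import re

# The regex runs on the raw JSON text, not on the parsed dict
with open('test.json') as f:
    string = f.read()

pattern = r'[^,]+"type"[^,]+'
matches = re.findall(pattern, string, re.DOTALL)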
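
To get the dotted paths the question asks for, grouped by field type, the recursive generator above can carry the chain of keys along as it descends. This is a minimal sketch: `iter_paths`, `group_by_type` and the sample mapping are illustrative names and data, not taken from the gist, and it assumes the values stored under `"type"` keys are strings.

```python
from collections import defaultdict


def iter_paths(node, prefix=()):
    """Yield (path_tuple, value) for every non-dict leaf in a nested dict."""
    for key, value in node.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend, remembering the keys seen so far
            yield from iter_paths(value, path)
        else:
            # End of the chain
            yield path, value


def group_by_type(mapping):
    """Group dotted paths ending in "type" by the string stored there."""
    groups = defaultdict(list)
    for path, value in iter_paths(mapping):
        if path[-1] == "type" and isinstance(value, str):
            groups[value].append(".".join(path))
    return dict(groups)


# Hypothetical sample shaped like the paths quoted in the question.
sample = {
    "act": {
        "properties": {
            "actstate": {"type": "text"},
            "actid": {"type": "long"},
        }
    }
}

grouped = group_by_type(sample)
print(grouped)
```

With the real mapping, `group_by_type(test)` would be called on the dict loaded from the file. Unlike the regex approach, this keeps the full path to each field.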