How do I get all the keys of a JSON document using pymongo?

Time: 2015-05-06 08:30:57

Tags: python json pymongo

I have a document in the following format:

{
    "glossary": {
        "title": "example glossary",
        "GlossDiv": {
            "title": "S",
            "GlossList": {
                "GlossEntry": {
                    "ID": "SGML",
                    "SortAs": "SGML",
                    "GlossTerm": "Standard Generalized Markup Language",
                    "Acronym": "SGML",
                    "Abbrev": "ISO 8879:1986",
                    "GlossDef": {
                        "para": "A meta-markup language, used to create markup languages such as DocBook.",
                        "GlossSeeAlso": ["GML", "XML"]
                    },
                    "GlossSee": "markup"
                }
            }
        }
    }
}

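I need to get all the keys, including the nested ones, but I can only get the first-level key, i.e. glossary.

Can anyone tell me whether there is a way to retrieve all the keys?

2 Answers:

Answer 0: (score: 2)

You can use a recursive function that descends into each level and prints the keys.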

def recurse_keys(document):
    # Print every key, descending into nested dictionaries.
    for key in document.keys():
        print(str(key))
        if isinstance(document[key], dict):
            recurse_keys(document[key])

Update: for the nested format

def recurse_keys(document, parent=""):
    # Print each key as a dotted path, e.g. glossary.GlossDiv.title.
    for key in document.keys():
        # Prefix the key with its parent path, if there is one.
        path = parent + '.' + str(key) if parent != "" else str(key)
        print(path)
        if isinstance(document[key], dict):
            recurse_keys(document[key], path)
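
If the paths are needed as data rather than printed output, the same recursion can be written as a generator. This is only a sketch: the name iter_key_paths is made up for illustration, and it assumes that dictionaries inside lists should be reported under the list's own path, without an index.

```python
def iter_key_paths(document, parent=""):
    """Yield the dotted path of every key in a nested document."""
    if isinstance(document, dict):
        for key, value in document.items():
            path = f"{parent}.{key}" if parent else str(key)
            yield path
            # Recurse into the value; scalar values yield nothing.
            yield from iter_key_paths(value, path)
    elif isinstance(document, list):
        # Assumption: list items share the parent's path (no index is added).
        for item in document:
            yield from iter_key_paths(item, parent)


doc = {"glossary": {"title": "example glossary", "GlossDiv": {"title": "S"}}}
paths = list(iter_key_paths(doc))
print(paths)
# ['glossary', 'glossary.title', 'glossary.GlossDiv', 'glossary.GlossDiv.title']
```

A document returned by pymongo is an ordinary dict, so it should be possible to pass it to this function unchanged.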

Answer 1: (score: 1)

from re import findall
input_dict = {
"glossary": {
    "title": "example glossary",
    "GlossDiv": {
        "title": "S",
        "GlossList": {
            "GlossEntry": {
                "ID": "SGML",
                "SortAs": "SGML",
                "GlossTerm": "Standard Generalized Markup Language",
                "Acronym": "SGML",
                "Abbrev": "ISO 8879:1986",
                "GlossDef": {
                    "para": "A meta-markup language, used to create markup languages such as DocBook.",
                    "GlossSeeAlso": ["GML", "XML"]
                },
                "GlossSee": "markup"
            }
        }
    }
}
}
# Match every quoted key that is followed by a colon in the dict's string form.
dict1 = str(input_dict)
pattern = r"'([A-Za-z0-9_\./\\-]*)':"
m = findall(pattern, dict1)
print(m)

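m is (the order shown comes from Python 2's unordered dict; newer Python versions keep insertion order): ['glossary', 'GlossDiv', 'GlossList', 'GlossEntry', 'GlossDef', 'GlossSeeAlso', 'para', 'GlossSee', 'Acronym', 'GlossTerm', 'Abbrev', 'SortAs', 'ID', 'title', 'title']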

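Note that this works fine if you only want the flat list of keys. If you want them in nested form, the recursive approach is the better choice. I will post recursive code once it is done.

If you see any improvements to my current solution, please suggest them.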
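
One possible improvement, offered as a sketch rather than as this answer's method: walk the dictionary directly instead of matching a regex against its string form. A regex over str(input_dict) can be confused by string values that happen to look like a quoted key followed by a colon, and it also reports repeated names such as title twice. The function name collect_keys and the note field in the sample are hypothetical.

```python
def collect_keys(document, found=None):
    """Return the set of all key names found anywhere in a nested document."""
    if found is None:
        found = set()
    if isinstance(document, dict):
        for key, value in document.items():
            found.add(str(key))
            collect_keys(value, found)
    elif isinstance(document, list):
        # Also look inside lists, in case they contain dictionaries.
        for item in document:
            collect_keys(item, found)
    return found


# "note" is a made-up field whose value looks like a quoted key.
sample = {"glossary": {"title": "example glossary",
                       "GlossDiv": {"title": "S", "note": "'fake':"}}}
print(sorted(collect_keys(sample)))
# ['GlossDiv', 'glossary', 'note', 'title']
```

Because the result is a set, each key name appears once, and only real dictionary keys are collected.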