python: retrieve part of JSON data without knowing the structure

Time: 2017-04-14 13:32:52

Tags: python json data-structures

I have this JSON:

 {u'spreadsheetId': u'19CugmHB1Ds6n1jBy4Zo4hk_k4sQsTmOFfccxRc2qo', 
    u'properties': {u'locale': u'en_US', u'timeZone': u'Asia/Hong_Kong',
    u'autoRecalc': u'ON_CHANGE', u'defaultFormat': {u'padding': {u'top': 2, u'right': 3, u'left': 3, u'bottom': 2}, u'textFormat': {u'foregroundColor': {}, u'bold': False, u'strikethrough': False, u'fontFamily': u'arial,sans,sans-serif', u'fontSize': 10, u'italic': False, u'underline': False}, u'verticalAlignment': u'BOTTOM', u'backgroundColor': {u'blue': 1, u'green': 1, u'red': 1}, u'wrapStrategy': u'OVERFLOW_CELL'}, u'title': u'test pygsheets API V4'}, u'sheets': [{u'properties': {u'sheetType': u'GRID', u'index': 0, u'sheetId': 0, u'gridProperties': {u'columnCount': 26, u'rowCount': 1000}, u'title': u'IO'}}, {u'basicFilter': {u'range': {u'endRowIndex': 978, u'startRowIndex': 2, u'sheetId': 1704577069, u'startColumnIndex': 1, u'endColumnIndex': 9}, u'sortSpecs': [{u'sortOrder': u'ASCENDING', u'dimensionIndex': 1}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 4}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 5}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 8}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 3}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 7}, {u'sortOrder': u'ASCENDING', u'dimensionIndex': 2}]}, u'properties': {u'sheetType': u'GRID', u'index': 1, u'title': u'books', u'gridProperties': {u'columnCount': 22, u'rowCount': 978, u'frozenColumnCount': 3, u'hideGridlines': True, u'frozenRowCount': 3}, u'tabColor': {u'blue': 1}, u'sheetId': 1704577069}}], u'spreadsheetUrl': u'https://docs.google.com/spreadsheets/d/1CugmHB1Ds6n1jBy4Zo4hk_k4sQsTmOFfccxRc2qo/edit'}

How can I get only the title of each entry under sheets from this JSON? I want something like:

Input: results.get('title')

Output: ['IO','books']

Because of the nested structure, I'm not sure how to approach this. It reminds me of an HTML node tree. So do I need some kind of search function?

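Is there a way to reach the title nodes without looking at the structure, something like an XPath-style search? I have used BeautifulSoup before, where you may not know the structure and still pull out parts of the data by searching.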

2 answers:

Answer 0 (score: 3)

This will give the output you want:

print([x['properties'].get('title') for x in results['sheets']])

Returns: [u'IO', u'books']
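
As a self-contained sketch of the same idea, the snippet below uses an abbreviated stand-in for the dict in the question (the `results` literal is trimmed for illustration and is not the full API response). The `sheet_titles` helper name and the choice to skip sheets that lack a `properties` or `title` key are assumptions, not part of the original answer.

```python
# Abbreviated stand-in for the dict shown in the question (illustrative only).
results = {
    'properties': {'title': 'test pygsheets API V4'},
    'sheets': [
        {'properties': {'sheetId': 0, 'title': 'IO'}},
        {'properties': {'sheetId': 1704577069, 'title': 'books'}},
    ],
}


def sheet_titles(spreadsheet):
    """Return the title of every sheet, skipping entries that have none."""
    titles = []
    for sheet in spreadsheet.get('sheets', []):
        # .get with a default avoids a KeyError if 'properties' is missing.
        title = sheet.get('properties', {}).get('title')
        if title is not None:
            titles.append(title)
    return titles


print(sheet_titles(results))
```

This only works when you already know that titles live under `sheets` and then `properties`; it does not search an unknown structure.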

Answer 1 (score: 1)

This should work:

a = results  # the dict from the question (your parsed JSON)
print(a['properties']['title'])  # prints 'test pygsheets API V4'
print(a['sheets'][0]['properties']['title'])  # prints 'IO'
print(a['sheets'][1]['properties']['title'])  # prints 'books'

Edit: for an unknown structure:

def find_in_obj(obj, condition, path=None):

    if path is None:
        path = []

    # In case this is a list
    if isinstance(obj, list):
        for index, value in enumerate(obj):
            new_path = list(path)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

    # In case this is a dictionary
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_path = list(path)
            for result in find_in_obj(value, condition, path=new_path):
                yield result

            if condition == key:
                new_path = list(path)
                new_path.append(value)
                yield new_path

titles = list(find_in_obj(a, 'title'))
print(titles)  # prints [['test pygsheets API V4'], ['IO'], ['books']]

Adapted from hexerei software's solution to: Find all occurrences of a key in nested python dictionaries and lists
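
Note that the generator above also returns the spreadsheet's own title, while the question asks only for the sheet titles. One option is to run the search over the `sheets` subtree alone. The following is a compact sketch of that, assuming Python 3.3+ (for `yield from`); `find_values` is a hypothetical helper name and the dict `a` is an abbreviated stand-in for the one in the question.

```python
def find_values(obj, key):
    """Recursively yield every value stored under `key` in nested dicts/lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # Keep descending: the value itself may hold further matches.
            yield from find_values(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_values(item, key)


# Abbreviated stand-in for the dict shown in the question (illustrative only).
a = {
    'properties': {'locale': 'en_US', 'title': 'test pygsheets API V4'},
    'sheets': [
        {'properties': {'sheetId': 0, 'title': 'IO'}},
        {'basicFilter': {'sortSpecs': [{'dimensionIndex': 1}]},
         'properties': {'sheetId': 1704577069, 'title': 'books'}},
    ],
}

# Searching only the 'sheets' subtree leaves out the spreadsheet-level title.
print(list(find_values(a['sheets'], 'title')))

# Searching the whole object returns every 'title', wherever it is nested.
print(list(find_values(a, 'title')))
```

Restricting the search to a subtree still requires knowing one top-level key (`sheets`), but nothing about the nesting below it.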