Question

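I have a list of several dictionaries (stored as JSON). I also have a list of values, and for each value I want the JSON object that contains it. For example: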

[{'content_type': 'Press Release',
  'content_id': '1',
  'Author': 'John'},
 {'content_type': 'editorial',
  'content_id': '2',
  'Author': 'Harry'},
 {'content_type': 'Article',
  'content_id': '3',
  'Author': 'Paul'}]

I want to get the complete object whose author is Paul. This is the code I have so far.

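import json

newJson = "testJsonNewInput.json"
ListForNewJson = []

def testComparision(newJson, oldJson):
    # Load the list of dictionaries from the JSON file.
    with open(newJson, mode='r') as fp_n:
        json_data_new = json.load(fp_n)
    # Collect every author name (the key in the data is 'Author', capitalised).
    for jData_new in json_data_new:
        ListForNewJson.append(jData_new['Author'])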

Please ask if you need any further information.

Answer 1

Case 1
One-time access

Reading your data and iterating over it is perfectly fine; just return the first match you find.

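import json

def access(file, author):
    # Load the list of dictionaries from the JSON file.
    with open(file) as f:
        data = json.load(f)

    # Return the first object whose 'Author' matches.
    for d in data:
        if d['Author'] == author:
            return d
    return 'Not Found'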
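
If the data is already in memory, the same first-match lookup can be sketched with next() and a generator expression. This is only an illustration on the sample list from the question; the name find_first is hypothetical, and it returns None instead of 'Not Found' when nothing matches.

```python
# Sample data from the question, with the author names quoted.
data = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

def find_first(records, author):
    # next() stops at the first matching dict; the second argument is
    # the default returned when no record matches.
    return next((d for d in records if d.get('Author') == author), None)

print(find_first(data, 'Paul'))
```

Using d.get('Author') rather than d['Author'] means records without an 'Author' key are skipped instead of raising a KeyError.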

Case 2
Repeated access

In this case it is wise to reshape the data so that looking up an object by author name is much faster (think dictionaries!).

For example, one possible option is:

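# 'file' is the path to your JSON file.
with open(file) as f:
    data = json.load(f)

# Map each author name to its object for constant-time lookups.
newData = {}
for d in data:
    newData[d['Author']] = d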
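
One caveat with a plain dict keyed by author: if two objects share an author, the later one overwrites the earlier one. If that can happen in your data, a possible variation is to keep a list per author. The sketch below assumes a hypothetical fourth record so that 'Paul' appears twice; by_author is just an illustrative name.

```python
from collections import defaultdict

data = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
    # Hypothetical extra record, added only to show a repeated author.
    {'content_type': 'Article', 'content_id': '4', 'Author': 'Paul'},
]

# Each author maps to a list of all objects written by that author.
by_author = defaultdict(list)
for d in data:
    by_author[d['Author']].append(d)

print(by_author['Paul'])
```

Use by_author.get(name, []) for lookups if you do not want missing names to be added to the dictionary as empty lists.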

Now define a function and pass it the preloaded data together with a list of author names.

def access(myData, author_list):
    for a in author_list:
        yield myData.get(a)

The function is called as follows:

for i in access(newData, ['Paul', 'John', ...]):
    print(i)

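Alternatively, store the results in a list r. The list(...) call is necessary because a function that uses yield returns a generator object, which you have to consume by iterating over it.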

r = list(access(newData, [...]))
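
Putting the pieces of Case 2 together, here is a small self-contained run on the sample data from the question (loaded inline rather than from a file, which is an assumption made for the example). It also shows what happens for an author that is not in the data: myData.get(a) yields None for that name.

```python
data = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

# Same reshaping as above, written as a dict comprehension.
newData = {d['Author']: d for d in data}

def access(myData, author_list):
    # Lazily yield the object for each requested author (None if unknown).
    for a in author_list:
        yield myData.get(a)

# 'Ringo' is not in the sample data, so the second result is None.
r = list(access(newData, ['Paul', 'Ringo']))
print(r)
```

If you would rather drop unknown authors than receive None, filter them out, for example with [x for x in r if x is not None].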

Answer 2

Why not do something like this? It should be fast, and you don't have to index the authors that are never searched for.

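# Cache of authors that have already been looked up.
alreadyknown = {}
list_of_obj = [{'content_type': 'Press Release',
    'content_id': '1',
    'Author': 'John'},
    {'content_type': 'editorial',
    'content_id': '2',
    'Author': 'Harry'
    },
    {'content_type': 'Article',
    'content_id': '3',
    'Author': 'Paul'}]

def func(author):
    # Scan the list only the first time an author is requested.
    if author not in alreadyknown:
        obj = get_obj(author)
        alreadyknown[author] = obj
    return alreadyknown[author]

def get_obj(auth):
    # Compare with ==, not 'is': 'is' tests object identity, not string equality.
    # Returns a list of all matching objects (empty if there are none).
    return [obj for obj in list_of_obj if obj['Author'] == auth]

print(func('Paul'))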
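
The hand-written cache above can also be expressed with functools.lru_cache from the standard library. This is a sketch of an alternative, not a change to the approach: get_objs is a hypothetical name, and it returns a tuple so that the cached result cannot be modified by accident.

```python
from functools import lru_cache

list_of_obj = [
    {'content_type': 'Press Release', 'content_id': '1', 'Author': 'John'},
    {'content_type': 'editorial', 'content_id': '2', 'Author': 'Harry'},
    {'content_type': 'Article', 'content_id': '3', 'Author': 'Paul'},
]

@lru_cache(maxsize=None)
def get_objs(author):
    # The list is scanned once per distinct author; repeated calls with
    # the same name are answered from the cache.
    return tuple(obj for obj in list_of_obj if obj['Author'] == author)

print(get_objs('Paul'))
print(get_objs.cache_info())
```

Note that the cache is not refreshed if list_of_obj changes later; call get_objs.cache_clear() after modifying the data.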

How do I select a specific JSON object with a particular value?
