Question

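Because of a database limitation, I need to incorporate threading into my code. My problem is that I have a list of dictionaries (about 850 elements) and a list of elements (the same length), and I can only query 50 at a time. So I use a generator to split the lists into chunks of 50.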

def list_split(ls):
    n = 50
    for i in range(0, len(ls), n):
        yield ls[i:i + n]
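
As a quick sanity check of the chunking, here is a minimal sketch that runs the same generator over placeholder data. The `sample_ids` list and the `n=50` default argument are assumptions for illustration, not part of the original code. Note that a generator can only be consumed once, so it has to be recreated for each pass.

```python
def list_split(ls, n=50):
    """Yield successive slices of ls, each at most n items long."""
    for i in range(0, len(ls), n):
        yield ls[i:i + n]


# Hypothetical stand-in data: 850 placeholder IDs.
sample_ids = [f"ID{k:03d}" for k in range(850)]

# Collect the size of every chunk the generator produces.
chunk_sizes = [len(chunk) for chunk in list_split(sample_ids)]
print(len(chunk_sizes), set(chunk_sizes))  # 17 {50}
```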

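I then pass both lists to a function that essentially adds each one to a new dictionary. The value of each dictionary entry will be the query result, and each query takes about 2 seconds.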

def query(ls1, ls2):
    count = 0
    query_return_dict = {}

    # ls1 and ls2 are iterables of chunks; pair the chunks up, then pair their items.
    for i, j in zip(ls2, ls1):
        for key, value in zip(i, j):
            # ret = token.query(j) replace 'value' with 'ret' once ready to run
            query_return_dict[key] = value
            count += 1

    print(query_return_dict)
    return query_return_dict

Then I call them:

ls1 = list_split(unchunked_ls1)
ls2 = list_split(unchunked_ls2)

Now here is the part I don't understand, the "single" thread for this block of code:

import threading


def main():
    thread = threading.Thread(target=query, args=(ls1, ls2))
    thread.start()

    thread.join()

if __name__ == '__main__':
    main()

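I'm learning threading from this site, but I don't know whether the code does what I want, and I'm hesitant to actually run it against the database because of the risk of backing it up by flooding it with queries.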

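TL;DR:

I need to make sure that query(ls1, ls2) only starts running again after the 50 queries from ls1 (the list of dictionaries) have returned and been added to query_return_dict. Then it can run the next chunk of 50, and so on until every element in the query list has been queried.

Also:

If there is a better way to do this than threading, that would be great too!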

As requested, the two lists are formatted as shown below. Keep in mind that there are about 850 of each:

ls1 = ['34KWR','23SDG','903SD','256DF','41SDA','42DFS',...] <- len 850
ls2 = [{"ity": {"type": "IDE", "butes": [{"ity": {"id": "abc34"}}], "limit": 20}}, ...] <- len 850

Answer 1

It's simpler if you zip first, then chunk. Also, let islice get one chunk at a time.

from itertools import islice


pairs = zip(unchunked_ls1, unchunked_ls2)

# Get the next 50 elements of pairs and return as a list.
# Because pairs is an iterator, not a list, the return value
# of islice changes each time you call it.
def get_query():
    return list(islice(pairs, 50))

# Repeatedly call get_query until it returns an empty list
for query in iter(get_query, []):
    # do your query
    ...
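
If the queries inside one batch should run concurrently while each batch still finishes before the next begins, one option is `concurrent.futures.ThreadPoolExecutor` from the standard library. The following is a rough sketch under stated assumptions, not a definitive implementation: `run_query` is a hypothetical stand-in for the real database call, `query_in_batches` is an illustrative name, and using 50 worker threads assumes the database tolerates 50 queries in flight at once, which the question does not confirm.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def run_query(payload):
    # Hypothetical stand-in for the real database call (e.g. token.query).
    return {"echo": payload}


def query_in_batches(keys, payloads, batch_size=50):
    """Run queries batch by batch; a batch must finish before the next starts."""
    pairs = zip(keys, payloads)
    results = {}
    # Assumption: batch_size concurrent queries is within the database limit.
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        # iter(callable, sentinel) stops once a batch comes back empty.
        for batch in iter(lambda: list(islice(pairs, batch_size)), []):
            batch_keys = [key for key, _ in batch]
            batch_payloads = [payload for _, payload in batch]
            # list() consumes every result of pool.map, so this line blocks
            # until all queries in the current batch have returned.
            answers = list(pool.map(run_query, batch_payloads))
            results.update(zip(batch_keys, answers))
    return results


# Small demonstration with placeholder data (120 items, so 3 batches).
keys = [f"ID{k:03d}" for k in range(120)]
payloads = [{"limit": k} for k in range(120)]
out = query_in_batches(keys, payloads)
print(len(out))
```

Because `pool.map` yields results in input order, each key is matched with the answer to its own payload. Unlike a bare `threading.Thread`, which discards the return value of its target, the executor hands the results back to the caller.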
