我写了一个代码来获取网络链接。运行此代码大约需要2:20分钟,这是很多时间,因为它只是代码中的一个函数。我想提高效率。我曾考虑过多线程,但在深入了解并将其应用于此代码方面遇到困难
def get_manufacturer():
manufacturers = requests.get("https://www.gsmarena.com/")
res = re.findall(r"<li><a href=\"samsung-phones-9.php\">.+\n", manufacturers.text)
manufacturer_links = re.findall(r"<li><a href=\"(.+?)\">", res[0])
final_list = []
for i in range(len(manufacturer_links)):
final_list.append("https://www.gsmarena.com/" + manufacturer_links[i])
# find pages
for i in final_list:
req = requests.get(i)
res2 = re.findall(r"<strong>1</strong>(.+)</div>", req.text)
for k in res2:
if k is not None:
pages = re.findall(r"<a href=\"(.+?)\">.<\/a>", res2[0])
for j in range(len(pages)):
final_list.append("https://www.gsmarena.com/" + pages[j])
return final_list
答案 0 :(得分:0)
您可以在以下示例中并行运行for循环
import multiprocessing as mul
def calcIntOfnth(i,ppStr,c,znot):
pool = mul.Pool(mul.cpu_count())
results = pool.starmap(calcIntOfnth, [(i,ppStr,c,znot) for i in range(k)]) # other parameters are local to this statement i.e. ppStr,c,znot,k
pool.close()
您将需要重新编写一个for循环函数,并使用Pool
对象或其他类似方式并行运行它。