Python multiprocessing slower than the normal implementation when downloading files

Asked: 2018-07-02 14:03:42

Tags: python python-3.x

I am trying to download some random articles (say 5000) listed in an index file. I tried spawning the work with multiprocessing, but the multiprocessing version runs more than twice as slow as the regular one. Why is that?

Attached code:

import time
import bs4
import requests
import os
import urllib.parse
import re
import random
from multiprocessing import Process
from functools import partial
import multiprocessing as mp

list_of_files = os.listdir("..\\sample_set")

# Read index file
index = open("C:\\Users\\useradmin\\Desktop\\index.txt", "r", encoding="utf-8")
lines = index.readlines()

# Pick 5000 random line numbers and pull the article title out of each index line
my_randoms = random.sample(range(1, 18458000), 5000)
titles = []
for num in my_randoms:
    titles.append(lines[num].split(":")[2].strip("\n"))

start_time = time.clock()

from multiprocessing import Pool

def job(title):
    # Only download titles that are not already in the local sample set
    if not any(title in file for file in list_of_files):
        # Download file using requests.get() (download body omitted in the original post)
        pass

if __name__ == '__main__':
    pool = Pool()
    pool.map(job, titles)
    print(time.clock() - start_time, "seconds")
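
For comparison, below is a minimal thread-pool sketch of the same loop. It is not the poster's code: the `BASE_URL`, the URL construction, and the output path are hypothetical, since the post omits the actual download step. Because downloading is I/O-bound, threads release the GIL while waiting on the network, and they share `lines`, `list_of_files`, and `titles` instead of re-running the module-level setup in every worker process.

import time
import requests
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "http://example.com/articles/"  # hypothetical: the real URL scheme is not shown in the post

def download(title):
    # Skip titles that are already present in the local sample set
    if not any(title in file for file in list_of_files):
        response = requests.get(BASE_URL + title)  # hypothetical URL construction
        with open("..\\sample_set\\" + title + ".html", "wb") as f:
            f.write(response.content)

if __name__ == '__main__':
    start = time.perf_counter()
    # Threads share memory with the main process, so nothing is pickled per task
    with ThreadPoolExecutor(max_workers=16) as executor:
        executor.map(download, titles)
    print(time.perf_counter() - start, "seconds")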

Thanks!

0 Answers:

No answers yet.