Python multiprocessing seems to be affecting the results of my program

Asked: 2019-01-05 02:47:47

Tags: python-3.x multiprocessing

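I have a program that is supposed to visit each URL in a given set and download an image. The original program ran slowly, so I added multiprocessing to speed it up. But now the new program no longer downloads the same images as the original; it seems to be skipping some URLs. Could this be related to multiprocessing? What happens if two processes try to save a photo to my computer at the same time? Could that cause a problem and result in one of them being dropped?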

Original program without multiprocessing:


New program with multiprocessing:

def accessAndSaveFiles(urlSet, user, verboseFlag):
    for url in urlSet:
        ...
        img_data = requests.get(url, allow_redirects=True)
        open(filePath, 'wb').write(img_data.content)

def main():
    ...
    accessAndSaveFiles(urlSet, user, verboseFlag)
    ...

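Thanks for your help!

1 Answer:

Answer 0: (score: 1)

There isn't enough information to debug this, but you can debug it yourself by adding some print statements to see what is running in each worker. Example: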

import multiprocessing as mp
from itertools import repeat
import time

def accessAndSaveFiles(urlSet, user, verboseFlag):
    with mp.Pool() as pool:
        pool.starmap(processURL, zip(urlSet, repeat(user), repeat(verboseFlag)))

def processURL(url, user, verboseFlag):
    print(mp.current_process().name,url,user,verboseFlag)
    time.sleep(1) # Simulated work
    print(mp.current_process().name,'done')

def main():
    accessAndSaveFiles('abcdefghijklmnop', 'me', True)

if __name__ == '__main__':
    main()

Output:

SpawnPoolWorker-2 a me True
SpawnPoolWorker-4 b me True
SpawnPoolWorker-7 c me True
SpawnPoolWorker-1 d me True
SpawnPoolWorker-6 e me True
SpawnPoolWorker-3 f me True
SpawnPoolWorker-5 g me True
SpawnPoolWorker-8 h me True
SpawnPoolWorker-2 done
SpawnPoolWorker-2 i me True
SpawnPoolWorker-4 done
SpawnPoolWorker-4 j me True
SpawnPoolWorker-6 done
SpawnPoolWorker-7 done
SpawnPoolWorker-3 done
SpawnPoolWorker-1 done
SpawnPoolWorker-6 k me True
SpawnPoolWorker-7 l me True
SpawnPoolWorker-3 m me True
SpawnPoolWorker-1 n me True
SpawnPoolWorker-5 done
SpawnPoolWorker-5 o me True
SpawnPoolWorker-8 done
SpawnPoolWorker-8 p me True
SpawnPoolWorker-2 done
SpawnPoolWorker-4 done
SpawnPoolWorker-6 done
SpawnPoolWorker-1 done
SpawnPoolWorker-3 done
SpawnPoolWorker-7 done
SpawnPoolWorker-5 done
SpawnPoolWorker-8 done

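From this you can see that there are 8 workers in the pool, along with the three arguments passed for each job. Since there are 16 jobs, each worker picks up another job as soon as it finishes one of the first 8, until all of them are complete.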
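If the print statements show every URL being picked up and images are still missing, one possible next step is to have each worker return a status to the parent process, so that failed or missing URLs can be listed explicitly. The following is only a sketch under assumptions: `process_url` and `find_unsaved` are hypothetical names, the URLs are placeholders, and the real download is replaced by a stand-in that fails for any URL containing "bad" so the example runs offline.

```python
import multiprocessing as mp


def process_url(url):
    """Worker: report what happened instead of failing silently.

    Hypothetical stand-in for the real download; URLs containing
    'bad' simulate a failed request.
    """
    try:
        if "bad" in url:
            raise OSError("simulated download failure")
        return (url, True, mp.current_process().name)
    except Exception as exc:
        # Hand the error back to the parent rather than losing it in the worker.
        return (url, False, repr(exc))


def find_unsaved(urls, results):
    """Return, sorted, the URLs that were never reported as saved."""
    saved = {url for url, ok, _ in results if ok}
    return sorted(set(urls) - saved)


def main():
    # Placeholder URLs, for illustration only.
    urls = [
        "http://example.com/a.jpg",
        "http://example.com/bad.jpg",
        "http://example.com/c.jpg",
    ]
    with mp.Pool(2) as pool:
        results = pool.map(process_url, urls)
    for url, ok, detail in results:
        print("OK  " if ok else "FAIL", url, detail)
    print("Not saved:", find_unsaved(urls, results))


if __name__ == '__main__':
    main()
```

Comparing the input set against the URLs reported as saved shows whether anything was actually skipped or whether some downloads failed inside a worker without the failure being noticed.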