Question

I am using multiprocessing.Process in Python 2.7 to create processes, like this:

from multiprocessing import Process

p1 = Process(target=build_sql_s, args=(Scopes,))
p2 = Process(target=build_sql_o, args=(Orders,))
p1.start()
p2.start()
p1.join()
p2.join()

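Scopes and Orders are two list variables, and build_sql_s and build_sql_o are functions. When I run this program, p1 sits at about 70% CPU while p2's CPU and memory usage always stay at 0%. All of my work seems to be done in p1... Why? Shouldn't multiprocessing use different cores of the machine?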

Answer 1

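Which core a process runs on, and how processes are scheduled, is up to the operating system, not your code. Using threads does not mean you get to split the code you run across specific CPUs, and the same is true of processes created with the multiprocessing module.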
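One way to check what the OS is actually doing is to give both processes some CPU-bound work and have each report its PID. This is only a minimal sketch: burn and worker are made-up names, and the 2-second duration is arbitrary. It is written to run on Python 2.7 and Python 3.

```python
import os
import time
from multiprocessing import Process, Queue


def burn(label, seconds):
    """Busy-loop for about `seconds`; return (label, pid, iterations)."""
    deadline = time.time() + seconds
    iterations = 0
    while time.time() < deadline:
        iterations += 1
    return label, os.getpid(), iterations


def worker(label, seconds, out):
    # Runs in the child process and sends its report back to the parent.
    out.put(burn(label, seconds))


if __name__ == "__main__":
    out = Queue()
    procs = [Process(target=worker, args=(name, 2.0, out))
             for name in ("p1", "p2")]
    for p in procs:
        p.start()
    # Read the reports before joining so the children can flush the queue.
    reports = [out.get() for _ in procs]
    for p in procs:
        p.join()

    print("parent pid: {}".format(os.getpid()))
    for label, pid, iterations in sorted(reports):
        print("{} ran in pid {} and did {} iterations".format(
            label, pid, iterations))
```

While this runs, top or Task Manager should show two busy Python processes with different PIDs. Which cores they land on is the scheduler's decision. If one of your real processes shows 0%, it is probably finishing almost immediately or waiting on something, not being denied a core.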

Answer 2

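I haven't used explicit .start() and .join() calls, but I have had good results with multiprocessing.Pool(). Pool() sets up a pool of worker processes, by default one per processor core, and the .map() method then puts them to work. I suggest you give it a try.

Here is a simple program I just wrote to demonstrate the use of multiprocessing.Pool().map().

Tested under Python 2.7 and Python 3.3.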

import multiprocessing as mp
import time

def is_odd(n):
    return bool(n%2)

def hailstone(n):
    """
    Compute the Hailstone sequence, as described in:
    http://en.wikipedia.org/wiki/Collatz_conjecture
    """
    steps = 0
    while n > 1:
        steps += 1
        if is_odd(n):
            n = 3*n + 1
        else:
            n = n//2
    return steps


def slow(limit):
    steps = sum(hailstone(n) for n in range(limit))
    #print("n: {}  steps: {}".format(n, steps))
    return (limit, steps)

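LIMIT = 2000

# The __main__ guard keeps worker processes from re-running this block on
# platforms that start workers by re-importing the module (e.g. Windows).
if __name__ == "__main__":
    start = time.time()
    results0 = [slow(n) for n in range(LIMIT)]
    stop = time.time()
    elapsed0 = stop - start

    print("Single-threaded time: {:.2f} seconds".format(elapsed0))

    # note we are including time to set up the Pool() in the elapsed time
    start = time.time()
    p = mp.Pool()
    results1 = p.map(slow, range(LIMIT))
    stop = time.time()
    elapsed1 = stop - start

    # shut the worker processes down cleanly
    p.close()
    p.join()

    # map() already returns results in input order, so this sort is a no-op
    results1.sort()

    print("Multiprocessing time: {:.2f} seconds".format(elapsed1))

    assert results0 == results1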

When running with multiprocessing.Pool(), I see almost a 4x speedup. Tested on an AMD FX-8350 (which has four "Bulldozer modules" inside).
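The question has two different functions, not one function mapped over a range, so Pool.apply_async() may be a closer fit than .map(). Here is a rough sketch of that. The bodies of build_sql_s and build_sql_o below are hypothetical stand-ins (the question does not show the real ones), and the sample data is made up.

```python
import multiprocessing as mp


# Hypothetical stand-ins for the question's functions; the real
# build_sql_s and build_sql_o are not shown in the question.
def build_sql_s(scopes):
    return ["SELECT * FROM scopes WHERE id = {}".format(s) for s in scopes]


def build_sql_o(orders):
    return ["SELECT * FROM orders WHERE id = {}".format(o) for o in orders]


if __name__ == "__main__":
    scopes = list(range(5))          # made-up sample data
    orders = list(range(100, 105))   # made-up sample data

    pool = mp.Pool(processes=2)
    try:
        # apply_async() returns immediately, so both tasks can run
        # at the same time in different worker processes.
        res_s = pool.apply_async(build_sql_s, (scopes,))
        res_o = pool.apply_async(build_sql_o, (orders,))
        sql_s = res_s.get()   # blocks until the task has finished
        sql_o = res_o.get()
    finally:
        pool.close()
        pool.join()

    print("built {} scope queries and {} order queries".format(
        len(sql_s), len(sql_o)))
```

Unlike a bare Process, each result object hands the function's return value back to the parent through .get().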

Answer 3

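Hmm, I think I see your problem now.

You have to know that processes do not share the same memory. So when you create and start a new process and hand it a (really) large list, the list is copied first. That can take a while...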
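To get a feel for how expensive that copy is, you can time how long it takes to serialize a list of similar size. This is only a rough estimate. How the data reaches the child depends on the platform: on Unix with Python 2.7 the child is forked and inherits the parent's memory, while on Windows the arguments are pickled and sent to the new process, and Pool always pickles the arguments of each task. The function name pickling_cost and the list size are made up for this sketch.

```python
import pickle
import time


def pickling_cost(data):
    """Return (size_in_bytes, seconds) for serializing `data` once."""
    start = time.time()
    payload = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
    elapsed = time.time() - start
    return len(payload), elapsed


if __name__ == "__main__":
    # Made-up stand-in for a large list such as Scopes or Orders.
    big_list = list(range(5 * 1000 * 1000))
    size, seconds = pickling_cost(big_list)
    print("pickled size: {:.1f} MB, time: {:.2f} s".format(
        size / 1e6, seconds))
```

If this number is comparable to the time the worker spends on the data, the transfer itself is likely a large part of what you are seeing.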

One possibility, especially if your data (= the lists) is read-only, is to use shared memory. Have a look here: http://docs.python.org/3.3/library/multiprocessing.html#module-multiprocessing.sharedctypes together with multiprocessing.Array
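A minimal sketch of the shared-memory idea, assuming the data can be stored as integers: the values go into a multiprocessing.Array once, and each worker reads its own half. The function partial_sum, the array size, and the "sum the values" task are made up for illustration. lock=False is used because the workers only read the input and each writes to its own output slot.

```python
import ctypes
import multiprocessing as mp


def partial_sum(shared, start, stop, out, slot):
    """Sum shared[start:stop] and store the result in out[slot]."""
    total = 0
    for i in range(start, stop):
        total += shared[i]
    out[slot] = total


if __name__ == "__main__":
    n = 200 * 1000  # made-up size
    # Input data lives in shared memory; the workers only read it.
    shared = mp.Array(ctypes.c_longlong, range(n), lock=False)
    # One output slot per worker, zero-initialized.
    out = mp.Array(ctypes.c_longlong, 2, lock=False)

    half = n // 2
    workers = [
        mp.Process(target=partial_sum, args=(shared, 0, half, out, 0)),
        mp.Process(target=partial_sum, args=(shared, half, n, out, 1)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    total = out[0] + out[1]
    print("total: {}  correct: {}".format(total, total == sum(range(n))))
```

Note that filling the shared array still copies the data once in the parent; the saving is that the workers then read it in place. This only works for flat, fixed-type data such as numbers, not for lists of arbitrary Python objects.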
