Question

I have a nested loop of the following form:

while x<lat2[0]:
    while y>lat3[1]:
        if (is_inside_nepal([x,y])):
            print("inside")
        else:
            print("not")
        y = y - (1/150.0)
    y = lat2[1]
    x = x + (1/150.0)
#here lat2[0] represents a large number

This usually takes about 50 seconds to run. I have changed the loop into multiprocessing code:

def v1find_coordinates(q):
    # Each worker takes a starting x from the queue and scans one strip
    # of width incfactor.
    while not q.empty():
        x1 = q.get()
        x2 = x1 + incfactor
        while x1 < x2:
            y = lat2[1]  # restart every column from the top latitude
            while y > lat3[1]:
                if is_inside([x1, y]):
                    print(x1, y, "inside")
                else:
                    print(x1, y, "not inside")
                y = y - (1/150.0)
            x1 = x1 + (1/150.0)

incfactor = 0.7
# drange returns a list of values stepped by a decimal increment
xvalues = drange(x, lat2[0], incfactor)
cores = mp.cpu_count()
q = Queue()
for i in xvalues:
    q.put(i)
processes = []
for i in range(cores):
    p = Process(target=v1find_coordinates, args=(q,))
    p.daemon = True  # must be set before start()
    p.start()
    processes.append(p)
for p in processes:
    print("now joining")
    p.join()

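This multiprocessing code also takes about 50 seconds to run, so there is no difference in time between the two versions.

I have also tried using a Pool, and I have tuned the chunk size. I have googled and searched other Stack Overflow questions, but could not find a satisfying answer.

The only explanation I could find was that the time spent on process management cancels out the gain. If that is the reason, how can I get multiprocessing to produce faster results?

Would a C implementation used from Python be faster?

I am not expecting a dramatic improvement, but common sense says that running on 4 cores should be noticeably faster than running on 1 core. Instead I get similar timings. Any kind of help would be appreciated.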

Answer 1

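You appear to be using the threading queue (from Queue import Queue; the module is named queue in Python 3). That does not work as expected, because Process uses fork() and clones the entire Queue into each worker process. Every worker then walks through its own full copy of the x values, so the work is repeated on each core instead of being divided among them.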
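A minimal sketch of this effect, assuming a Unix system where the fork start method is available. Three children each drain what looks like the same thread queue, and each child reports how many items it managed to pull. The names drain and items_seen_per_worker are made up for this illustration.

```python
import multiprocessing as mp
import queue


def drain(local_copy, counts):
    # Runs in a child process: count how many items this process can pull
    # from its (forked) copy of the thread queue.
    seen = 0
    while True:
        try:
            local_copy.get_nowait()
        except queue.Empty:
            break
        seen += 1
    counts.put(seen)


def items_seen_per_worker(n_items=5, n_workers=3):
    ctx = mp.get_context("fork")  # assumption: Unix-only start method
    q = queue.Queue()             # thread queue: ordinary in-process memory
    for i in range(n_items):
        q.put(i)
    counts = ctx.Queue()          # real inter-process queue for the reports
    procs = [ctx.Process(target=drain, args=(q, counts))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    seen = [counts.get() for _ in procs]
    for p in procs:
        p.join()
    return seen


if __name__ == "__main__":
    print(items_seen_per_worker())
```

If the explanation above holds, every worker reports all 5 items rather than a share of them, which is the duplicated work that hides any speedup.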

Use this instead:

from multiprocessing import Queue
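With that import the queue is shared between processes, so each x value is handed to exactly one worker. A rough sketch of the whole pattern follows, under a few assumptions: is_inside is a stand-in unit-square test rather than the real is_inside_nepal, worker and scan are illustrative names, and results are returned as counts instead of being printed. The None sentinels replace the q.empty() check from the question, which can race when several workers poll the queue at once.

```python
import multiprocessing as mp

STEP = 1 / 150.0  # grid spacing taken from the question


def is_inside(point):
    # Hypothetical stand-in for is_inside_nepal(): a unit-square test.
    x, y = point
    return 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0


def worker(tasks, results, y_top, y_bottom):
    # Pull x values until the None sentinel arrives. The queue is shared,
    # so each value is delivered to exactly one worker.
    for x in iter(tasks.get, None):
        inside = 0
        y = y_top
        while y > y_bottom:
            if is_inside([x, y]):
                inside += 1
            y -= STEP
        results.put((x, inside))


def scan(x_values, y_top, y_bottom, n_workers=None):
    x_values = list(x_values)
    n_workers = n_workers or mp.cpu_count()
    tasks, results = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=worker,
                        args=(tasks, results, y_top, y_bottom),
                        daemon=True)
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for x in x_values:
        tasks.put(x)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    # Collect every result before joining so no worker blocks on a full pipe.
    out = [results.get() for _ in x_values]
    for p in procs:
        p.join()
    return sorted(out)


if __name__ == "__main__":
    print(scan([0.0, 0.5, 2.0], y_top=1.0, y_bottom=0.0, n_workers=2))
```

Collecting counts in the parent rather than printing from every worker also keeps console output from becoming the bottleneck, though how much that matters for the original code is a guess.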

Python multiprocessing speed problem
