Question

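I've been looking into multiprocessing and found an example of it on a website. However, when I try to run that example on my MacBook Retina, nothing happens. Here is the example: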

import random
import multiprocessing


def list_append(count, id, out_list):
    """
    Creates an empty list and then appends a
    random number to the list 'count' number
    of times. A CPU-heavy operation!
    """
    for i in range(count):
        out_list.append(random.random())

if __name__ == "__main__":
    size = 10000000   # Number of random numbers to add
    procs = 2   # Number of processes to create

    # Create a list of jobs and then iterate through
    # the number of processes appending each process to
    # the job list
    jobs = []
    for i in range(0, procs):
        out_list = list()
        process = multiprocessing.Process(target=list_append,
                                          args=(size, i, out_list))
        jobs.append(process)

    # Start the processes (i.e. calculate the random number lists)
    for j in jobs:
        j.start()

    # Ensure all of the processes have finished
    for j in jobs:
        j.join()

    print("List processing complete.")

As it turns out, I added a print statement to the 'list_append' function and nothing was printed, so the problem is actually not the j.join() but the j.start() bit.

Answer 1

When you create a process with multiprocessing.Process, you prepare a sub-function to be run asynchronously in a different process. The computation starts when you call the start method. The join method waits for the computation to finish. So if you just start the process and do not wait for it to complete (i.e. call join), nothing will happen, because the process will be killed when your program exits.
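As a minimal sketch of that life cycle (the names work and run_once are made up for illustration and are not from the question): the Process object does nothing until start() is called, and join() blocks until the child has exited.

```python
import multiprocessing
import time


def work(seconds):
    # Runs in the child process: pretend to be busy for a moment.
    time.sleep(seconds)


def run_once(seconds):
    p = multiprocessing.Process(target=work, args=(seconds,))
    alive_before_start = p.is_alive()  # nothing is running yet
    p.start()                          # the child starts executing work()
    p.join()                           # block until the child has exited
    return alive_before_start, p.is_alive(), p.exitcode


if __name__ == "__main__":
    print(run_once(0.1))
```

An exitcode of 0 after join() means the child ran to completion without raising.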

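One issue here is that you are not using an object that can be shared across processes in multiprocessing. When you use a plain list(), each process works on its own separate list in memory. The process-local list is discarded when the process exits, and the list in the main process stays empty. If you want to be able to exchange data between processes, you should use a multiprocessing.Queue: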
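To see this for yourself, here is a small sketch (append_items and run_with_plain_list are hypothetical names, not part of the original example). The child appends to the list it received, but the parent's list is untouched afterwards.

```python
import multiprocessing


def append_items(count, out_list):
    # This appends to the child's own copy of the list.
    for i in range(count):
        out_list.append(i)


def run_with_plain_list(count):
    out_list = []
    p = multiprocessing.Process(target=append_items, args=(count, out_list))
    p.start()
    p.join()
    # The parent's list was never modified by the child.
    return len(out_list)


if __name__ == "__main__":
    print(run_with_plain_list(100))
```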

import random
import multiprocessing


def list_append(count, id, out_queue):
    """
    Creates an empty list and then appends a
    random number to the list 'count' number
    of times. A CPU-heavy operation!
    """
    for i in range(count):
        out_queue.put((id, random.random()))

if __name__ == "__main__":
    size = 10000   # Number of random numbers to add
    procs = 2   # Number of processes to create

    # Create a list of jobs and then iterate through
    # the number of processes appending each process to
    # the job list
    jobs = []
    q = multiprocessing.Queue()
    for i in range(0, procs):
        process = multiprocessing.Process(target=list_append,
                                          args=(size, i, q))
        process.start()
        jobs.append(process)

    result = []
    for k in range(procs*size):
        result += [q.get()]

    # Wait for all the processes to finish
    for j in jobs:
        j.join()

    print("List processing complete. {}".format(result))

Note that this code can hang quite easily if you do not correctly compute the number of results sent back through out_queue.
If you try to retrieve too many results, q.get will wait for an extra result that never arrives. If you do not retrieve all the results from q, your processes will freeze, because out_queue will be full and out_queue.put will not return. Your processes will then never exit and you will not be able to join them. If your computations are independent, I strongly advise looking at higher-level tools like Pool, or even more robust third-party libraries like joblib, as they take care of these aspects for you. (See this answer for some insights on Process vs Pool/joblib.)
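For independent computations like this one, a rough Pool-based sketch could look like the following. The helper names make_list and run_pool are my own, and this is one possible arrangement rather than a drop-in replacement. Pool.map sends the return values back for you, so there is no result counting and no manual join.

```python
import random
import multiprocessing


def make_list(count):
    # Build the whole list inside the worker and return it.
    return [random.random() for _ in range(count)]


def run_pool(count, procs):
    # The pool starts the workers, collects the results and
    # shuts the workers down when the with-block exits.
    with multiprocessing.Pool(processes=procs) as pool:
        return pool.map(make_list, [count] * procs)


if __name__ == "__main__":
    lists = run_pool(10000, 2)
    print([len(x) for x in lists])
```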

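I actually reduced the number size, because the program becomes slow if you try to put many objects into a Queue. If you need to pass a lot of small objects, try passing them all at once in a single batch.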
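A sketch of what batching could look like, assuming each process builds its list locally and sends it with a single put (run_batched is a hypothetical wrapper added here so the example is easy to call). The parent then needs exactly one q.get() per process, which also makes the result count much harder to get wrong.

```python
import random
import multiprocessing


def list_append(count, id, out_queue):
    # Build the list locally, then send it in one message.
    batch = [random.random() for _ in range(count)]
    out_queue.put((id, batch))


def run_batched(size, procs):
    q = multiprocessing.Queue()
    jobs = []
    for i in range(procs):
        p = multiprocessing.Process(target=list_append, args=(size, i, q))
        p.start()
        jobs.append(p)

    # Exactly one (id, batch) pair per process; read them before joining
    # so that no child is left blocked on a full queue.
    results = dict(q.get() for _ in range(procs))

    for p in jobs:
        p.join()
    return results


if __name__ == "__main__":
    results = run_batched(10000, 2)
    print({k: len(v) for k, v in results.items()})
```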

