Question

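I'm trying to come up with a way to have threads work on the same goal without interfering with each other. In this case I'm using 4 threads to add up every number between 0 and 90,000. The code runs, but it ends almost immediately (runtime: 0.00399994850159 seconds) and only outputs 0. Originally I wanted to do this with a global variable, but I was worried about the threads interfering with each other (i.e. the small chance that two threads double count or skip a number because of odd timing of the reads and writes). So I distributed the workload beforehand instead. If there is a better way to do this, please share. This is my simple way of trying to get some experience with multithreading. Thanks.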

import threading
import time

start_time = time.time()

tot1 = 0
tot2 = 0
tot3 = 0
tot4 = 0

def Func(x,y,tot):
    tot = 0
    i = y-x
    while z in range(0,i):
        tot = tot + i + z

# class Tester(threading.Thread):
#   def run(self):
#       print(n)

w = threading.Thread(target=Func, args=(0,22499,tot1))
x = threading.Thread(target=Func, args=(22500,44999,tot2))
y = threading.Thread(target=Func, args=(45000,67499,tot3))
z = threading.Thread(target=Func, args=(67500,89999,tot4))

w.start()
x.start()
y.start()
z.start()

w.join()
x.join()
y.join()
z.join()

# while (w.isAlive() == False | x.isAlive() == False | y.isAlive() == False | z.isAlive() == False): {}

total = tot1 + tot2 + tot3 + tot4

print total

print("--- %s seconds ---" % (time.time() - start_time))

Answer 1

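You have a bug that makes this program end almost immediately. Look at while z in range(0,i): in Func. z isn't defined in the function, and it is only by luck (bad luck, really) that you happen to have a global variable z = threading.Thread(target=Func, args=(67500,89999,tot4)) that masks the problem. Without it you would have gotten a NameError. As written, you are testing whether the thread object is in a range of integers... and it's not, so the loop body never runs.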
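To see why the loop body never runs, here is a minimal sketch of the same condition in isolation. The names noop and iterations are made up for this illustration; only the while condition mirrors the question's code.

```python
import threading

def noop():
    """Hypothetical placeholder target, only needed to build a Thread."""
    pass

# At module level the question binds the name z to a Thread object.
z = threading.Thread(target=noop)
i = 22499

iterations = 0
# 'z in range(0, i)' is a membership test, not a loop over the range.
# A Thread object never equals an integer, so the test is False.
while z in range(0, i):
    iterations += 1
    break  # safety net; never reached here

print(iterations)  # prints 0
```

Replacing while with for turns the line into an actual loop, with z rebound to each integer in turn.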

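The next problem is the global variables. First, you are absolutely right that using a single global variable is not thread safe: the threads would mess with each other's calculations. But you misunderstand how globals work. When you call threading.Thread(target=Func, args=(67500,89999,tot4)), Python passes the object currently referenced by tot4 to the function, but the function has no idea which global it came from. You only update the local variable tot and discard it when the function returns.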
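A small sketch of that rebinding behaviour, assuming a hypothetical helper add_up and a tiny range so the expected value is easy to check by hand:

```python
import threading

tot1 = 0  # module-level total, as in the question

def add_up(start, stop, tot):
    # 'tot' is a local name bound to the int object that was passed in.
    # Assigning to it rebinds the local only; the global tot1 is untouched.
    tot = sum(range(start, stop + 1))

t = threading.Thread(target=add_up, args=(0, 10, tot1))
t.start()
t.join()

print(tot1)  # prints 0, not 55
```

Integers are immutable, so there is no way for the function to change the caller's value through the argument. A mutable container (a list or dict) does not have this limitation.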

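A solution is to use a global container to hold the calculations, as shown in the example below. Unfortunately, this is actually slower than just doing all of the work in one thread. The Python global interpreter lock (GIL) only lets one thread run Python bytecode at a time, so threads only slow down CPU-intensive tasks implemented in pure Python.

You could look at the multiprocessing module to split the work across multiple processes. That works well if the cost of running the calculation is large compared to the cost of starting the processes and passing them data.

Here is a working copy of your example: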
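A rough way to check the slowdown claim on your own machine is to time both variants. This is only a sketch: add_range, N and STEP are names chosen for the illustration, and the timings depend on the machine and the Python build, so no output is shown.

```python
import threading
import time

def add_range(start, stop, out, index):
    """Sum start..stop-1 in pure Python and store the result in out[index]."""
    total = 0
    for n in range(start, stop):
        total += n
    out[index] = total

N = 90000
STEP = N // 4

# Variant 1: everything in the main thread.
t0 = time.perf_counter()
single = [0]
add_range(0, N, single, 0)
single_secs = time.perf_counter() - t0

# Variant 2: four threads, each writing to its own list slot.
t0 = time.perf_counter()
parts = [0] * 4
threads = [
    threading.Thread(target=add_range, args=(k * STEP, (k + 1) * STEP, parts, k))
    for k in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_secs = time.perf_counter() - t0

print("same result:", single[0] == sum(parts))
print("single: %.4f s, threaded: %.4f s" % (single_secs, threaded_secs))
```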

import threading
import time

start_time = time.time()

tot = [0] * 4

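def Func(x,y,tot_index):
    # Sum the integers x..y (inclusive) in a local variable first.
    my_total = 0
    for z in range(x, y+1):
        my_total = my_total + z
    # Each thread writes only to its own slot, so no lock is needed.
    tot[tot_index] = my_total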

# class Tester(threading.Thread):
#   def run(self):
#       print(n)

w = threading.Thread(target=Func, args=(0,22499,0))
x = threading.Thread(target=Func, args=(22500,44999,1))
y = threading.Thread(target=Func, args=(45000,67499,2))
z = threading.Thread(target=Func, args=(67500,89999,3))

w.start()
x.start()
y.start()
z.start()

w.join()
x.join()
y.join()
z.join()

# while (w.isAlive() == False | x.isAlive() == False | y.isAlive() == False | z.isAlive() == False): {}

total = sum(tot)


print(total)

print("--- %s seconds ---" % (time.time() - start_time))
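If you do want a single shared total, as the question first considered, one common option is to guard the update with a threading.Lock. The sketch below is an alternative to the per-slot list, not part of the answer's code; add_range and total_lock are illustrative names.

```python
import threading

total = 0
total_lock = threading.Lock()

def add_range(start, stop):
    """Sum start..stop inclusive, then add it to the shared total."""
    global total
    partial = sum(range(start, stop + 1))  # no shared state touched here
    with total_lock:                       # one thread at a time updates total
        total += partial

rngs = [(0, 22499), (22500, 44999), (45000, 67499), (67500, 89999)]
threads = [threading.Thread(target=add_range, args=r) for r in rngs]

for t in threads:
    t.start()
for t in threads:
    t.join()

print(total)  # prints 4049955000
```

Each thread takes the lock only once, for the final addition, so contention stays low.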

Answer 2

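You can pass in a mutable object to collect the results, either keyed by an identifier, e.g. a dict, or just a list that each thread calls append() on, e.g.: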

import threading

def Func(start, stop, results):
    results.append(sum(range(start, stop+1)))

rngs = [(0, 22499), (22500, 44999), (45000, 67499), (67500, 89999)]
results = []
jobs = [threading.Thread(target=Func, args=(start, stop, results)) for start, stop in rngs]

for j in jobs:
    j.start()

for j in jobs:
    j.join()

print(sum(results))
# 4049955000
# 100 loops, best of 3: 2.35 ms per loop
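The dict variant mentioned above could look like the following sketch. The key is simply the chunk's position in rngs, and add_range is a name invented for the illustration.

```python
import threading

def add_range(key, start, stop, results):
    # Each thread stores its partial sum under its own key.
    results[key] = sum(range(start, stop + 1))

rngs = [(0, 22499), (22500, 44999), (45000, 67499), (67500, 89999)]
results = {}
jobs = [
    threading.Thread(target=add_range, args=(key, start, stop, results))
    for key, (start, stop) in enumerate(rngs)
]

for j in jobs:
    j.start()

for j in jobs:
    j.join()

total = sum(results.values())
print(total)  # prints 4049955000
```

Unlike the list with append(), the dict records which partial sum belongs to which range, regardless of the order in which the threads finish.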

Answer 3

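As others have said, you could look at multiprocessing in order to split the work across multiple processes that can run in parallel. This is especially beneficial for CPU-intensive tasks, assuming that there isn't a huge amount of data to pass between the processes.

Here is a simple implementation of the same functionality using multiprocessing: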

from multiprocessing import Pool

POOL_SIZE = 4
NUMBERS = 90000

def func(_range):
    tot = 0
    for z in range(*_range):
        tot += z

    return tot

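# The guard is required on platforms that spawn worker processes
# (e.g. Windows), because the workers re-import this module.
if __name__ == '__main__':
    with Pool(POOL_SIZE) as pool:
        chunk_size = NUMBERS // POOL_SIZE
        chunks = ((i, i + chunk_size) for i in range(0, NUMBERS, chunk_size))
        print(sum(pool.imap(func, chunks)))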

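Above, chunks is a generator that yields the same ranges that were hardcoded in the original version, written as half-open (start, stop) pairs. It is passed to imap, which works like the standard map except that it executes the function in the processes of the pool.

A lesser-known fact about multiprocessing is that you can easily convert the code to use threads instead of processes by using multiprocessing.pool.ThreadPool (undocumented at the time of writing). To convert the example above to use threads, just change the import to: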
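To make the chunking concrete, the sketch below builds the same ranges as a list and runs func over them with the built-in map, without any pool. It only illustrates what the workers receive.

```python
POOL_SIZE = 4
NUMBERS = 90000

def func(_range):
    """Sum the half-open range (start, stop) given as a tuple."""
    tot = 0
    for z in range(*_range):
        tot += z
    return tot

chunk_size = NUMBERS // POOL_SIZE
# A list instead of a generator, so it can be printed and reused.
chunks = [(i, i + chunk_size) for i in range(0, NUMBERS, chunk_size)]

print(chunks)
# prints [(0, 22500), (22500, 45000), (45000, 67500), (67500, 90000)]

total = sum(map(func, chunks))
print(total)  # prints 4049955000
```

Because range excludes its stop value, (0, 22500) covers the same numbers as the question's (0, 22499) with an inclusive upper bound.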

from multiprocessing.pool import ThreadPool as Pool
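Put together, a thread-based version might look like the sketch below. It assumes the same func and constants as the process-based example; since no worker processes are started, it also runs without a __main__ guard.

```python
from multiprocessing.pool import ThreadPool as Pool

POOL_SIZE = 4
NUMBERS = 90000

def func(_range):
    """Sum the half-open range (start, stop) given as a tuple."""
    tot = 0
    for z in range(*_range):
        tot += z
    return tot

chunk_size = NUMBERS // POOL_SIZE
chunks = ((i, i + chunk_size) for i in range(0, NUMBERS, chunk_size))

with Pool(POOL_SIZE) as pool:
    # imap yields results in the order of the input chunks.
    total = sum(pool.imap(func, chunks))

print(total)  # prints 4049955000
```

The threads still share the GIL, so this form is more useful for I/O-bound work than for a pure-Python sum like this one.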

Can Python threads work on the same process?

3 answers: