I'm streaming data in chunks into a class. For each chunk, two different kinds of np.convolve() are executed on the same ProcessPoolExecutor. The kind of convolution that was invoked is identified by a return variable. The order of the data must be preserved, so each future carries an associated sequence number. The output function enforces that data is only returned from consecutive futures (not shown below). As far as I can tell I am calling the ProcessPoolExecutor.shutdown() function correctly, but I am still getting an IOError:
The error is:
$ python processpoolerror.py
ran 5000000 samples in 3.70395112038 sec: 1.34990982265 Msps
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
IOError: [Errno 32] Broken pipe
Sorry it's a bit long, but I've trimmed the class down as much as I could while still preserving the error. On my machine (Ubuntu 16.04.2 with an Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz) this code produces the error every time. In the un-trimmed version of this code, the broken pipe occurs about 25% of the time. If you change the `if False:` on line 78 to `True` and print during execution, the error is not thrown. If you reduce the amount of data on line 100, the error is not thrown. What am I doing wrong here? Thanks.
import numpy as np
from concurrent.futures import ProcessPoolExecutor
import time


def _do_xcorr3(rev_header, packet_chunk, seq):
    r1 = np.convolve(rev_header, packet_chunk, 'full')
    return 0, seq, r1


def _do_power3(power_kernel, packet_chunk, seq):
    cp = np.convolve(power_kernel, np.abs(packet_chunk) ** 2, 'full')
    return 1, seq, cp


class ProcessPoolIssues():
    ## Constructor
    # @param chunk_size how many samples to feed in during input() stage
    def __init__(self, header, chunk_size=500, poolsize=5):
        self.chunk_size = chunk_size  ##! How many samples to feed

        # ProcessPool stuff
        self.poolsize = poolsize
        self.pool = ProcessPoolExecutor(poolsize)
        self.futures = []

        # xcr stage stuff
        self.results0 = []
        self.results0.append((0, -1, np.zeros(chunk_size)))

        # power stage stuff
        self.results1 = []
        self.results1.append((1, -1, np.zeros(chunk_size)))

        self.countin = 0
        self.countout = -1

    def shutdown(self):
        self.pool.shutdown(wait=True)

    ## Returns True if all data has been extracted for given inputs
    def all_done(self):
        return self.countin == self.countout + 1

    ## main function
    # @param packet_chunk an array of chunk_size samples to be computed
    def input(self, packet_chunk):
        assert len(packet_chunk) == self.chunk_size
        fut0 = self.pool.submit(_do_xcorr3, packet_chunk, packet_chunk, self.countin)
        self.futures.append(fut0)
        fut1 = self.pool.submit(_do_power3, packet_chunk, packet_chunk, self.countin)
        self.futures.append(fut1)
        self.countin += 1

    # loops through the pool, copying any results from done futures into results0/1
    # (and then removing them)
    def cultivate_pool(self):
        todel = []
        for i, f in enumerate(self.futures):
            # print "checking", f
            if f.done():
                a, b, c = f.result()
                if a == 0:
                    self.results0.append((a, b, c))  # results from one type of future
                elif a == 1:
                    self.results1.append((a, b, c))  # results from another type of future
                todel.append(i)

        # now we need to remove items from futures that are done;
        # do it in reverse order so we remove items from the end first
        # (thereby not affecting indices as we go)
        for i in sorted(todel, reverse=True):
            del self.futures[i]
            if False:  # change this to True and the error goes away
                print "deleting future #", i

    # may return None
    def output(self):
        self.cultivate_pool()  # modifies self.results lists
        # wait for both results to be present before clearing
        if len(self.results0) and len(self.results1):
            del self.results0[0]
            del self.results1[0]
            self.countout += 1
        return None


def testRate():
    chunk = 500
    # a value of 10000 will throw: IOError: [Errno 32] Broken pipe
    # smaller values like 1000 do not
    din = chunk * 10000

    np.random.seed(666)
    search = np.random.random(233) + np.random.random(233) * 1j
    input = np.random.random(din) + np.random.random(din) * 1j

    pct = ProcessPoolIssues(search, chunk, poolsize=8)

    st = time.time()
    for x in range(0, len(input), chunk):
        slice = input[x:x + chunk]
        if len(slice) != chunk:
            break
        pct.input(slice)
        pct.output()
    while not pct.all_done():
        pct.output()
    ed = time.time()
    dt = ed - st
    print "ran", din, "samples in", dt, "sec:", din / dt / 1E6, "Msps"

    pct.shutdown()


if __name__ == '__main__':
    testRate()
Answer (score: 1)
This is probably happening because you exceed the buffer size of the pipe when you try to send bigger chunks at once.
def _do_xcorr3(rev_header, packet_chunk, seq):
    r1 = np.convolve(rev_header, packet_chunk, 'full')
    return 0, seq, r1


def _do_power3(power_kernel, packet_chunk, seq):
    cp = np.convolve(power_kernel, np.abs(packet_chunk) ** 2, 'full')
    return 1, seq, cp
The values r1 and cp are quite large, because you are convolving with the square of the chunks.
Hence, when you try to run it with bigger chunk sizes, the buffer of the IO pipe can't handle it. Refer to this for a clearer understanding.
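To make the size point concrete, here is a small sketch of my own (not from the answerer) showing how much data each future has to ship back through the result pipe when using 'full'-mode convolution on a complex chunk:

```python
import numpy as np

chunk = 500
packet = np.random.random(chunk) + 1j * np.random.random(chunk)

# 'full' mode returns len(a) + len(b) - 1 output samples, so convolving
# a chunk with itself nearly doubles the amount of data per result
r = np.convolve(packet, packet, 'full')

print(len(r))     # 999 samples
print(r.nbytes)   # 999 complex128 samples * 16 bytes = 15984 bytes per future
```

With two such futures submitted per input chunk and thousands of chunks in flight, the feeder pipe has to move tens of kilobytes per result, which is where the backlog builds up.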
As for the second part of the question,
if False:  # change this to True and the error goes away
    print "deleting future #", i
I found this in the py3 docs:
16.2.4.4. Reentrancy

Binary buffered objects (instances of BufferedReader, BufferedWriter, BufferedRandom and BufferedRWPair) are not reentrant. While reentrant calls will not happen in normal situations, they can arise from doing I/O in a signal handler. If a thread tries to re-enter a buffered object which it is already accessing, a RuntimeError is raised. Note this doesn't prohibit a different thread from entering the buffered object.

The above implicitly extends to text files, since the open() function will wrap a buffered object inside a TextIOWrapper. This includes standard streams and therefore affects the built-in function print() as well.