Edited: my code is as follows:
__author__ = 'feynman'
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
def MC_Surface(volume, mc_vol):
    Perm_area = {
        "00000000": 0.000000,
        "11111111": 0.000000,
        ...
        ...
        "11100010": 1.515500,
        "00011101": 1.515500
    }
    cdef int j, i, k
    for k in range(volume.shape[2] - 1):
        for j in range(volume.shape[1] - 1):
            for i in range(volume.shape[0] - 1):
                pattern = '%i%i%i%i%i%i%i%i' % (
                    volume[i, j, k],
                    volume[i, j + 1, k],
                    volume[i + 1, j, k],
                    volume[i + 1, j + 1, k],
                    volume[i, j, k + 1],
                    volume[i, j + 1, k + 1],
                    volume[i + 1, j, k + 1],
                    volume[i + 1, j + 1, k + 1])
                mc_vol[i, j, k] = Perm_area[pattern]
    return mc_vol
To speed it up, I changed it to:
def MC_Surface(volume, mc_vol):
    Perm_area = {
        ...
        ...
        "11100010": 1.515500,
        "00011101": 1.515500
    }
    keys = np.array(Perm_area.keys())
    values = np.array(Perm_area.values())
    starttime = time.time()
    tmp_vol = GetPattern(volume)
    print 'time to populate the key array: ', time.time() - starttime
    cdef int i
    starttime = time.time()
    for i, this_key in enumerate(keys):
        mc_vol[tmp_vol == this_key] = values[i]
    print 'time for the loop: ', time.time() - starttime
    return mc_vol
def GetPattern(volume):
    a = (volume.astype(np.int)).astype(np.str)
    output = a.copy()  # Central voxel
    output[:, :-1, :] = np.char.add(output[:, :-1, :], a[:, 1:, :])  # East
    output[:-1, :, :] = np.char.add(output[:-1, :, :], a[1:, :, :])  # South
    output[:-1, :-1, :] = np.char.add(output[:-1, :-1, :], a[1:, 1:, :])  # SouthEast
    output[:, :, :-1] = np.char.add(output[:, :, :-1], a[:, :, 1:])  # Down
    output[:, :-1, :-1] = np.char.add(output[:, :-1, :-1], a[:, 1:, 1:])  # DownEast
    output[:-1, :, :-1] = np.char.add(output[:-1, :, :-1], a[1:, :, 1:])  # DownSouth
    output[:-1, :-1, :-1] = np.char.add(output[:-1, :-1, :-1], a[1:, 1:, 1:])  # DownSouthEast
    output = output[:-1, :-1, :-1]
    del a
    return output
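For a 3D array of size 500^3 this takes much longer. Here, tmp_vol is a 3D array of strings. For example, if tmp_vol[0,0,0] = "00000000" then mc_vol[0,0,0] = 0.00000. Alternatively, I could get rid of mc_vol and overwrite in place: if tmp_vol[0,0,0] = "00000000" then tmp_vol[0,0,0] = 0.00000.
Here the for loop takes most of the time, and I can see that only one CPU is being used. I tried to map the keys in parallel using map and lambda but ran into errors. I am new to Python, so any hints would be great.
Answer 0 (score: 0)
Since I don't fully understand your code and you said "any hints would be great", I'll give you some general advice. Basically, you want to speed up this for loop: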
for i, this_key in enumerate(keys):
What you can do is split the keys array into several parts, like this:
length = len(keys)
# Use // so the slice bounds stay integers on Python 3 as well.
part1 = keys[:length // 3]
part2 = keys[length // 3: 2 * length // 3]
part3 = keys[2 * length // 3:]
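The manual slicing above can also be written with `np.array_split`, which copes with lengths that are not divisible by three. This is only a minimal sketch: it assumes `keys` and `values` are the parallel NumPy arrays from the question, and the small arrays and the `n_parts` name below are placeholders for illustration, not the real lookup table. Splitting both arrays with the same call keeps each key paired with its value.

```python
import numpy as np

# Placeholder data standing in for the question's keys/values arrays.
keys = np.array(["00000000", "11111111", "11100010", "00011101"])
values = np.array([0.0, 0.0, 1.5155, 1.5155])

n_parts = 3  # assumed number of worker processes

# array_split allows uneven chunks, unlike np.split.
key_parts = np.array_split(keys, n_parts)
value_parts = np.array_split(values, n_parts)

# Each chunk of keys lines up with the chunk of values at the same position.
for part_keys, part_values in zip(key_parts, value_parts):
    print(list(part_keys), list(part_values))
```

Keeping the two lists of chunks aligned matters because an index taken inside one chunk no longer points at the right entry of the full `values` array.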
Then process each part in a subprocess:
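from concurrent.futures import ProcessPoolExecutor
def do_work(tmp_vol, part_keys, part_values):
    # Worker processes do not share memory with the parent, so fill a
    # partial result and return it instead of writing to mc_vol directly.
    partial = np.zeros(tmp_vol.shape)
    for this_key, this_value in zip(part_keys, part_values):
        partial[tmp_vol == this_key] = this_value
    return partial
# values is sliced at the same indices as keys so the pairs stay aligned.
with ProcessPoolExecutor(max_workers=3) as e:
    futures = [e.submit(do_work, tmp_vol, part1, values[:length // 3]),
               e.submit(do_work, tmp_vol, part2, values[length // 3: 2 * length // 3]),
               e.submit(do_work, tmp_vol, part3, values[2 * length // 3:])]
    # A voxel matches at most one key, so the partial arrays can be summed.
    mc_vol = sum(f.result() for f in futures)
return mc_vol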
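That's it.
Answer 1 (score: 0)
First, a dictionary lookup takes very nearly constant time, whereas an array equality check is O(N). So you should loop over your array, not over your dictionary.
Second, using a nested list comprehension can save a lot of time (it turns the Python loop into a C loop).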
mc_vol = [[[Perm_area[key] for key in row] for row in plane] for plane in tmp_vol]
This gives you a nested list, so in this case you can avoid numpy entirely. If you do need a numpy array, just convert it:
mc_vol = np.array(mc_vol)
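The idea of looping over the array rather than the dictionary can also be expressed without a Python-level loop over every voxel. The following is a minimal sketch, not the asker's code: it assumes `tmp_vol` is the string array returned by `GetPattern` and that every pattern in it is a key of `Perm_area`. The four-entry table and the 2x2x1 array are placeholders. `np.unique` with `return_inverse=True` finds the distinct patterns once, so the dictionary is consulted once per distinct pattern instead of once per voxel.

```python
import numpy as np

# Placeholder table; the real Perm_area is the full dictionary from the question.
Perm_area = {"00000000": 0.0, "11111111": 0.0,
             "11100010": 1.5155, "00011101": 1.5155}

# Placeholder 2x2x1 pattern volume standing in for GetPattern(volume).
tmp_vol = np.array([[["00000000"], ["11100010"]],
                    [["00011101"], ["11111111"]]])

# uniq: sorted distinct patterns; inverse: index into uniq for every voxel.
uniq, inverse = np.unique(tmp_vol, return_inverse=True)

# One dictionary lookup per distinct pattern.
lookup = np.array([Perm_area[str(key)] for key in uniq])

# Index the small lookup table with the per-voxel codes, then restore the shape.
mc_vol = lookup[inverse].reshape(tmp_vol.shape)
print(mc_vol[0, 1, 0])
```

The final `reshape` is there because the shape of the inverse array returned by `np.unique` differs between NumPy versions; reshaping to `tmp_vol.shape` gives the same result either way.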