我想要通过一个大的tif堆栈+1500帧并提取每帧的局部最大值的坐标。下面的代码完成了这项工作,但对于大文件来说却非常慢。当在较小的比特(例如20帧)上运行时,每帧几乎立即完成 - 当在整个数据集上运行时,每帧都需要几秒钟。
运行更快代码的任何解决方案?我认为这是由于加载了大型tiff文件 - 但是最初只需要一次?
我有以下代码:
from pims import ImageSequence
from skimage.feature import peak_local_max
def cmask(index,array):
radius = 3
a,b = index
nx,ny = array.shape
y,x = np.ogrid[-a:nx-a,-b:ny-b]
mask = x*x + y*y <= radius*radius
return(sum(array[mask])) # number of pixels
images = ImageSequence('tryhard_red_small.tif')
frame_list = []
x = []
y = []
int_liposome = []
BG_liposome = []
for i in range(len(images[0])):
tmp_frame = images[0][i]
xy = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
x.extend(xy[0].tolist())
y.extend(xy[1].tolist())
for j in range(len(xy)):
index = x[j],y[j]
int_liposome.append(cmask(index,tmp_frame))
frame_list.extend([i]*len(xy))
print "Frame: ", i, "of ",len(images[0])
features = pd.DataFrame(
{'lip_int':int_liposome,
'y' : y,
'x' : x,
'frame' : frame_list})
答案 0 :(得分:1)
您是否尝试过分析代码,比如ipython中的%prun
或%lprun
?那会告诉你减速发生的确切位置。
如果没有tif堆栈,我无法创建自己的版本,但我怀疑问题在于您使用列表来存储所有内容。每次执行追加或扩展时,python都必须分配更多内存。您可以先尝试获取最大总数,然后分配输出数组,然后重新运行以填充数组。像下面的东西
# run through once to get the count of local maxima
npeaks = (len(peak_local_max(f, min_distance=8, threshold_abs=3000))
for f in images[0])
total_peaks = sum(npeaks)
# allocate storage arrays and rerun
x = np.zeros(total_peaks, np.float)
y = np.zeros_like(x)
int_liposome = np.zeros_like(x)
BG_liposome = np.zeros_like(x)
frame_list = np.zeros(total_peaks, np.int)
index_0 = 0
for frame_ind, tmp_frame in enumerate(images[0]):
peaks = pd.DataFrame(peak_local_max(tmp_frame, min_distance=8,threshold_abs=3000))
index_1 = index_0 + len(peaks)
# copy the data from the DataFrame's underlying numpy array
x[index_0:index_1] = peaks[0].values
y[index_0:index_1] = peaks[1].values
for i, peak in enumerate(peaks, index_0):
int_liposome[i] = cmask(peak, tmp_frame)
frame_list[index_0:index_1] = frame_ind
# update the starting index
index_0 = index_1
print "Frame: ", frame_ind, "of ",len(images[0])