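I am currently normalizing a numpy array in Python that is built by slicing an image into windowed patches with a stride, which produces about 20K patches. The current normalization implementation is a major pain point in my runtime, and I am trying to replace it with the same functionality in a C extension. I would like to hear what the community suggests for getting this done easily and simply. The normalization step alone currently takes 0.34 s, and I am trying to get it under 0.1 s or better. As you can see, creating the patches with view_as_windows is very efficient, and I am looking for something similar for the normalization. Note that you can simply comment/uncomment the lines marked "# ---- Normalization" to see the runtimes of the different implementations for yourself.
Here is the current implementation: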
import gc
import os
import time

import cv2
import numpy
from libraries import GCN  # local module; the GCN source is linked below
from skimage.util.shape import view_as_windows

def create_imageArray(patch_list):
    # One slot per patch, shaped (channel, height, width) = (1, 40, 60)
    returnImageArray = numpy.zeros(shape=(len(patch_list), 1, 40, 60))
    idx = 0
    for patch, name, coords in patch_list:
        imgArray = numpy.asarray(patch[:,:], dtype=numpy.float32)
        imgArray = imgArray[numpy.newaxis, ...]  # [40,60] -> [1,40,60]
        returnImageArray[idx] = imgArray
        idx += 1
    return returnImageArray
    # print "normImgArray[0]:",normImgArray[0]

def NormalizeData(imageArray):
    tempImageArray = imageArray  # same object, so the input is updated in place
    # Normalize the data in batches
    batchSize = 25000
    dataSize = tempImageArray.shape[0]
    imageChannels = tempImageArray.shape[1]
    imageHeight = tempImageArray.shape[2]
    imageWidth = tempImageArray.shape[3]
    for i in xrange(0, dataSize, batchSize):
        stop = i + batchSize
        print("Normalizing data [{0} to {1}]...".format(i, stop))
        dataTemp = tempImageArray[i:stop]
        # Flatten each patch to one row, as GCN expects 2-D input
        dataTemp = dataTemp.reshape(dataTemp.shape[0], imageChannels * imageHeight * imageWidth)
        #print("Performing GCN [{0} to {1}]...".format(i, stop))
        dataTemp = GCN(dataTemp)
        #print("Reshaping data again [{0} to {1}]...".format(i, stop))
        dataTemp = dataTemp.reshape(dataTemp.shape[0], imageChannels, imageHeight, imageWidth)
        #print("Updating data with new values [{0} to {1}]...".format(i, stop))
        tempImageArray[i:stop] = dataTemp
    del dataTemp
    gc.collect()
    return tempImageArray

start_time = time.time()
img1_path = "777628-1032-0048.jpg"
img_list = ["images/1.jpg", "images/2.jpg", "images/3.jpg", "images/4.jpg", "images/5.jpg"]
patchWidth = 60
patchHeight = 40
channels = 1
stride = patchWidth/6
multiplier = 1.31
finalImgArray = []
vaw_time = 0
norm_time = 0
array_time = 0
for im_path in img_list:
    start = time.time()
    baseFileWithExt = os.path.basename(im_path)
    baseFile = os.path.splitext(baseFileWithExt)[0]
    img = cv2.imread(im_path, cv2.IMREAD_GRAYSCALE)
    nxtWidth = 800
    nxtHeight = 1200
    patchesList = []
    for i in xrange(7):
        img = cv2.resize(img, (nxtWidth, nxtHeight))
        nxtWidth = int(nxtWidth//multiplier)
        nxtHeight = int(nxtHeight//multiplier)
        patches = view_as_windows(img, (patchHeight, patchWidth), stride)
        cols = patches.shape[0]
        rows = patches.shape[1]
        patchCount = cols*rows
        print "patchCount:",patchCount, " patches.shape:",patches.shape
        returnImageArray = numpy.zeros(shape=(patchCount, channels, patchHeight, patchWidth))
        idx = 0
        for col in xrange(cols):
            for row in xrange(rows):
                patch = patches[col][row]
                imageName = "{0}-patch{1}-{2}.jpg".format(baseFile, i, idx)
                patchCoodrinates = (0, 1, 2, 3) # don't need these for example
                patchesList.append((patch, imageName, patchCoodrinates))
                # ---- Normalization inside 7 iterations <> Part 1
                # imgArray = numpy.asarray(patch[:,:], dtype=numpy.float32)
                # imgArray = patch.astype(numpy.float32)
                # imgArray = imgArray[numpy.newaxis, ...] # Add a new axis for channel so goes from shape [40,60] to [1,40,60]
                # returnImageArray[idx] = imgArray
                idx += 1
        # if i == 0: finalImgArray = returnImageArray
        # else: finalImgArray = numpy.concatenate((finalImgArray, returnImageArray), axis=0)
    vaw_time += time.time() - start
    # ---- Normalization inside 7 iterations <> Part 2
    # start = time.time()
    # normImageArray = NormalizeData(finalImgArray)
    # norm_time += time.time() - start
    # print "returnImageArray.shape:", finalImgArray.shape
    # ---- Normalization outside 7 iterations
    start = time.time()
    imgArray = create_imageArray(patchesList)
    array_time += time.time() - start
    start = time.time()
    normImgArray = NormalizeData( imgArray )
    norm_time += time.time() - start
    print "len(patchesList):",len(patchesList)

total_time = (time.time() - start_time)/len(img_list)
print "\npatches_time per img: {0:.3f} s".format(vaw_time/len(img_list))
print "create imgArray per img: {0:.3f} s".format(array_time/len(img_list))
print "normalization_time per img: {0:.3f} s".format(norm_time/len(img_list))
print "total time per image: {0:.3f} s \n".format(total_time)
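Here is the GCN code, in case you need to download it to use it: http://pastebin.com/RdVMD2P3
Details about the GCN internals
At a high level, it takes the mean of all pixels and divides every pixel by that mean. So for an image array like [1 2 3], the mean is 2, and dividing each number by 2 gives [0.5, 1, 1.5]. That is what the normalization does. I forgot to highlight mean = X.mean(axis=1) in the picture above.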
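For illustration, the behaviour described above can be sketched in plain numpy as a single vectorized operation over the whole 4-D batch. This follows only the description in this post (divide each patch by its own mean); the actual GCN function at the pastebin link may do more than that, so treat it as a rough sketch rather than a drop-in replacement. The name normalize_by_mean and the zero-mean guard are assumptions made for the example.

```python
import numpy as np


def normalize_by_mean(batch):
    """Divide every patch by its own mean pixel value.

    batch: array of shape (n_patches, channels, height, width).
    Returns a float32 array of the same shape.
    """
    batch = np.asarray(batch, dtype=np.float32)
    # One mean per patch; keepdims=True gives shape (n, 1, 1, 1) so it broadcasts.
    means = batch.mean(axis=(1, 2, 3), keepdims=True)
    # Assumption for this sketch: leave all-zero patches unchanged
    # instead of dividing by zero.
    means[means == 0] = 1.0
    return batch / means


# The [1 2 3] example from the text, as a single 1x1x3 "patch".
example = np.array([[[[1, 2, 3]]]])
print(normalize_by_mean(example))
```

Because the mean is taken over axes 1 to 3 directly, no reshape to 2-D and back is needed in this variant.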
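Note: If you are wondering why I iterate again and create a new imgArray for normalization instead of doing it during the original patch creation, it is to keep data transfer to a minimum. I am implementing this with the multiprocessing library, and serializing data takes a LOOONG time, so I try to keep serialization to a minimum (meaning as little data as possible is passed back from the processes). I measured the difference between doing it inside and outside the 7 iterations, and the results are noted below, so I can live with it. However, if you know a faster implementation, please let me know.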
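To make the serialization cost concrete, here is a small sketch that compares the pickled size of the same batch of patches stored as uint8, float32 and float64. multiprocessing uses pickle to move objects between processes, so these sizes are a rough proxy for how much data gets passed back. The batch of 1000 zero-filled patches and the helper name pickled_size are made up for the example.

```python
import pickle

import numpy as np

# Hypothetical batch: 1000 grayscale patches of 40x60, as raw 8-bit pixels.
patches = np.zeros((1000, 1, 40, 60), dtype=np.uint8)


def pickled_size(array):
    """Number of bytes pickle produces for the given array."""
    return len(pickle.dumps(array, protocol=pickle.HIGHEST_PROTOCOL))


size_uint8 = pickled_size(patches)
size_float32 = pickled_size(patches.astype(np.float32))
size_float64 = pickled_size(patches.astype(np.float64))

print("uint8:   {0} bytes".format(size_uint8))
print("float32: {0} bytes".format(size_float32))
print("float64: {0} bytes".format(size_float64))
```

Since numpy.zeros defaults to float64, an array built that way costs roughly eight times as much to send as the original 8-bit pixels.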
Runtime when creating the imageArray inside the 7 iterations:
patches_time per img: 0.560 s
normalization_time per img: 0.336 s
total time per image: 0.896 s
Runtime when creating the imageArray and normalizing outside the 7 iterations:
patches_time per img: 0.040 s
create imgArray per img: 0.146 s
normalization_time per img: 0.339 s
total time per image: 0.524 s
I had not noticed it before, but it seems that creating the array also takes some time.
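One possible way to cut the array-creation time is to skip the per-patch Python loop and reshape the windowed view in one step. The sketch below is an untested idea under stated assumptions, not a measured result: it uses numpy's sliding_window_view (available in NumPy 1.20 and later) as a stand-in for view_as_windows so that it depends only on numpy, and the function name patches_to_array and the synthetic image are made up for the example. The same reshape should apply to the 4-D output of view_as_windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def patches_to_array(img, patch_height, patch_width, stride):
    """Return all strided patches as an (n, 1, height, width) float32 array."""
    # All windows at every offset, then keep every `stride`-th one per direction.
    windows = sliding_window_view(img, (patch_height, patch_width))[::stride, ::stride]
    # Merge the two grid axes into one and add the channel axis.
    # reshape has to copy here because the windowed view is not contiguous.
    return windows.reshape(-1, 1, patch_height, patch_width).astype(np.float32)


# Synthetic 100x120 grayscale image standing in for a real one.
img = (np.arange(100 * 120) % 256).astype(np.uint8).reshape(100, 120)
arr = patches_to_array(img, 40, 60, 10)
print(arr.shape)
```

The patches come out in row-major grid order, which matches the order produced by the nested col/row loops in the code above.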