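I have a relatively large RGBA image (converted to a numpy array), and I need to replace every color that does not appear in a given list. How can I do this in a fast, Pythonic way?
With simple iteration I can solve this, but because the image is large (2500 x 2500), the process is very slow.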
# Keep only these colors in the image, otherwise replace with (0,255,0,255)
palette = [[0,0,0,255],[0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]]
# Current slow solution with a 2500 x 2500 x 4 array (mask)
for z in range(mask.shape[0]):
    for y in range(mask.shape[1]):
        if (mask[z,y,:].tolist() not in palette):
            mask[z, y] = (0,255,0,255)
Expected time per image: under half a minute
Current time: 2 minutes
Answer 0 (score: 2)
That is definitely not the time range you should be looking at. Here is an approach using broadcasting:
import numpy as np
# palette.shape == (4,11)
palette = np.array(palette).transpose()
# sample a.shape == (2,2,4)
a = np.array([[[ 28, 231, 203, 235],
               [255,   0,   0, 255]],
              [[ 50, 152,  36, 151],
               [252,  43,  63,  25]]])
# mask
# all(2) force all channels to be equal
# any(-1) matches any color
mask = (a[:,:,:, None] == palette).all(2).any(-1)
# replace color
rep_color = np.array([0,255,0,255])
# np.where to the rescue:
ret = np.where(mask[:,:,None], a, rep_color[None,None,:])
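Example: in the sample a above, only the pixel [255, 0, 0, 255] is in the palette, so it is kept and the other three pixels become [0, 255, 0, 255].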
For a = np.random.randint(0,256, (2500,2500,4)), it takes:
5.26 s ± 179 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
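To make the shapes in the broadcast comparison above easier to follow, here is a minimal sketch on a tiny input. The 1 x 2 image and the two-color palette are made up for illustration, and the names img, pal, eq, per_color and in_palette are not part of the answer.

```python
import numpy as np

# Hypothetical 1 x 2 RGBA image and a two-color palette, for illustration only.
img = np.array([[[255, 0, 0, 255], [1, 2, 3, 4]]])    # shape (1, 2, 4)
pal = np.array([[0, 0, 0, 255], [255, 0, 0, 255]]).T  # shape (4, 2)

# (1, 2, 4, 1) against (4, 2) broadcasts to (1, 2, 4, 2):
# rows x columns x channels x palette entries
eq = img[:, :, :, None] == pal
per_color = eq.all(2)           # (1, 2, 2): all four channels match a given entry
in_palette = per_color.any(-1)  # (1, 2): the pixel matches at least one entry

print(eq.shape, per_color.shape, in_palette.shape)
print(in_palette)
```

Note that the intermediate array eq holds one boolean per pixel, channel and palette entry, which is the main memory cost of this approach on a large image.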
Update: if you force everything to np.uint8, you can pack the channels into a single 32-bit integer and get even more speed:
a = np.random.randint(0,256, (2500,2500,4), dtype=np.uint8)
p = np.array(palette, dtype=np.uint8).transpose()
# widen to uint32 first so the scaled channels cannot overflow uint8
a32 = a.astype(np.uint32)
p32 = p.astype(np.uint32)
# zip the data into 32 bits
# could be even faster if we handle the memory directly
aa = a32[:,:,0] * (2**24) + a32[:,:,1]*(2**16) + a32[:,:,2]*(2**8) + a32[:,:,3]
pp = p32[0]*(2**24) + p32[1]*(2**16) + p32[2]*(2**8) + p32[3]
mask = (aa[:,:,None]==pp).any(-1)
ret = np.where(mask[:,:,None], a, rep_color[None,None,:])
Which takes:
1.34 s ± 29.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
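The comment about handling the memory directly could be read as reinterpreting the four uint8 channels of each pixel as one uint32 instead of computing it arithmetically. The following is only a sketch of that idea, not the answer's code: the shortened palette, the small 64 x 64 stand-in image, and the use of np.isin are assumptions made for the example, and no timing is claimed for it.

```python
import numpy as np

palette = [[0, 0, 0, 255], [0, 255, 0, 255], [255, 0, 0, 255]]  # shortened for the example
rep_color = np.array([0, 255, 0, 255], dtype=np.uint8)

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(64, 64, 4), dtype=np.uint8)  # small stand-in image
a[0, 0] = [255, 0, 0, 255]  # plant a palette color
a[0, 1] = [1, 2, 3, 4]      # plant a color that is not in the palette

# Reinterpret each group of 4 uint8 channels as one uint32. Byte order does not
# matter here because the image and the palette are packed the same way.
aa = np.ascontiguousarray(a).view(np.uint32)[..., 0]             # shape (64, 64)
pp = np.array(palette, dtype=np.uint8).view(np.uint32)[:, 0]     # shape (3,)

mask = np.isin(aa, pp)                          # True where the pixel is a palette color
ret = np.where(mask[..., None], a, rep_color)   # stays uint8
```

The view requires the channel axis to be contiguous in memory, which is why np.ascontiguousarray is applied first; for an array that is already C-contiguous it does not copy.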
Answer 1 (score: 2)
I like pyvips. It is a threaded, streaming image processing library, so it is fast and does not need much memory.
import sys
import pyvips
from functools import reduce
# Keep only these colors in the image, otherwise replace with (0,255,0,255)
palette = [[0,0,0,255], [0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]]
im = pyvips.Image.new_from_file(sys.argv[1], access="sequential")
# test our image against each sample ... bandand() will AND all image bands
# together, ie. we want pixels where they all match
masks = [(im == colour).bandand() for colour in palette]
# OR all the masks together to find pixels which are in the palette
mask = reduce((lambda x, y: x | y), masks)
# pixels not in the mask become [0, 255, 0, 255]
im = mask.ifthenelse(im, [0, 255, 0, 255])
im.write_to_file(sys.argv[2])
On this 2015 i5 laptop with a 2500 x 2500 pixel PNG, I see:
$ /usr/bin/time -f %M:%e ./replace-pyvips.py ~/pics/x.png y.png
55184:0.92
So peak memory use is 55 MB and the elapsed time is 0.92 s.
I tried Quang Hoang's excellent numpy version for comparison:
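import sys
import numpy as np
from PIL import Image
# load the input image as an array (loading step assumed; palette is the list defined above)
a = np.array(Image.open(sys.argv[1]))
p = np.array(palette).transpose()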
# mask
# all(2) force all channels to be equal
# any(-1) matches any color
mask = (a[:,:,:, None] == p).all(2).any(-1)
# replace color
rep_color = np.array([0,255,0,255])
# np.where to the rescue:
a = np.where(mask[:,:,None], a, rep_color[None,None,:])
im = Image.fromarray(a.astype('uint8'))
im.save(sys.argv[2])
Run on the same 2500 x 2500 pixel image:
$ /usr/bin/time -f %M:%e ./replace-broadcast.py ~/pics/x.png y.png
413504:3.08
So peak memory use is 410 MB and the elapsed time is 3.1 s.
As Hoang says, both versions could be sped up further by comparing uint32 values.
Answer 2 (score: 0)
With this code I can process a randomly generated 2500 x 2500 image in 33 to 37 seconds. The method in your question took my machine 51 to 57 seconds.
mask = np.random.rand(2500,2500,4)
mask = np.floor(mask * 255)
palette = np.array([[0,0,0,255],[0, 255, 0,255], [255, 0, 0,255], [128, 128, 128,255], [0, 0, 255,255], [255, 0, 255,255], [0, 255, 255,255], [255, 255, 255,255], [128, 128, 0,255], [0, 128, 128,255], [128, 0, 128,255]])
default = np.array([0,255,0,255])
for z in range(mask.shape[0]):
    for y in range(mask.shape[1]):
        # compare whole rows: `x in palette` on an array matches single elements
        if not (palette == mask[z, y, :]).all(axis=1).any():
            mask[z, y, :] = default
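A caveat worth checking when testing membership against a NumPy palette: the `in` operator on an ndarray is true as soon as any single element matches, which is not the same as matching a whole color. A small sketch with a made-up two-color palette and a hypothetical pixel (the names loose and exact are illustrative):

```python
import numpy as np

palette = np.array([[0, 0, 0, 255], [255, 0, 0, 255]])
pixel = np.array([0, 7, 7, 7])  # not a palette color, but its first channel is 0

# Element-wise test: equivalent to (palette == pixel).any()
loose = pixel in palette
# Row-wise test: all four channels must match the same palette entry
exact = bool((palette == pixel).all(axis=1).any())

print(loose, exact)
```

The row-wise form is the per-pixel counterpart of the `.all(2).any(-1)` mask used in the broadcasting answer.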