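I loop over every pixel of some photos and store the RGB values together with each pixel's position.
This is my current loop, and it looks like it may be about the slowest possible version. I have some familiarity with pandas, so I used a DataFrame to accumulate the data inside the loop.
Which avenues should I explore to make it more efficient? Should I scrap the DataFrame idea and use vanilla lists instead?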
import os
import pandas as pd
import numpy as np
from PIL import ImageColor
from PIL import Image
IMAGE_PATH = 'P:/image_files/'
def loopThroughPixels():
    im = Image.open(IMAGE_PATH + 'small_orange_white.png')
    w,h = im.size
    df = pd.DataFrame({
        'pixLocation' : [(0,0)],
        'pixRGB' : [im.getpixel((0,0))]
    })
    i = 0
    for x in range(w):
        for y in range(h):
            i = i + 1
            new_record = pd.DataFrame({
                'pixLocation' : [(x,y)],
                'pixRGB' : [im.getpixel((x,y))]
            })
            df = pd.concat([df,new_record])
            del new_record
    df.reset_index(inplace = True)
    df.drop("index", axis = 1, inplace = True)
Note
Nehal's answer is fast but gives the wrong result. Let me illustrate:
import pandas as pd
import itertools
from PIL import ImageColor
from PIL import Image
# create very small image to illustrate
i = Image.new('RGB',(2,2))
i.putpixel((0,0), (1,1,1))
i.putpixel((1,0), (2,2,2))
i.putpixel((0,1), (3,3,3))
i.putpixel((1,1), (4,4,4))
# run the algorithm:
def loopThroughPixels():
    im = i
    w,h = im.size
    pixLocation = list(itertools.product(range(h), range(w)))
    pixRGB = list(im.getdata())
    df = pd.DataFrame({'pixLocation': pixLocation, 'pixRGB': pixRGB})
    return df
Compare the result with the initial putpixel calls: the algorithm reports coordinate (0,1) as (2,2,2), but it should be (3,3,3).
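The mismatch comes from ordering: as far as I can tell, `im.getdata()` returns pixels row by row, so x has to vary fastest in the coordinate list. Here is a minimal sketch of that ordering without PIL. The name `row_major_coords` is made up for illustration, and `flat_pixels` is a hand-written stand-in for what `list(im.getdata())` would hold for the 2x2 image above.

```python
def row_major_coords(w, h):
    """Return (x, y) pairs in the order a row-by-row flattening visits them."""
    # Outer loop over rows (y), inner loop over columns (x), so x varies fastest.
    return [(x, y) for y in range(h) for x in range(w)]

# Assumed stand-in for list(im.getdata()) on the 2x2 test image:
# row y=0 first (x=0, then x=1), followed by row y=1.
flat_pixels = [(1, 1, 1), (2, 2, 2), (3, 3, 3), (4, 4, 4)]

lookup = dict(zip(row_major_coords(2, 2), flat_pixels))
print(lookup[(0, 1)])  # (3, 3, 3), which matches putpixel((0,1), (3,3,3))
```

With this ordering every coordinate maps back to the value that was set with putpixel.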
Answer 0 (score: 3)
A faster approach is:
import pandas as pd
from PIL import ImageColor
from PIL import Image
IMAGE_PATH = 'P:/image_files/'
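def loopThroughPixels():
    im = Image.open(IMAGE_PATH + 'small_orange_white.png')
    w,h = im.size
    # getdata() is row-major, so the column index (here named y) must vary fastest;
    # each tuple is (column, row), i.e. the (x, y) order that getpixel() expects
    pixLocation = [(y, x) for x in range(h) for y in range(w)]
    pixRGB = list(im.getdata())
    df = pd.DataFrame({'pixLocation': pixLocation, 'pixRGB': pixRGB})
    return df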
loopThroughPixels()
On a file described as...
PNG image data, 1399 x 835, 8-bit/color RGBA, non-interlaced
...it took:
In [1]: %timeit loopThroughPixels()
1 loop, best of 3: 324 ms per loop
Update (comparison with the option from the comments):
In [14]: w = 1399
In [15]: h = 835
In [16]: [(y, x) for x in range(h) for y in range(w)] == list(zip(list(range(w))*h, sorted(list(range(h))*w)))
Out[16]: True
In [17]: %timeit [(y, x) for x in range(h) for y in range(w)]
10 loops, best of 3: 107 ms per loop
In [18]: %timeit list(zip(list(range(w))*h, sorted(list(range(h))*w)))
1 loop, best of 3: 207 ms per loop