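A faster way to loop through pixels

Time: 2016-09-17 15:06:48

Tags: python pandas

I am looping over every pixel of some photos and storing the RGB values along with each pixel's location.

Here is my current loop, which is probably about the slowest possible version. I have some familiarity with pandas, so I use a DataFrame to store the data inside the loop.

Which avenues should I explore to make this more efficient? Should I scrap the DataFrame idea and use vanilla lists instead?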

import os
import pandas as pd
import numpy as np

from PIL import ImageColor
from PIL import Image
IMAGE_PATH = 'P:/image_files/'

def loopThroughPixels():
    im = Image.open(IMAGE_PATH + 'small_orange_white.png') 
    w,h = im.size

    df = pd.DataFrame({
        'pixLocation' : [(0,0)],
        'pixRGB' : [im.getpixel((0,0))]
        })

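    for x in range(w):
        for y in range(h):
            # Build a one-row frame per pixel and append it. pd.concat copies
            # the whole frame on every call, which is what makes this slow.
            # Note that (0, 0) was already seeded above, so it appears twice.
            new_record = pd.DataFrame({
                'pixLocation' : [(x,y)],
                'pixRGB' : [im.getpixel((x,y))]
                })
            df = pd.concat([df,new_record])
            del new_record

    # Renumber the rows 0..n-1 and discard the old, all-zero index.
    df = df.reset_index(drop=True)
    return df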
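
The "vanilla lists" idea raised in the question could look roughly like the sketch below: append to two plain lists inside the loop and build the DataFrame once at the end. The function name pixels_to_frame and the small synthetic image are assumptions made for illustration; they stand in for the real file under IMAGE_PATH.

```python
import pandas as pd
from PIL import Image


def pixels_to_frame(im):
    """Collect every pixel of a PIL image into a DataFrame, built only once."""
    w, h = im.size
    px = im.load()  # pixel-access object, indexed as px[x, y]

    locations = []
    colours = []
    for x in range(w):
        for y in range(h):
            locations.append((x, y))
            colours.append(px[x, y])

    # A single constructor call replaces one pd.concat per pixel.
    return pd.DataFrame({'pixLocation': locations, 'pixRGB': colours})


# Hypothetical 3x2 test image: orange everywhere, one white pixel at (2, 1).
demo = Image.new('RGB', (3, 2), (255, 165, 0))
demo.putpixel((2, 1), (255, 255, 255))
frame = pixels_to_frame(demo)
print(frame)
```

This keeps the same (x, y) visiting order as the original loop, so the rows come out in the same sequence, minus the duplicated (0, 0) seed row.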

Please note

Nehal's answer is fast but gives a wrong result. Let me illustrate:

import pandas as pd
import itertools
from PIL import ImageColor
from PIL import Image

# create a very small image to illustrate
i = Image.new('RGB',(2,2))
i.putpixel((0,0), (1,1,1))
i.putpixel((1,0), (2,2,2))
i.putpixel((0,1), (3,3,3))
i.putpixel((1,1), (4,4,4))

# run the algorithm:
def loopThroughPixels():

    im = i
    w,h = im.size

    pixLocation = list(itertools.product(range(h), range(w)))
    pixRGB = list(im.getdata())

    df = pd.DataFrame({'pixLocation': pixLocation, 'pixRGB': pixRGB})

    return df

Result:

[Image: the resulting DataFrame]

Compare the result with the initial putpixel statements: the algorithm says that coordinate (0,1) is (2,2,2), but it should be (3,3,3).
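
The mismatch comes down to ordering, and it can be reproduced without an image at all. The sketch below is a pure-Python model, not PIL itself: the dictionary pixels mirrors the four putpixel calls, and the list flat assumes the row-by-row order in which getdata() returns pixels.

```python
import itertools

w, h = 2, 2

# Pixel values keyed by (x, y), mirroring the putpixel calls above.
pixels = {(0, 0): (1, 1, 1), (1, 0): (2, 2, 2),
          (0, 1): (3, 3, 3), (1, 1): (4, 4, 4)}

# Row-major flattening: all of row y=0 first, then row y=1.
flat = [pixels[(x, y)] for y in range(h) for x in range(w)]

# product(range(h), range(w)) varies its second element fastest, so when
# the pairs are read as (x, y) the labels no longer match the row-major data.
wrong = dict(zip(itertools.product(range(h), range(w)), flat))

# Labels with x varying fastest line up with the flattened order.
right = dict(zip(((x, y) for y in range(h) for x in range(w)), flat))

print(wrong[(0, 1)])  # (2, 2, 2)
print(right[(0, 1)])  # (3, 3, 3)
```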

1 answer:

Answer 0 (score: 3)

A faster approach is:

import pandas as pd
from PIL import ImageColor
from PIL import Image
IMAGE_PATH = 'P:/image_files/'

def loopThroughPixels():
    im = Image.open(IMAGE_PATH + 'small_orange_white.png')
    w,h = im.size

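    # getdata() returns pixels row by row, so the column index (named y here)
    # must vary fastest for each (column, row) label to line up with its pixel.
    pixLocation = [(y, x) for x in range(h) for y in range(w)]
    pixRGB = list(im.getdata())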

    df = pd.DataFrame({'pixLocation': pixLocation, 'pixRGB': pixRGB})

    return df

loopThroughPixels()

On a file described as...

PNG image data, 1399 x 835, 8-bit/color RGBA, non-interlaced

...it took:

In [1]: %timeit loopThroughPixels()
1 loop, best of 3: 324 ms per loop
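
If tuples in object columns are not a hard requirement, a different technique from the one in this answer may be worth trying: convert the image to a NumPy array and keep coordinates and channels as separate numeric columns. This is only a sketch under assumptions. The function name pixels_to_columns and the column names x, y, r, g, b are made up for illustration, and convert('RGB') drops the alpha channel that the RGBA file above would carry.

```python
import numpy as np
import pandas as pd
from PIL import Image


def pixels_to_columns(im):
    """Return one row per pixel with separate x, y and colour columns."""
    arr = np.asarray(im.convert('RGB'))   # shape (h, w, 3), row-major
    h, w, _ = arr.shape
    ys, xs = np.indices((h, w))           # ys[r, c] == r, xs[r, c] == c
    flat = arr.reshape(-1, 3)             # one row of (r, g, b) per pixel
    return pd.DataFrame({
        'x': xs.ravel(), 'y': ys.ravel(),
        'r': flat[:, 0], 'g': flat[:, 1], 'b': flat[:, 2],
    })


# Hypothetical 2x2 image with a single non-black pixel at (x=0, y=1).
demo = Image.new('RGB', (2, 2))
demo.putpixel((0, 1), (3, 3, 3))
table = pixels_to_columns(demo)
print(table)
```

No timing is claimed here; whether this beats the list-based version would need to be measured on the actual image.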

Update (comparison with the option from the comments):

In [14]: w = 1399

In [15]: h = 835

In [16]: [(y, x) for x in range(h) for y in range(w)] == list(zip(list(range(w))*h, sorted(list(range(h))*w)))
Out[16]: True

In [17]: %timeit  [(y, x) for x in range(h) for y in range(w)]
10 loops, best of 3: 107 ms per loop

In [18]: %timeit  list(zip(list(range(w))*h, sorted(list(range(h))*w)))
1 loop, best of 3: 207 ms per loop
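
The same comparison can be rerun outside IPython with the standard-library timeit module. The sketch below wraps the two expressions in functions (by_comprehension and by_zip are names chosen here, not from the original session) and prints whatever timings the local machine produces; the numbers will differ from those quoted above.

```python
import timeit

w, h = 1399, 835


def by_comprehension():
    # Nested comprehension: the first tuple element cycles through range(w).
    return [(y, x) for x in range(h) for y in range(w)]


def by_zip():
    # Option from the comments: repeat one range, sort the other, then zip.
    return list(zip(list(range(w)) * h, sorted(list(range(h)) * w)))


# Both constructions should yield the identical list of coordinates.
same = by_comprehension() == by_zip()

runs = 3
t_comp = timeit.timeit(by_comprehension, number=runs)
t_zip = timeit.timeit(by_zip, number=runs)
print(f"identical output: {same}")
print(f"comprehension: {t_comp / runs:.3f} s per call")
print(f"zip:           {t_zip / runs:.3f} s per call")
```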