Efficient way to split R, G and B values from a file containing RGB values (without NumPy)

Time: 2014-06-23 16:43:43

Tags: python regex python-2.7

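I have a file containing RGB values, like this:

Sample Image Data.txt file

Each line contains space-separated triples (such as 255,255,255).
Each triple consists of three comma-separated integers. These integers correspond to the R ('RED'), G ('GREEN') and B ('BLUE') values. All integers are less than 256.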

255,255,255 250,250,250 254,254,254 250,250,250 
255,255,255 253,253,253 255,255,255 255,255,255 
251,251,251 247,247,247 251,251,251 250,250,250
195,195,195 191,191,191 195,195,195 195,195,195
255,255,255 253,253,253 254,254,254 255,255,255 
255,255,255 254,254,254 239,239,239 240,240,240
238,238,238 254,254,254 255,255,255 255,255,255

The processed output should look like this:
RED = ['255','250','254','250','255','253','255',............,'254','255','255']
GREEN = ['255','250','254','250','255','253','255',............,'254','255','255']
BLUE = ['255','250','254','250','255','253','255',............,'254','255','255']
RGB_Nx3_MATRIX = [['255','255','255'],['250','250','250'],['254','254','254'].....['255','255','255']]

My code works fine.

import re

file_object = open('Image Data.txt','r') 

RED_VECTOR = []         #SEQUENTIALLY STORES ALL 'R' VALUES
GREEN_VECTOR = []       #SEQUENTIALLY STORES ALL 'G' VALUES
BLUE_VECTOR = []        #SEQUENTIALLY STORES ALL 'B' VALUES

RGB_Nx3_MATRIX = []     #Nx3 MATRIX i.e. ['R','G','B'] N times

for line in file_object:
    SPACE_split_LIST = line.split()

    for pixel in SPACE_split_LIST:
        RGB = re.findall(r'\,?(\d+)\,?',pixel)
        RED_VECTOR += [RGB[0]]
        GREEN_VECTOR += [RGB[1]]
        BLUE_VECTOR += [RGB[2]]

        RGB_Nx3_MATRIX += [RGB]




#RESULTS

#print RED_VECTOR
#print GREEN_VECTOR
#print BLUE_VECTOR

#print "------------------"

#print RGB_Nx3_MATRIX

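What am I looking for?

I need a better, more efficient way to do this. I want to avoid using two for loops.

3 answers:

Answer 0: (score: 3)

You can avoid using regular expressions: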

f =open('Image Data.txt','r')                 

R=[]                                 
G=[]                                 
B=[]                                 
for line in f:                       
    for color_set in line.split():       
        r,g,b = color_set.split(',')     
        R+=[r]                       
        G+=[g]                       
        B+=[b]                       

print B

Output

['255', '250', '254', '250', '255', '253', '255', '255', '251', '247', '251', '250', '195', '191', '195', '195', '255', '253', '254', '255', '255', '254', '239', '240', '238', '254', '255', '255']
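
As a side note on the "two for loops" concern, here is a minimal sketch that uses a single explicit loop. It assumes the whole file fits in memory; the function name `split_channels` and the inline `sample` string are hypothetical stand-ins for illustration, not part of the original answer.

```python
def split_channels(text):
    """Split whitespace-separated 'R,G,B' triples into three lists of strings."""
    red, green, blue = [], [], []
    # str.split() with no argument splits on any run of whitespace,
    # including newlines, so no separate per-line loop is needed.
    for triple in text.split():
        r, g, b = triple.split(',')
        red.append(r)
        green.append(g)
        blue.append(b)
    return red, green, blue


# Inline sample standing in for the contents of 'Image Data.txt'.
sample = "255,255,255 250,250,250 \n251,251,251 247,247,247\n"
red, green, blue = split_channels(sample)
```

With the real file, the text would come from reading the file in one go (for example `f.read()` on the opened file). That trades memory for simplicity, so it may not suit very large files.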

Answer 1: (score: 1)

If you are mainly interested in the matrix, you can do this in almost one line:

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [triple.split(',') for line in file_h for triple in line.strip().split()]

This should be fairly efficient. You can also extend it with another loop that converts the values to integers.

with open('Image Data.txt','r') as file_h:
    rgb_matrix = [[int(num) for num in triple.split(',')] for line in file_h for triple in line.strip().split()]

If you really need the separate colors, you can easily get them with:

red = [row[0] for row in rgb_matrix]
green = [row[1] for row in rgb_matrix]
blue = [row[2] for row in rgb_matrix]
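
The three comprehensions above each walk the matrix once. An alternative sketch, assuming a non-empty matrix whose rows all have exactly three items, transposes it in a single step with the built-in `zip`. The small `rgb_matrix` literal below is made-up sample data for illustration only.

```python
# Made-up sample standing in for the matrix built from the file.
rgb_matrix = [['255', '255', '255'], ['250', '251', '252'], ['195', '196', '197']]

# zip(*rows) groups the first, second and third items of every row,
# i.e. it transposes the N x 3 matrix into three length-N tuples.
red, green, blue = (list(channel) for channel in zip(*rgb_matrix))
```

Note that unpacking into three names raises a `ValueError` when the matrix is empty, so an empty file would need a separate check.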

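Answer 2: (score: 0)

Why do you want to avoid two for loops? For loops are not inherently inefficient. However, making a function call (such as re.findall) for every line can become very inefficient.

When working with large files or processing pixels, it is best to stick with simple functions and arithmetic rather than expensive function calls. What you probably want to do is the following: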

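r_vector, g_vector, b_vector = [], [], []
with open('Image Data.txt', 'r') as f:
    for line in f:
        # Keep in mind, every line ends with a '\n' newline char.
        # split() with no argument splits on any whitespace, so that
        # newline (and any trailing space) is dropped automatically.
        for s in line.split():
            r, g, b = s.split(',')
            r_vector.append(r)
            g_vector.append(g)
            b_vector.append(b)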
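
To check the claim about the relative cost of `re.findall` on your own machine, a rough sketch using the standard `timeit` module could look like the following. The helper names `with_regex` and `with_split` and the sample `pixel` string are hypothetical; no timing figures are claimed here, since results depend on the interpreter and hardware.

```python
import re
import timeit

# A single made-up pixel in the same 'R,G,B' format as the file.
pixel = '251,247,250'


def with_regex():
    # Same pattern as in the question.
    return re.findall(r'\,?(\d+)\,?', pixel)


def with_split():
    return pixel.split(',')


# Both approaches should agree before their speed is compared.
same_result = with_regex() == with_split()

# Total seconds for 10000 calls of each helper.
regex_seconds = timeit.timeit(with_regex, number=10000)
split_seconds = timeit.timeit(with_split, number=10000)
```

Comparing `regex_seconds` with `split_seconds` then gives a local measurement rather than a general rule.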

Edit: Thanks to @Ashoka Lella for pointing out that each line has multiple RGB sets.