How to quickly compare two large lists in Python

Time: 2014-05-19 20:58:05

Tags: python

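I would like to know if there is any way to make this code run faster. It takes 47 seconds, and it has to compare everything with everything, not just the elements at the same position.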

pixels = list(mensagem)
arrayBits = []
for i in pixels:
    for j in tabela:
        if i == j[0]:
            arrayBits.append(j[1])

Here is the whole code, but I think the only reason it takes so long is the part I asked about. Sorry for my English, I'm Portuguese.

import numpy as np

def codifica(mensagem, tabela, filename):
    tamanho = np.shape(mensagem)
    largura = tamanho[0]
    if len(tamanho) == 2:
        altura = tamanho[1]
    else:
        altura = 0

    pixels = list(mensagem)
    arrayBits = []
    for i in pixels:
        for j in tabela:
            if i == j[0]:
                arrayBits.append(j[1])

    arraySemVirgulas = np.array(arrayBits).ravel()  # remove the commas
    arrayJunto = ''.join(arraySemVirgulas)  # join all the bits
    array = list(map(int, arrayJunto))  # put them in a list
    count = 0
    # pad with zeros until the length is a multiple of 8
    while len(array) % 8 != 0:
        array.append(0)
        count += 1

    array = np.array(array)
    arrayNovo = array.reshape(-1, 8)

    decimais = convBi(arrayNovo)  # convBi is defined elsewhere (not shown)
    array_char = ['' for i in range(len(decimais) + 5)]
    j = 5  # slots 0-4 hold the header fields written below
    for i in decimais:
        a = chr(i)
        array_char[j] = a
        j += 1

    array_char[0] = str(count)
    array_char[1] = str(len(str(largura)))
    array_char[2] = str(len(str(altura)))
    array_char[3] = str(largura)
    array_char[4] = str(altura)

    ficheiro = open(filename, "wb")
    for i in array_char:
        ficheiro.write(i)
    ficheiro.close()

2 answers:

Answer 0: (score: 1)

It will probably be faster if you replace the iteration

for i in pixels:
    for j in tabela:
        if i == j[0]:
            arrayBits.append(j[1])

with a dictionary lookup:

tabela_dict = dict(tabela)
for i in pixels:
    if i in tabela_dict :
        arrayBits.append(tabela_dict[i])
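
A minimal, self-contained sketch of the same idea on made-up data (the `tabela` and `pixels` values below are assumptions, not the asker's data). It assumes each entry of `tabela` is a `(value, bits)` pair whose value is hashable and appears only once in the table:

```python
# Hypothetical sample data: each table entry is (pixel value, bit string).
tabela = [(0, "00"), (1, "01"), (2, "10"), (3, "11")]
pixels = [2, 0, 3, 2, 7]  # 7 has no entry in the table

# Build the lookup once; this costs one pass over tabela.
tabela_dict = dict(tabela)

# One average-case O(1) lookup per pixel instead of a full scan of tabela.
# Pixels without a table entry are skipped, as in the nested loops.
arrayBits = [tabela_dict[p] for p in pixels if p in tabela_dict]
```

The total work drops from roughly len(pixels) * len(tabela) comparisons to about len(pixels) + len(tabela) steps. If a value could appear more than once in `tabela`, `dict(tabela)` keeps only the last entry, whereas the nested loops append every match.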

Answer 1: (score: 0)

Using containers based on set() or dict() makes this linear in time instead of O(n^2). This should speed things up:

EDIT: a simpler and probably faster version:

import itertools

# set of 'keys' that exist in both
keys = set(pixels) & set(el[0] for el in tabela)
# and generator comprehension with fast lookup 
elements = (element[1] for element in tabela 
        if element[0] in keys)
# this will flatten inner lists and create a list with result:
result = list(itertools.chain.from_iterable(elements))

This makes only two passes over tabela, so the time complexity is O(n).
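
A small sketch on made-up data (the sample values are assumptions) showing what this version produces. The result follows the order of `tabela` and contains each matching entry once, no matter how often or where the value occurs in `pixels`:

```python
import itertools

# Hypothetical sample data: each table entry is (pixel value, list of bits).
tabela = [(0, [0, 0]), (1, [0, 1]), (2, [1, 0])]
pixels = [2, 0, 2]

# Keys present both in pixels and in the first column of tabela.
keys = set(pixels) & set(el[0] for el in tabela)
# Walk tabela once and keep the values whose key was seen in pixels.
elements = (element[1] for element in tabela if element[0] in keys)
# Flatten the inner lists into a single list.
result = list(itertools.chain.from_iterable(elements))
# result == [0, 0, 1, 0]: the entry for 0, then the entry for 2
```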

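If the values in pixels are not unique and the corresponding value from tabela should be repeated for every occurrence of a pixel, use this instead: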

import itertools

# set of 'keys' that exist in both
keys = set(pixels) & set(el[0] for el in tabela)
# dict for fast lookup of the value stored for each key in tabela
lookup = dict((el[0], el[1]) for el in tabela)
# value for a key, or an empty list when the key is not in tabela
elements = lambda key: lookup[key] if key in keys else []
# this will flatten inner lists and create a list with result:
flatten = itertools.chain.from_iterable
result = list(flatten(elements(pixel) for pixel in pixels))
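
A sketch of the per-pixel variant on made-up data, assuming `tabela` is a list of `(value, bits)` pairs. The name `lookup` and the sample values are assumptions for illustration. Because the lookup is done once per pixel, the output follows the order of `pixels` and repeated pixels repeat their bits:

```python
import itertools

# Hypothetical sample data: each table entry is (pixel value, list of bits).
tabela = [(0, [0, 0]), (1, [0, 1]), (2, [1, 0])]
pixels = [2, 0, 2, 9]  # 2 is repeated; 9 has no entry in the table

# Map each pixel value to its bits; built in one pass over tabela.
lookup = dict(tabela)

# One lookup per pixel; missing pixels contribute an empty list.
per_pixel = (lookup.get(pixel, []) for pixel in pixels)
result = list(itertools.chain.from_iterable(per_pixel))
# result == [1, 0, 0, 0, 1, 0]
```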