Question

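My input is a list of tuples, each containing a string and a list of integers. The integers range from 0 to n-1, and each appears at most once: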

l = [('red', [0,2,5]),
     ('yellow', [1,4]),
     ('red', [6])]

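I want to create a list of n strings in which, if an index appears in one of the lists, its value is the corresponding string, and if it does not, a default value is applied, for example white.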

This is the expected output:

result = ['red', 'yellow', 'red', 'white', 'yellow', 'red', 'red']

This is my code. It works fine, but I would like to know whether there is a faster way:

result = ['white'] * n

for t in l:
    for i in t[1]:
        result[i] = t[0]

Edit:

I forgot to mention that n is around 300.

Answer 1

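For every "is there a faster way to do this" question in Python (and, I believe, in most other languages too), the answer is to measure it, and then you will know.

I took the code from the answers proposed so far and timed it: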

import numpy as np
import timeit

n = 7
l = [('red', [0,2,5]),
     ('yellow', [1,4]),
     ('red', [6])]

def OP_approach():
    result = ['white'] * n
    for t in l:
        for i in t[1]:
            result[i] = t[0]
    return result

def yatu_approach():
    d = {j:i[0] for i in l for j in i[1]}
    return [d.get(i, 'white') for i in range(n)]  # range(len(d)+1) matched n here only by coincidence

def blue_note_approach():
    x = np.empty(n, dtype='<U6')  # 'yellow' has 6 characters; '<U5' would truncate it to 'yello'
    x.fill('white')
    for a, b in l:
        x[b] = a
    return x

timeit.timeit(OP_approach, number=10000)
timeit.timeit(yatu_approach, number=10000)
timeit.timeit(blue_note_approach, number=10000)

To my surprise, these are the results on my machine (an arm64 board):

>>> timeit.timeit(OP_approach, number=10000)
0.033418309001717716
>>> timeit.timeit(yatu_approach, number=10000)
0.10994336503790691
>>> timeit.timeit(blue_note_approach, number=10000)
0.3608954470255412

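So, for the sample data given, the simple double loop appears to be faster than the other two options. Keep in mind, though, that as @yatu pointed out, these approaches scale very differently, and which one to choose depends on the expected size of the problem you are solving.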
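
Since the question mentions that n is around 300, here is a rough sketch of how one might rerun the comparison at that size using only the standard library. The helper `make_input`, its `groups` and `fill` parameters, and the colour names are my own assumptions for illustration, not something from the question. Timings depend on the machine, so none are shown.

```python
import random
import timeit

def make_input(n, groups=10, fill=0.8, seed=0):
    """Hypothetical helper: spread a random subset of 0..n-1 across `groups` tuples."""
    rng = random.Random(seed)
    # sample() draws without replacement, so every index appears at most once
    indices = rng.sample(range(n), int(n * fill))
    colours = ['red', 'yellow', 'green']
    return [(colours[g % len(colours)], indices[g::groups]) for g in range(groups)]

def loop_fill(pairs, n, default='white'):
    # The double loop from the question
    result = [default] * n
    for value, idxs in pairs:
        for i in idxs:
            result[i] = value
    return result

def dict_fill(pairs, n, default='white'):
    # Dictionary-based variant: map index -> value, then look up every position
    d = {i: value for value, idxs in pairs for i in idxs}
    return [d.get(i, default) for i in range(n)]

if __name__ == '__main__':
    n = 300
    pairs = make_input(n)
    # Both variants should agree before their timings are compared
    assert loop_fill(pairs, n) == dict_fill(pairs, n)
    for fn in (loop_fill, dict_fill):
        best = min(timeit.repeat(lambda: fn(pairs, n), number=1000, repeat=5))
        print(f'{fn.__name__}: {best:.4f} s per 1000 calls')
```

Taking the minimum over several repeats tends to reduce noise from other processes, which matters when the individual runs are this short.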

Answer 2

Using just numpy:

import numpy as np
x = np.empty(7, dtype='<U6')
x.fill('white')

for a, b in l:
    x[b] = a

where U6 denotes a unicode string of (at most) 6 characters.
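
The width matters because numpy silently truncates strings that do not fit the fixed-width dtype. The following is a small sketch, assuming numpy is installed, that shows the truncation and one way to derive the width from the data rather than hard-coding it. The variable names `narrow` and `width` are mine.

```python
import numpy as np

l = [('red', [0, 2, 5]),
     ('yellow', [1, 4]),
     ('red', [6])]

# A too-narrow dtype truncates without warning: 'yellow' has 6 characters
narrow = np.full(3, 'white', dtype='<U5')
narrow[[0, 2]] = 'yellow'
print(narrow.tolist())  # ['yello', 'white', 'yello']

# Derive the width from the longest string, including the default value
width = max(len(s) for s in ['white'] + [name for name, _ in l])
x = np.full(7, 'white', dtype=f'<U{width}')
for name, idxs in l:
    x[idxs] = name  # index with the whole list of positions at once

result = x.tolist()  # back to a plain Python list of str
print(result)
```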

Answer 3

l = [('red', [0,2,5]),
     ('yellow', [1,4]),
     ('red', [6])]

# length of the result: largest index + 1
n = max(i for _, indexes in l for i in indexes) + 1

# initialize the result list with the default value
result = ['white'] * n

for t in l:
    for i in t[1]:
        result[i] = t[0]

Output:

result = ['red', 'yellow', 'red', 'white', 'yellow', 'red', 'red']
