Question

So I have a list of tuples such as this:

[(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

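I want to search this list for the tuple whose number value equals a given number.

So that if I call search(53), it returns the index value 2.

Is there an easy way to do this?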

Answer 1

[i for i, v in enumerate(L) if v[0] == 53]
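Note that this comprehension returns a list of every matching index, not a single number. A minimal sketch of getting one index out of it, assuming the list from the question is bound to `L` (the names `matches` and `index` are illustrative only):

```python
L = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia")]

# The comprehension collects the index of every tuple whose first item is 53.
matches = [i for i, v in enumerate(L) if v[0] == 53]
print(matches)  # [2]

# Take the first match, or None when nothing matched.
index = matches[0] if matches else None
print(index)  # 2
```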

Answer 2

You can use a list comprehension:

>>> a = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
>>> [x[0] for x in a]
[1, 22, 53, 44]
>>> [x[0] for x in a].index(53)
2

Answer 3

TL;DR

A generator expression is probably the most performant and simplest solution to your problem:

l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

result = next((i for i, v in enumerate(l) if v[0] == 53), None)
# 2

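Explanation

Several answers solve this problem with a list comprehension. While those answers are perfectly correct, they are not optimal. Depending on your use case, a few simple modifications can bring significant benefits.

The main problem I see with using a list comprehension for this use case is that the entire list is processed, even though you only want to find 1 element.

Python provides a simple construct that is ideal here. It is called a generator expression. Here is an example: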

# Our input list, same as before
l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

# Call next on our generator expression.
next((i for i, v in enumerate(l) if v[0] == 53), None)

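We can expect this method to perform basically the same as the list comprehension on our trivial example, but what if we are working with a larger data set? This is where the advantage of the generator approach comes into play. Rather than constructing a new list, we use your existing list as the iterable and call next() to get the first item from our generator.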
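To make the laziness visible, here is a small sketch that counts how many tuples each approach actually reads. The `counted` helper and the `seen_*` counters are hypothetical names introduced only for this illustration.

```python
def counted(iterable, counter):
    """Yield items unchanged while counting how many were consumed."""
    for item in iterable:
        counter[0] += 1
        yield item

l = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia")]

# Generator expression: next() stops pulling items at the first match.
seen_gen = [0]
first = next((i for i, v in enumerate(counted(l, seen_gen)) if v[0] == 22), None)
print(first, seen_gen[0])  # 1 2

# List comprehension: every item is read, even after a match was found.
seen_lc = [0]
all_matches = [i for i, v in enumerate(counted(l, seen_lc)) if v[0] == 22]
print(all_matches, seen_lc[0])  # [1] 4
```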

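Let's look at how these methods perform on some larger data sets. These are big lists, made of 10000000 + 1 elements, with our target at the beginning (best case) or at the end (worst case). We can verify that both lists perform equally with the following list comprehension:

List comprehensions

"Worst case"

worst_case = ([(False, 'F')] * 10000000) + [(True, 'T')]
print([i for i, v in enumerate(worst_case) if v[0] is True])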

# [10000000]
#          2 function calls in 3.885 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    3.885    3.885    3.885    3.885 so_lc.py:1(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

"Best case"

best_case = [(True, 'T')] + ([(False, 'F')] * 10000000)
print([i for i, v in enumerate(best_case) if v[0] is True])

# [0]
#          2 function calls in 3.864 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    3.864    3.864    3.864    3.864 so_lc.py:1(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

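Generator expressions

Here is my hypothesis for generators: we will see generators perform significantly better in the best case, but about the same as the list comprehension in the worst case. The gain comes mostly from the generator being evaluated lazily, meaning it only computes what is required to yield a value.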

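Worst case

worst_case = ([(False, 'F')] * 10000000) + [(True, 'T')]
print(next((i for i, v in enumerate(worst_case) if v[0] == True), None))

# 10000000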
#          5 function calls in 1.733 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         2    1.455    0.727    1.455    0.727 so_lc.py:10(<genexpr>)
#         1    0.278    0.278    1.733    1.733 so_lc.py:9(<module>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
#         1    0.000    0.000    1.455    1.455 {next}

Best case

best_case  = [(True, 'T')] + ([(False, 'F')] * 10000000)
print(next((i for i, v in enumerate(best_case) if v[0] == True), None))

# 0
#          5 function calls in 0.316 seconds
#
#    Ordered by: standard name
#
#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    0.316    0.316    0.316    0.316 so_lc.py:6(<module>)
#         2    0.000    0.000    0.000    0.000 so_lc.py:7(<genexpr>)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
#         1    0.000    0.000    0.000    0.000 {next}

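What?! The best case blows away the list comprehension, but I did not expect our worst case to outperform the list comprehension to such an extent. How is that? Frankly, I can only speculate without further research.

Take all of this with a grain of salt: I have not run any robust profiling here, just some very basic tests. This should be enough to appreciate that a generator expression is more performant for this type of list search.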
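If you want to repeat the comparison yourself, a rough sketch using the standard `timeit` module could look like the following. The function names and the smaller list size `n` are my own assumptions, chosen so that it runs quickly; absolute timings will differ from the profiler output above.

```python
import timeit

def with_listcomp(data):
    # Builds the full list of matching indices, then takes the first one.
    matches = [i for i, v in enumerate(data) if v[0] is True]
    return matches[0] if matches else None

def with_genexpr(data):
    # Stops iterating as soon as the first match is found.
    return next((i for i, v in enumerate(data) if v[0] is True), None)

n = 100_000  # assumption: much smaller than 10000000 to keep the run short
worst_case = [(False, "F")] * n + [(True, "T")]
best_case = [(True, "T")] + [(False, "F")] * n

for name, data in (("worst", worst_case), ("best", best_case)):
    for func in (with_listcomp, with_genexpr):
        seconds = timeit.timeit(lambda: func(data), number=5)
        print(f"{name:5} {func.__name__:13} {seconds:.4f}s")
```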

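Note that this is all basic, built-in Python. We don't need to import anything or use any libraries.

I first saw this technique for searching in the Udacity cs212 course with Peter Norvig.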

Answer 4

Your tuples are basically key-value pairs, which is what a Python dict holds, so:

l = [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]
val = dict(l)[53]

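EDIT: Aha, you say you want the index value of (53, "xuxa"). If that is really what you want, you will have to iterate through the original list, or perhaps build a more complicated dictionary: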

d = dict((n,i) for (i,n) in enumerate(e[0] for e in l))
idx = d[53]

Answer 5

Hmm... well, the simple way that comes to mind is to convert it to a dict

d = dict(thelist)

and access d[53].

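EDIT: Oops, I misread your question the first time. It sounds like you actually want the index at which a given number is stored. In that case, try

d = dict((t[0], i) for i, t in enumerate(thelist))

instead of a plain old dict conversion. Then d[53] would be 2.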
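One thing to watch with a number-to-index dict: if a number occurs more than once, later entries overwrite earlier ones. The sketch below shows both behaviours; the extra `(53, "again")` entry and the names `last_index` and `first_index` are hypothetical, added only to demonstrate the duplicate case.

```python
thelist = [(1, "juca"), (22, "james"), (53, "xuxa"), (44, "delicia"), (53, "again")]

# Plain construction: the last occurrence of each number wins.
last_index = {t[0]: i for i, t in enumerate(thelist)}
print(last_index[53])  # 4

# setdefault only stores a key the first time it is seen,
# so this keeps the first occurrence instead.
first_index = {}
for i, t in enumerate(thelist):
    first_index.setdefault(t[0], i)
print(first_index[53])  # 2
```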

Answer 6

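Supposing the list may be long and the numbers may repeat, consider using the SortedList type from the Python sortedcontainers module. The SortedList type automatically keeps the tuples in order by number and allows fast searching.

For example: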

from sortedcontainers import SortedList
sl = SortedList([(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")])

# Get the index of 53 (its position in the sorted list, not in the original list):

index = sl.bisect((53,))

# With the index, get the tuple:

tup = sl[index]

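Because it does a binary search, this is a lot faster than the list comprehension suggestion. The dictionary suggestion is faster still, but it won't work if there can be duplicate numbers with different strings.

If there are duplicate numbers with different strings, you need one more step: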

end = sl.bisect((53 + 1,))

results = sl[index:end]

By bisecting for 54, we find the end index for our slice. On long lists this is significantly faster than the accepted answer.
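If installing sortedcontainers is not an option, the same binary-search idea can be sketched with the standard library `bisect` module on a list you sort yourself. This is a stand-in and not the SortedList API; the extra `(53, "abc")` entry is hypothetical and is there only to show the duplicate case.

```python
from bisect import bisect_left

# Sort once up front; tuples compare by number first, then by string.
pairs = sorted([(1, "juca"), (22, "james"), (53, "xuxa"),
                (44, "delicia"), (53, "abc")])

# (53,) sorts before every (53, <string>) tuple, so this is the first 53.
start = bisect_left(pairs, (53,))
# (54,) sorts after every (53, <string>) tuple, so this is the slice end.
end = bisect_left(pairs, (53 + 1,))

print(start, end)        # 3 5
print(pairs[start:end])  # [(53, 'abc'), (53, 'xuxa')]
```

As with SortedList, the indices refer to positions in the sorted list, not in the original input order.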

Answer 7

Another way (wrapped in list() so that it also works on Python 3, where zip returns an iterator):

list(zip(*a))[0].index(53)

Answer 8

[k for k, v in l if v == 'delicia']

Here l is the list of tuples: [(1,"juca"),(22,"james"),(53,"xuxa"),(44,"delicia")]

And instead of converting it to a dict, we use a list comprehension.

*Key* in Key,Value in list, where value = **delicia**

How to search a list of tuples in Python
