Question

I have a sentence as a list of words, and I'm trying to extract all bigrams from it (i.e. all consecutive 2-tuples of words). So, if my sentence is

['To', 'sleep', 'perchance', 'to', 'dream']

I want to get

[('To', 'sleep'), ('sleep', 'perchance'), ('perchance', 'to'), ('to', 'dream')]

At the moment, I'm using

zip([sentence[i] for i in range(len(sentence) - 1)], [sentence[i+1] for i in range(len(sentence) - 1)])

and then iterating over that, but I can't help thinking there is a more Pythonic way to do this.

Answer 1

You're on the right track with zip. I'd suggest using list slicing instead of comprehensions.

seq = ['To', 'sleep', 'perchance', 'to', 'dream']
# Pair each word with its successor; list() is needed on Python 3, where zip is lazy
print(list(zip(seq, seq[1:])))

Result:

[('To', 'sleep'), ('sleep', 'perchance'), ('perchance', 'to'), ('to', 'dream')]

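Note that zip's arguments need not be the same length: zip stops as soon as the shortest one is exhausted, so it doesn't matter that seq is longer than seq[1:].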
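To illustrate the truncation behaviour, here is a minimal sketch that wraps the slicing approach in a helper and tries it on short inputs. The name `bigrams` is made up for this example and is not from the original answer.

```python
def bigrams(seq):
    """Return the consecutive pairs of a list (hypothetical helper name)."""
    # zip stops at the shorter argument, seq[1:], so no index arithmetic is needed
    return list(zip(seq, seq[1:]))

words = ['To', 'sleep', 'perchance', 'to', 'dream']
pairs = bigrams(words)
print(pairs)

# Lists with fewer than two items simply produce no pairs
print(bigrams([]))
print(bigrams(['alone']))
```

Since seq[1:] builds a copy of the list, this is best suited to inputs that comfortably fit in memory.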

Answer 2

Here's one I prepared earlier. It comes from the itertools recipes section of the official Python documentation.

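from itertools import tee

def pairwise(iterable):
    """Iterate in pairs

    >>> list(pairwise([0, 1, 2, 3]))
    [(0, 1), (1, 2), (2, 3)]
    >>> tuple(pairwise([])) == tuple(pairwise('x')) == ()
    True
    """
    # Two independent iterators over the same input
    a, b = tee(iterable)
    # Advance the second by one; the None default avoids StopIteration on empty input
    next(b, None)
    # Python 3's built-in zip is lazy, like Python 2's itertools.izip
    return zip(a, b)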
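For what it's worth, Python 3.10 and later ship this recipe as itertools.pairwise. The sketch below assumes you may need to support older interpreters too, so it falls back to the tee-based recipe when the import fails; the variable names `sentence`, `result` and `lowered` are just for illustration.

```python
from itertools import tee

try:
    # Available in the standard library from Python 3.10 onwards
    from itertools import pairwise
except ImportError:
    def pairwise(iterable):
        """Fallback: the tee-based recipe, written for Python 3."""
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

sentence = ['To', 'sleep', 'perchance', 'to', 'dream']
result = list(pairwise(sentence))
print(result)

# Unlike slicing, this also works on generators and other one-shot iterables
lowered = list(pairwise(word.lower() for word in sentence))
print(lowered)
```

The main reason to prefer this over slicing is that it accepts any iterable and does not copy the input.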

Answer 3

Same idea, but using slicing instead of indexing with range.

>>> l = ['To', 'sleep', 'perchance', 'to', 'dream']
>>> list(zip(l, l[1:]))
[('To', 'sleep'), ('sleep', 'perchance'), ('perchance', 'to'), ('to', 'dream')]

Pythonic way to get all consecutive 2-tuples from a list
