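How to duplicate the items in a list (leaving the first and last items single) and convert the list into a list of two-item lists

Time: 2017-11-28 23:14:12

Tags: python python-3.x numpy

I have a list = [1,2,3,4,5,1], and I want to duplicate the items between index 0 and -1, so that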

duplicate_list = [1,2,2,3,3,4,4,5,5,1]

Then I want to convert it into a list of two-item lists:

list_lists = [(1,2),(2,3),(3,4),(4,5),(5,1)] 

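How can I do this?

6 answers:

Answer 0 (score: 4)

With numpy, you can column_stack lst[:-1] (the original list with the last element dropped) and lst[1:] (the original list with the first element dropped):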

import numpy as np

lst = [1,2,3,4,5,1]
np.column_stack((lst[:-1], lst[1:]))

#array([[1, 2],
#       [2, 3],
#       [3, 4],
#       [4, 5],
#       [5, 1]])
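
The question asks for a list rather than an array, so the numpy result still needs one conversion step. A minimal sketch, assuming the ndarray returned by column_stack is only an intermediate (the names arr, list_lists and list_tuples are just for illustration):

```python
import numpy as np

lst = [1, 2, 3, 4, 5, 1]

# column_stack returns a 2-D ndarray with one row per adjacent pair
arr = np.column_stack((lst[:-1], lst[1:]))

# tolist() converts the array to nested Python lists of plain ints
list_lists = arr.tolist()

# if tuples are wanted, as in the question's expected output
list_tuples = [tuple(row) for row in list_lists]
```

For a list this short the round trip through numpy is probably not worth it; it mainly makes sense when the data is already an array.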

Or, with plain lists, you can use zip:

list(zip(lst[:-1], lst[1:]))
# [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]

Answer 1 (score: 3)

Use the standard zip():

lst = [1,2,3,4,5,1]

list(zip(lst, lst[1:]))

[(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
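
If the intermediate duplicate_list from the question is also needed, it can be recovered by flattening the pairs. A small sketch using itertools.chain.from_iterable, with variable names taken from the question:

```python
from itertools import chain

lst = [1, 2, 3, 4, 5, 1]

# adjacent pairs; zip stops at the shorter input, so lst needs no trimming
list_lists = list(zip(lst, lst[1:]))

# flattening the pairs yields every inner item twice, the ends once
duplicate_list = list(chain.from_iterable(list_lists))
```

This goes in the opposite direction from the question (pairs first, duplicated list second), which avoids building the duplicated list just to split it up again.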

Answer 2 (score: 2)

You can use a rolling iterator that keeps track of the last item:

def pairs(seq):
    it = iter(seq)
    last = next(it)
    for cur in it:
        yield last, cur
        last = cur

Now you can iterate over it (for pair in pairs(lst)) or convert it to a list:

list(pairs(lst))

The advantage is that it does not copy the list, which furas's zip solution has to do.
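
For what it's worth, newer Python versions ship the same idea as itertools.pairwise (added in 3.10). A sketch with a fallback for older versions, based on the tee recipe from the itertools documentation; unlike pairs() above, which on Python 3.7+ raises RuntimeError for an empty input because next(it) fails inside the generator, both variants here simply return nothing:

```python
from itertools import tee

try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    def pairwise(iterable):
        # two independent iterators over the same input
        a, b = tee(iterable)
        # advance the second one by a single step
        next(b, None)
        return zip(a, b)

lst = [1, 2, 3, 4, 5, 1]
result = list(pairwise(lst))

# an empty input gives an empty result instead of an error
empty = list(pairwise([]))
```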

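Answer 3 (score: 2)

If you start with a 1-D numpy array and only need read access, the stride_tricks solution (see here, for example) is almost unbeatable.

Otherwise, @furas's answer is the most natural and also the fastest (see the benchmarks below). Only for very large lists can a little more speed be gained by using itertools.islice, which avoids copying the original list. Apologies if I have misrepresented anyone's answer in the benchmarks: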

# n = 10
# using list
# stride_tricks         0.01128030 ms
# pp                    0.00127090 ms
# furas                 0.00111770 ms
# psidom                0.00730510 ms
# psidom_pp             0.00649300 ms
# kindall               0.00172260 ms
# script8man       apparently failed
# roadrunner            0.00346820 ms
# rr_pp                 0.00182190 ms
# using array
# stride_tricks         0.00890350 ms
# pp                    0.00270040 ms
# furas                 0.00259900 ms
# psidom                0.00391140 ms
# psidom_pp             0.00438800 ms
# kindall               0.00311760 ms
# script8man       apparently failed
# roadrunner            0.00957060 ms
# rr_pp                 0.00293320 ms
# n = 1000
# using list
# stride_tricks         0.05983050 ms
# pp                    0.03760700 ms
# furas                 0.03222870 ms
# psidom                0.11121020 ms
# psidom_pp             0.05617930 ms
# kindall               0.07354290 ms
# script8man       apparently failed
# roadrunner            0.27846060 ms
# rr_pp                 0.10267700 ms
# using array
# stride_tricks         0.00893700 ms
# pp                    0.08859840 ms
# furas                 0.08421750 ms
# psidom                0.00523970 ms
# psidom_pp             0.00569720 ms
# kindall               0.10453420 ms
# script8man       apparently failed
# roadrunner            0.94786410 ms
# rr_pp                 0.22243330 ms
# n = 1000000
# using list
# stride_tricks        52.28693480 ms
# pp                   70.97792920 ms
# furas                82.29811870 ms
# psidom              145.07117650 ms
# psidom_pp            59.57470910 ms
# kindall             107.59983590 ms
# script8man       apparently failed
# roadrunner          325.66514080 ms
# rr_pp               144.23583440 ms
# using array
# stride_tricks         0.01255540 ms
# pp                  143.99962610 ms
# furas               138.27061310 ms
# psidom                7.30384170 ms
# psidom_pp             7.42180100 ms
# kindall             148.59603090 ms
# script8man       apparently failed
# roadrunner         1030.40049850 ms
# rr_pp               260.31780930 ms

Code:

import numpy as np
from numpy.lib.stride_tricks import as_strided
import itertools as it

import types
from timeit import timeit

def setup_data(n):
    data = {'x': list(range(n))}
    return data

def f_stride_tricks(x):
    x = np.asanyarray(x).ravel()
    return as_strided(x, (x.size-1, 2), 2*x.strides)

def f_pp(x):
    return list(zip(x, it.islice(x, 1, None)))

def f_furas(x):
    return list(zip(x, x[1:]))

def f_psidom(x):
    return np.column_stack((x[:-1], x[1:]))

def f_psidom_pp(x):
    x = np.asanyarray(x).ravel()
    return np.column_stack((x[:-1], x[1:]))

def f_kindall(x):
    def pairs(seq):
        it = iter(seq)
        last = next(it)
        for cur in it:
            yield last, cur
            last = cur
    return list(pairs(x))

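# As originally posted: it pairs every consecutive element of the duplicated
# list, so it returns 2n-3 pairs instead of n-1 and fails the check below.
def f_script8man(x):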
    temp = [x[0]]
    for n in x[1:-1]:
        temp.append(n)
        temp.append(n)
    temp.append(x[-1])
    final = []
    for e, n in enumerate(temp):
        if e != len(temp)-1:
            final.append((temp[e], temp[e+1]))
    return final

def f_roadrunner(x):
    return [tuple(x[i:i+2]) for i in range(0, len(x)-1)]

def f_rr_pp(x):
    return [(x[i], x[i+1]) for i in range(0, len(x)-1)]


for n in (10, 1000, 1000000):
    data = setup_data(n)
    ref = f_psidom(**data)
    print(f'n = {n}')
    for nmpy in range(2):
        print('using {}'.format(['list', 'array'][nmpy]))
        for name, func in list(globals().items()):
            if not name.startswith('f_') or not isinstance(func, types.FunctionType):
                continue
            try:
                assert np.allclose(ref, func(**data))
                print("{:16s}{:16.8f} ms".format(name[2:], timeit(
                    'f(**data)', globals={'f':func, 'data':data}, number=10)*100))
            except Exception:
                print("{:16s} apparently failed".format(name[2:]))
        data['x'] = np.array(data['x'])
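
The islice variant benchmarked above as f_pp, shown on its own. This is only a sketch (pairs_islice is a made-up name), and it assumes x is a sequence that can be iterated more than once:

```python
from itertools import islice

def pairs_islice(x):
    # islice skips the first element lazily, so no copy like x[1:] is built
    return list(zip(x, islice(x, 1, None)))

lst = [1, 2, 3, 4, 5, 1]
result = pairs_islice(lst)
```

Do not pass a one-shot iterator here: zip and islice would then consume the same underlying iterator and the pairs would come out wrong.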

Answer 4 (score: 1)

Use this:

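lst = [1, 2, 3, 4, 5, 1]  # not named `list`, which would shadow the built-in

# duplicate every item except the first and the last
temp = [lst[0]]
for n in lst[1:-1]:
    temp.append(n)
    temp.append(n)
temp.append(lst[-1])

# step by 2 so that the pairs do not overlap
final = []
for i in range(0, len(temp) - 1, 2):
    final.append((temp[i], temp[i + 1]))
# final is what you want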

Answer 5 (score: 1)

You can try using itertools.repeat to get the duplicated items, then slice [1:-1] from that list:

from itertools import repeat

lst = [1,2,3,4,5,1]

duplicate_list = [x for item in lst for x in repeat(item, 2)][1:-1]

>>> print(duplicate_list)
[1, 2, 2, 3, 3, 4, 4, 5, 5, 1]

Then simply use a list comprehension to get the list of tuples:

lst_tuples = [tuple(duplicate_list[i:i+2]) for i in range(0, len(duplicate_list), 2)]

>>> print(lst_tuples)
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
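
As an alternative to slicing by index in the second step, the duplicated list can be chunked into pairs by zipping one iterator with itself. A sketch reusing the names from above (the iterator name it is arbitrary):

```python
lst = [1, 2, 3, 4, 5, 1]

# duplicate every item, then drop the extra copy at each end
duplicate_list = [x for item in lst for x in (item, item)][1:-1]

# zip pulls alternately from the same iterator: items 0 and 1, then 2 and 3, ...
it = iter(duplicate_list)
lst_tuples = list(zip(it, it))
```

This works because duplicate_list always has an even length; with an odd length zip would silently drop the last item.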