The most "pythonic" way to populate a nested list of indices from a flat list

Time: 2019-02-05 09:46:47

Tags: python-3.x

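I have a situation where I generate many nested template lists containing n organised elements, where each number in a template is an index into a flat list of n values: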

S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]

For each of these templates, the numbers inside it are indices into a flat list such as this one:

A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]

to get:

B = [[["c","e"],["a","d"]], [["b","f"],["g","h"]],[["k","j"],["i","l"],["n","m"]]]

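Considering the following, how can I populate the structure S with the values from list A to get B?

- The values in list A can change, but their number cannot.
- The template can have a nested structure of any depth, but it uses each index of A only once, as in the example above.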
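The two constraints above can be checked mechanically before a template is filled. This is a minimal sketch, assuming templates contain only lists and integer indices; the helper names `iter_indices` and `check_template` are made up for illustration and are not part of the question.

```python
def iter_indices(template):
    """Yield every index stored in a nested list template, depth-first."""
    for entry in template:
        if isinstance(entry, list):
            yield from iter_indices(entry)
        else:
            yield entry


def check_template(template, n):
    """Return True if every index lies in range(n) and none is used twice."""
    seen = set()
    for index in iter_indices(template):
        if not 0 <= index < n or index in seen:
            return False
        seen.add(index)
    return True


S = [[[2, 4], [0, 3]], [[1, 5], [6, 7]], [[10, 9], [8, 11], [13, 12]]]
print(check_template(S, 14))         # True
print(check_template([[0, 0]], 14))  # False: index 0 is used twice
```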

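I did this with the very ugly append-based unflatten function below, which works as long as the template is no more than 3 levels deep. Is there a better way to do it, perhaps with generators and yield, so that it works for templates of arbitrary depth?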

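Another solution I thought of but could not implement is to build the template as a string with generated variable names and then use eval() to assign new values to those variables.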

import collections.abc

def unflatten(item, template):
    # works up to 3 levels of nested lists
    # (collections.Iterable was removed in Python 3.10; use collections.abc.Iterable)
    tree = []
    for el in template:
        if isinstance(el, collections.abc.Iterable) and not isinstance(el, str):
            tree.append([])
            for el2 in el:
                if isinstance(el2, collections.abc.Iterable) and not isinstance(el2, str):
                    tree[-1].append([])
                    for el3 in el2:
                        if isinstance(el3, collections.abc.Iterable) and not isinstance(el3, str):
                            # a 4th level is not handled: only an empty list is appended
                            tree[-1][-1].append([])
                        else:
                            tree[-1][-1].append(item[el3])
                else:
                    tree[-1].append(item[el2])
        else:
            tree.append(item[el])
    return tree

I need a better solution, one that still works when the above is done recursively and n is in the hundreds of organised elements.

UPDATE 1

The timing function I am using is this one:

import time
from functools import wraps

def timethis(func):
    '''
    Decorator that reports the execution time.
    '''
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end-start)
        return result
    return wrapper

I wrapped the function suggested by @DocDrivin inside another function so that I can call it with a one-liner. Below it is my ugly append function.

import copy

@timethis
def unflatten(A, S):
    for i in range(100000):

        # making sure that you don't modify S
        rebuilt_list = copy.deepcopy(S)

        # create the mapping dict
        adict = {key: val for key, val in enumerate(A)}

        # the recursive worker function
        def worker(alist):

            for idx, entry in enumerate(alist):
                if isinstance(entry, list):
                    worker(entry)
                else:
                    # might be a good idea to catch key errors here
                    alist[idx] = adict[entry]

        #build list
        worker(rebuilt_list)

    return rebuilt_list

import collections.abc

@timethis
def unflatten2(A, S):
    for _ in range(100000):
        # up to level 3
        temp_tree = []
        for el in S:
            if isinstance(el, collections.abc.Iterable) and not isinstance(el, str):
                temp_tree.append([])
                for el2 in el:
                    if isinstance(el2, collections.abc.Iterable) and not isinstance(el2, str):
                        temp_tree[-1].append([])
                        for el3 in el2:
                            if isinstance(el3, collections.abc.Iterable) and not isinstance(el3, str):
                                temp_tree[-1][-1].append([])
                            else:
                                temp_tree[-1][-1].append(A[el3])
                    else:
                        temp_tree[-1].append(A[el2])
            else:
                temp_tree.append(A[el])
    # return only after all iterations; a return inside the loop
    # would end the benchmark after a single pass
    return temp_tree

The recursive method has much nicer syntax; however, it is considerably slower than the append method.

2 Answers:

Answer 0: (score: 3)

You can do this with recursion:

import copy

S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]

A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]

# making sure that you don't modify S
B = copy.deepcopy(S)

# create the mapping dict
adict = {key: val for key, val in enumerate(A)}

# the recursive worker function
def worker(alist):

    for idx, entry in enumerate(alist):
        if isinstance(entry, list):
            worker(entry)
        else:
            # might be a good idea to catch key errors here
            alist[idx] = adict[entry]

worker(B)
print(B)

This produces the following output for B:

[[['c', 'e'], ['a', 'd']], [['b', 'f'], ['g', 'h']], [['k', 'j'], ['i', 'l'], ['n', 'm']]]

I did not check whether each list entry can actually be mapped through the dict, so you may want to add a check (the spot is marked in the code).
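One possible form of that check is sketched below. It keeps the in-place approach of the answer but passes the mapping in as a second argument and turns a missing key into a ValueError; both choices are assumptions of this sketch, not part of the original answer.

```python
import copy


def worker(alist, adict):
    """Replace every index in alist, in place, with adict[index]."""
    for idx, entry in enumerate(alist):
        if isinstance(entry, list):
            worker(entry, adict)
        else:
            try:
                alist[idx] = adict[entry]
            except KeyError:
                # the template refers to an index that the flat list does not have
                raise ValueError(f"template index {entry!r} has no value") from None


S = [[[2, 4], [0, 3]], [[1, 5], [6, 7]], [[10, 9], [8, 11], [13, 12]]]
A = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n"]
adict = dict(enumerate(A))

B = copy.deepcopy(S)  # leave the template S untouched
worker(B, adict)
print(B)
```

Note that a template which fails part way through is left partially filled, which is one more reason to work on a copy.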

Small edit: I just saw that your desired output (probably) has a typo. Index 3 maps to "d", not to "c". You might want to edit that.

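Big edit: To show that my proposal is not as catastrophic as it seems at first glance, I decided to add some code to test its runtime. Check this out: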

import timeit

setup1 = '''
import copy

S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
adict = {key: val for key, val in enumerate(A)}

# the recursive worker function

def worker(olist):

    alist = copy.deepcopy(olist)

    for idx, entry in enumerate(alist):
        if isinstance(entry, list):
            alist[idx] = worker(entry)  # store the rebuilt sub-list; its return value was otherwise discarded
        else:
            alist[idx] = adict[entry]

    return alist
'''

code1 = '''
worker(S)
'''

setup2 = '''
import collections.abc

S =[[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]

def unflatten2(A, S):
    # up to level 3
    # (collections.Iterable was removed in Python 3.10; use collections.abc.Iterable)
    temp_tree = []
    for el in S:
        if isinstance(el, collections.abc.Iterable) and not isinstance(el, str):
            temp_tree.append([])
            for el2 in el:
                if isinstance(el2, collections.abc.Iterable) and not isinstance(el2, str):
                    temp_tree[-1].append([])
                    for el3 in el2:
                        if isinstance(el3, collections.abc.Iterable) and not isinstance(el3, str):
                            temp_tree[-1][-1].append([])
                        else:
                            temp_tree[-1][-1].append(A[el3])
                else:
                    temp_tree[-1].append(A[el2])
        else:
            temp_tree.append(A[el])
    return temp_tree
'''

code2 = '''
unflatten2(A, S)
'''

print(f'Recursive func: { [i/10000 for i in timeit.repeat(setup = setup1, stmt = code1, repeat = 3, number = 10000)] }')
print(f'Original func: { [i/10000 for i in timeit.repeat(setup = setup2, stmt = code2, repeat = 3, number = 10000)] }')

I am using the timeit module for my tests. When you run this snippet, you will get output similar to this:

Recursive func: [8.74395573977381e-05, 7.868373290111777e-05, 7.9051584698027e-05]
Original func: [3.548609419958666e-05, 3.537480780214537e-05, 3.501355930056888e-05]

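These are the average times over 10000 iterations, and I ran the measurement 3 times to show the fluctuation. As you can see, in this particular case my function is 2.22 to 2.50 times slower than the original, which is still acceptable. The slowdown is probably due to the use of deepcopy.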

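Your test has some flaws; for example, you redefine the mapping dict on every iteration. You would not normally do that. Instead, you would define it once and then pass it to the function as a parameter.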
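Both points (the deepcopy cost and the repeated dict construction) can be avoided by building a new list on the way back up the recursion and passing the values in as an argument. The following is a minimal sketch with a hypothetical function name `unflatten`; it indexes the flat list directly instead of going through a dict, and it has not been benchmarked here.

```python
def unflatten(template, values):
    """Return a new nested list; the template itself is never modified."""
    return [
        unflatten(entry, values) if isinstance(entry, list) else values[entry]
        for entry in template
    ]


S = [[[2, 4], [0, 3]], [[1, 5], [6, 7]], [[10, 9], [8, 11], [13, 12]]]
A = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n"]

B = unflatten(S, A)
print(B)
```

Because nothing is written back into the template, no copy is needed, and the same S can be reused with a different A.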

Answer 1: (score: 2)

You can use a recursive generator:

A = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n"]
S = [[[2,4],[0,3]], [[1,5],[6,7]],[[10,9],[8,11],[13,12]]]
A = {k: v for k, v in enumerate(A)}

def worker(alist):
    for e in alist:
        if isinstance(e, list):
            yield list(worker(e))
        else:
            yield A[e]

def do(alist):
    return list(worker(alist))

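This is also a recursive approach; it simply avoids individual item assignment and lets list do the work by reading the values from your generator. If you like, you can try it online, with setup1 and setup2 copied from @DocDriven's answer (but I recommend not exaggerating with the iteration counts there; run it locally if you want to play around).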
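To illustrate that the generator version is not tied to three levels, here is a small sketch that runs it on a five-level template. It repeats the two functions above but passes the values in as a parameter instead of rebinding the global A, which is an assumption of this sketch.

```python
def worker(alist, values):
    """Yield the filled-in entries of one nesting level."""
    for e in alist:
        if isinstance(e, list):
            # recurse, and let list() consume the inner generator
            yield list(worker(e, values))
        else:
            yield values[e]


def do(alist, values):
    return list(worker(alist, values))


deep = [0, [1, [2, [3, [4]]]]]
print(do(deep, ["a", "b", "c", "d", "e"]))
# ['a', ['b', ['c', ['d', ['e']]]]]
```

The depth is still bounded by Python's recursion limit, which should only matter for extremely deep templates.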

Here are some example timings:

My result: [0.11194685893133283, 0.11086182110011578, 0.11299032904207706]
result1: [1.0810202199500054, 1.046933784848079, 0.9381260159425437]
result2: [0.23467918601818383, 0.236218704842031, 0.22498539905063808]