Recursion with enumerate: padding/normalizing the sizes of uneven nested lists [of lists] into a numpy array, recursively

Time: 2019-06-22 19:53:47

Tags: python arrays list numpy recursion

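I am trying to write a more robust padding/size normalization for turning lists of lists into a numpy array.

(I know there are many questions about the "list of lists to numpy array" problem, but I have not seen anyone try to handle lists of variable depth. That is why I am using recursion to deal with the unknown depth of the nested lists.)

I have a function that gets the shape required for the np.array.

Now, when filling the array, recursing with enumerate has become a problem, because I have to keep track of the index at every depth.

Put simply:

I need a recursive function that does the following, but to an undetermined depth: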

# fill in the values of the ndarray `mat` with values from l
for row_ix, row in enumerate(l):
    for col_ix, col in enumerate(row):
        for val_ix, val in enumerate(col):
                   # ...
                   # ...
            mat[row_ix, col_ix, val_ix] = l[row_ix][col_ix][val_ix]

Additional details


Here is my MCVE (minimal, complete and verifiable example) of the desired output/functionality:

import numpy as np

# nested shapes of each list ( 2, [2, 4], [ [5,7], [3,7,4,9] ])
# desired shape (max of level) --> (2,4,9)
l = [[[1,2,5,6,7],
       [0,2,5,34,5,6,7]],
      [[5,6,7],
       [0,2,5,7,34,5,7],
       [0,5,6,7],
       [1,2,3,4,5,6,7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return (len(lst),)
    return (len(lst),) + max([nested_list_shape(i) for i in lst])

shape = nested_list_shape(l) # (2,4,9)
mat = np.zeros(shape) # make numpy array of shape (2,4,9)

# fill in the values of the ndarray `mat` with values from l
for row_ix, row in enumerate(l):
    for col_ix, col in enumerate(row):
        for val_ix, val in enumerate(col):
            mat[row_ix, col_ix, val_ix] = l[row_ix][col_ix][val_ix]
print(mat)
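
One caveat about `nested_list_shape` as written: `max()` over a list of tuples compares them lexicographically, so it returns one child's whole shape rather than the maximum at each depth. It happens to give (2, 4, 9) for the `l` above, but not for every input. The sketch below is one possible element-wise variant. The name `nested_list_shape_elementwise` is hypothetical, and the code assumes every branch is nested to the same depth and that no list is empty.

```python
def nested_list_shape_elementwise(lst):
    """Per-depth maximum lengths of an unevenly nested list (hypothetical helper).

    Assumes every branch has the same nesting depth and no list is empty.
    """
    if not isinstance(lst[0], list):
        return (len(lst),)
    child_shapes = [nested_list_shape_elementwise(sub) for sub in lst]
    # zip(*...) groups the lengths found at the same depth; take the max of each group.
    return (len(lst),) + tuple(max(dims) for dims in zip(*child_shapes))


# The first sub-list has more rows (3), but the second one has the longest row (4).
tricky = [[[1], [2], [3]],
          [[1, 2, 3, 4], [5]]]
print(nested_list_shape_elementwise(tricky))  # (2, 3, 4)
```

With a lexicographic `max()`, the same input would yield (2, 3, 1), which is too narrow to hold the row `[1, 2, 3, 4]`.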

Here is what I have tried so far:

import numpy as np

# nested shapes of each list ( 2, [2, 4], [ [5,7], [3,7,4,9] ])
# desired shape (max of level) --> (2,4,9)
l = [[[1,2,5,6,7],
       [0,2,5,34,5,6,7]],
      [[5,6,7],
       [0,2,5,7,34,5,7],
       [0,5,6,7],
       [1,2,3,4,5,6,7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return (len(lst),)
    return (len(lst),) + max([nested_list_shape(i) for i in lst])

# Useful for setting values in nested list
def get_element(lst, idxs):
    # l[x][y][z][t] <==> get_element(l, [x,y,z,t])
    #
    # Given a list (e.g. `l = [[2,3],[5,6,7]]`),
    #     index the elements with a list of indices (one value for each depth)
    #     (e.g. if `idxs = [1,1]` then get_element returns the equivalent of l[1][1])
    if len(idxs) == 1:
        return lst[idxs[0]]
    else:
        return get_element(lst[idxs[0]], idxs[1:])

# ::Problem function::
def fill_mat(lst):
    # Create numpy array for list to fill
    shape = nested_list_shape(lst)
    depth = len(shape)
    x = np.zeros(shape)

    # Use list of indices to keep track of location within the nested enumerations
    ixs = [0] * depth

    # ::PROBLEM::
    # Recursive setting of ndarray values with nested list values
    # d = depth of recursion
    # l = list at that depth of recursion
    # lst = full nested list
    def r(l, ixs, d = 0):
        for ix, item in enumerate(l):
            # Change list of indices to match the position
            ixs[d] = ix
            # if the item is a value, we reach the max depth
            # so here we set the values
            if not isinstance(item, list):
                x[tuple(ixs)] = get_element(lst, ixs)
            else:
                # increase the depth if we see a nested list (but then how to decrease ... :/)
                d += 1
                # return statement should likely go here, and somehow return x
                r(item, ixs, d)
        return x # ?? bad use of recursion
    return r(lst, ixs)

shape = nested_list_shape(l) # (2,4,9)
mat = np.zeros(shape) # make numpy array of shape (2,4,9)

# fill in the values of the ndarray `mat` with values from l
print(fill_mat(l))

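The get_element function makes l[row_ix][col_ix][val_ix] equivalent to get_element(l, ixs) with ixs = [row_ix, col_ix, val_ix], but keeping track of each of these indices is still a problem.

Is anyone familiar with a simpler technique for handling these indices recursively?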
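
For what it is worth, the index bookkeeping in `r` appears to go wrong at `d += 1`: that line changes the depth inside the loop, so the second sibling at any level is written to the wrong slot of `ixs`, and a later call indexes past its end. A minimal sketch of the same enumerate-based recursion follows, with the position passed down as a tuple instead of being tracked in a shared list. The name `fill_from_nested` is hypothetical, the target shape is passed in explicitly, and the leaves are assumed to be plain numbers.

```python
import numpy as np


def fill_from_nested(lst, shape, fill_value=0):
    """Copy a nested list into a padded ndarray of the given shape (hypothetical helper)."""
    out = np.full(shape, fill_value)

    def recurse(sub, index=()):
        for ix, item in enumerate(sub):
            if isinstance(item, list):
                # Each call receives its own index tuple, so nothing has to be
                # "decreased" again when the call returns.
                recurse(item, index + (ix,))
            else:
                out[index + (ix,)] = item

    recurse(lst)
    return out


nested = [[[1, 2], [3]],
          [[4, 5, 6]]]
padded = fill_from_nested(nested, shape=(2, 2, 3))
print(padded.shape)  # (2, 2, 3)
print(padded[0, 1])  # [3 0 0]
```

Because the index is an immutable tuple built fresh for each call, sibling branches cannot overwrite each other's position.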

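1 answer:

Answer 0: (score: 0)

hpaulj pointed me in the right direction, so I took the code directly from here and put together an MCVE that solves this problem.
Hopefully this helps anyone who keeps running into this issue.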

import numpy as np
l = [[[1,2,3,4,5,6,7,8,9,10],
       [0,2,5],
       [6,7,8,9,10],
       [0,2,5],
       [0,2,5,6,7],
       [0,2,5,34,5,6,7]],
      [[2,5],
       [6,7],
       [7,8,9]],
      [[2,5],
       [6,7],
       [7,8,9]],
      [[2,5],
       [6,7],
       [7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return (len(lst),)
    return (len(lst),) + max([nested_list_shape(i) for i in lst])

def iterate_nested_array(array, index=()):
    try:
        for idx, row in enumerate(array):
            yield from iterate_nested_array(row, (*index, idx))
    except TypeError: # final level
        for idx, item in enumerate(array):
            yield (*index, idx), item

def pad(array, fill_value):
    dimensions = nested_list_shape(array) #get_max_shape(array) # used my shape function, as it operated in 1/3 the time
    result = np.full(dimensions, fill_value)
    for index, value in iterate_nested_array(array):
        result[index] = value
    return result

print(pad(l, fill_value = 0))

Output:
[[[ 1  2  3  4  5  6  7  8  9 10]
  [ 0  2  5  0  0  0  0  0  0  0]
  [ 6  7  8  9 10  0  0  0  0  0]
  [ 0  2  5  0  0  0  0  0  0  0]
  [ 0  2  5  6  7  0  0  0  0  0]
  [ 0  2  5 34  5  6  7  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]]
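
A note on how `iterate_nested_array` finds the leaves: calling `enumerate` on a number raises `TypeError`, and that exception is what tells the calling level that it has reached the final level. The sketch below is a possible variant that tests for lists explicitly instead. It also avoids a case where the exception-based version would recurse until it hits the recursion limit, namely string leaves, since iterating a one-character string yields the same string again. The name `iter_leaves` is hypothetical, and only `list` is treated as a container here.

```python
def iter_leaves(nested, index=()):
    """Yield (index_tuple, value) for every leaf of a nested list (hypothetical helper)."""
    for idx, item in enumerate(nested):
        if isinstance(item, list):
            # Descend one level, extending the index with the current position.
            yield from iter_leaves(item, (*index, idx))
        else:
            yield (*index, idx), item


for position, value in iter_leaves([[1, 2], [3]]):
    print(position, value)
# (0, 0) 1
# (0, 1) 2
# (1, 0) 3
```

Each yielded index tuple can be used directly as an ndarray index, in the same way as `result[index] = value` in `pad`.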