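I am trying to do more robust padding/size normalization when converting lists of lists into numpy array format.
(I know there are many questions about "list of lists to numpy array", but I have not seen anyone try to handle lists of variable depth. So I am using recursion to handle the unknown depth of the nested lists.)
I have a function that gets the shape required for the np.array.
Now, when filling the array, recursing with enumerate has become a problem, because the index at each depth has to be tracked.
Put simply:
I need a recursive function that does the following, down to an undetermined depth: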
# fill in the values of the ndarray `mat` with values from l
for row_ix, row in enumerate(l):
    for col_ix, col in enumerate(row):
        for val_ix, val in enumerate(col):
            # ...
                # ...
                    mat[row_ix, col_ix, val_ix] = l[row_ix][col_ix][val_ix]
Other details
Here is my MCVE (minimal, complete and verifiable example) of the desired output/functionality:
import numpy as np

# nested shapes of each list ( 2, [2, 4], [ [5,7], [3,7,4,9] ])
# desired shape (max of level) --> (2,4,9)
l = [[[1,2,5,6,7],
      [0,2,5,34,5,6,7]],
     [[5,6,7],
      [0,2,5,7,34,5,7],
      [0,5,6,7],
      [1,2,3,4,5,6,7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return tuple([len(lst)])
    return tuple([len(lst)]) + max([nested_list_shape(i) for i in lst])

shape = nested_list_shape(l) # (2,4,9)
mat = np.zeros(shape) # make numpy array of shape (2,4,9)

# fill in the values of the ndarray `mat` with values from l
for row_ix, row in enumerate(l):
    for col_ix, col in enumerate(row):
        for val_ix, val in enumerate(col):
            mat[row_ix, col_ix, val_ix] = l[row_ix][col_ix][val_ix]
print(mat)
Here is what I have tried so far:
import numpy as np

# nested shapes of each list ( 2, [2, 4], [ [5,7], [3,7,4,9] ])
# desired shape (max of level) --> (2,4,9)
l = [[[1,2,5,6,7],
      [0,2,5,34,5,6,7]],
     [[5,6,7],
      [0,2,5,7,34,5,7],
      [0,5,6,7],
      [1,2,3,4,5,6,7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return tuple([len(lst)])
    return tuple([len(lst)]) + max([nested_list_shape(i) for i in lst])

# Useful for setting values in nested list
def get_element(lst, idxs):
    # l[x][y][z][t] <==> get_element(l, [x,y,z,t])
    #
    # Given a list (e.g. `l = [[2,3],[5,6,7]]`),
    # index the elements with a list of indices (one value for each depth)
    # (e.g. if `idxs = [1,1]` then get_element returns the equivalent of l[1][1])
    if len(idxs) == 1:
        return lst[idxs[0]]
    else:
        return get_element(lst[idxs[0]], idxs[1:])

# ::Problem function::
def fill_mat(lst):
    # Create numpy array for list to fill
    shape = nested_list_shape(lst)
    depth = len(shape)
    x = np.zeros(shape)

    # Use list of indices to keep track of location within the nested enumerations
    ixs = [0] * depth

    # ::PROBLEM::
    # Recursive setting of ndarray values with nested list values
    # d = depth of recursion
    # l = list at that depth of recursion
    # lst = full nested list
    def r(l, ixs, d = 0):
        for ix, item in enumerate(l):
            # Change list of indices to match the position
            ixs[d] = ix
            # if the item is a value, we reach the max depth
            # so here we set the values
            if not isinstance(item, list):
                x[tuple(ixs)] = get_element(lst, ixs)
            else:
                # increase the depth if we see a nested list (but then how to decrease ... :/)
                d += 1
                # return statement should likely go here, and somehow return x
                r(item, ixs, d)
        return x # ?? bad use of recursion
    return r(lst, ixs)

shape = nested_list_shape(l) # (2,4,9)
mat = np.zeros(shape) # make numpy array of shape (2,4,9)

# fill in the values of the ndarray `mat` with values from l
print(fill_mat(l))
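The get_element function makes l[row_ix][col_ix][val_ix] equivalent to indexing with ixs = [row_ix, col_ix, val_ix], but keeping track of each index is still a problem.
Is anyone familiar with a simpler technique for handling these indices recursively?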
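One possible way around the "how to decrease the depth" problem in r above is to avoid shared mutable state altogether: instead of updating ixs and d in place, each recursive call can receive its own index tuple, so nothing has to be undone when the call returns. The following is only a minimal sketch under that assumption. The names fill_from_nested and recurse are hypothetical, the target shape is passed in by hand, and all leaf values are assumed to sit at the same depth.

```python
import numpy as np

def fill_from_nested(nested, shape, fill_value=0):
    """Copy a ragged nested list into a padded array of the given shape."""
    out = np.full(shape, fill_value)

    def recurse(items, index=()):
        for i, item in enumerate(items):
            if isinstance(item, list):
                # Go one level deeper; the caller's index tuple is untouched,
                # so there is no depth counter to decrement afterwards.
                recurse(item, index + (i,))
            else:
                # Leaf value: index + (i,) is its full position in the array.
                out[index + (i,)] = item

    recurse(nested)
    return out

example = [[[1, 2], [3]], [[4, 5, 6]]]
print(fill_from_nested(example, (2, 2, 3)))
```

Because tuples are immutable, index + (i,) builds a new tuple for each call, which is what removes the need to track d separately.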
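Answer 0 (score: 0)
hpaulj pointed me in the right direction, so I took the code directly from here and am providing an MCVE that solves this problem.
Hopefully this helps anyone who keeps running into this issue.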
import numpy as np

l = [[[1,2,3,4,5,6,7,8,9,10],
      [0,2,5],
      [6,7,8,9,10],
      [0,2,5],
      [0,2,5,6,7],
      [0,2,5,34,5,6,7]],
     [[2,5],
      [6,7],
      [7,8,9]],
     [[2,5],
      [6,7],
      [7,8,9]],
     [[2,5],
      [6,7],
      [7,8,9]]]

def nested_list_shape(lst):
    # Provides the *maximum* length of each list at each depth of the nested list.
    # (Designed to take uneven nested lists)
    if not isinstance(lst[0], list):
        return tuple([len(lst)])
    return tuple([len(lst)]) + max([nested_list_shape(i) for i in lst])

def iterate_nested_array(array, index=()):
    try:
        for idx, row in enumerate(array):
            yield from iterate_nested_array(row, (*index, idx))
    except TypeError: # final level
        for idx, item in enumerate(array):
            yield (*index, idx), item

def pad(array, fill_value):
    dimensions = nested_list_shape(array) #get_max_shape(array) # used my shape function, as it operated in 1/3 the time
    result = np.full(dimensions, fill_value)
    for index, value in iterate_nested_array(array):
        result[index] = value
    return result

print(pad(l, fill_value = 0))
output:
[[[ 1  2  3  4  5  6  7  8  9 10]
  [ 0  2  5  0  0  0  0  0  0  0]
  [ 6  7  8  9 10  0  0  0  0  0]
  [ 0  2  5  0  0  0  0  0  0  0]
  [ 0  2  5  6  7  0  0  0  0  0]
  [ 0  2  5 34  5  6  7  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]

 [[ 2  5  0  0  0  0  0  0  0  0]
  [ 6  7  0  0  0  0  0  0  0  0]
  [ 7  8  9  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]
  [ 0  0  0  0  0  0  0  0  0  0]]]
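One caveat about nested_list_shape as used above: max() compares the child shape tuples lexicographically rather than per dimension. That happens to give the right answer for the lists in this post, but for an input such as [[[1,2,3]], [[1],[2]]] it returns (2, 2, 1) instead of (2, 2, 3), so pad would then index out of bounds. The sketch below shows one way to take the maximum at each depth separately. The name max_nested_shape is hypothetical, and it has not been timed against either shape function mentioned in the comment inside pad.

```python
from itertools import zip_longest

def max_nested_shape(lst):
    """Element-wise maximum length at each depth of a ragged nested list."""
    if not isinstance(lst, list):
        return ()  # a leaf value contributes no dimensions
    child_shapes = [max_nested_shape(item) for item in lst]
    # zip_longest groups the child lengths by depth; shallower children are
    # padded with 0 so they never win the max at deeper levels.
    per_depth = zip_longest(*child_shapes, fillvalue=0)
    return (len(lst),) + tuple(max(dims) for dims in per_depth)

ragged = [[[1, 2, 3]], [[1], [2]]]
print(max_nested_shape(ragged))
```

If this is swapped in for nested_list_shape inside pad, the rest of the code should not need to change, since it only relies on receiving a tuple of dimensions.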