Python nested array indexing - unexpected behaviour

Time: 2018-02-13 10:31:30

Tags: python arrays numpy memory-management indexing

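Suppose we have an input array that contains some (but not all) nan values, and we want to write its non-nan values into a nan-initialized output array. After writing the non-nan data into the output array, the nan values are still there, and I do not understand why at all: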

# minimal example just for testing purposes

import numpy as np

# fix state of seed
np.random.seed(1000)
# create input array and nan-filled output array
a = np.random.rand(6,3,5)
b = np.zeros((6,3,5)) * np.nan

# index must be a tuple; recent NumPy no longer accepts a list here
x = (np.arange(6), 1, 2)
# select data in one dimension with others fixed
y_temp = a[x]
# set arbitrary index to nan
y_temp[1] = np.nan
ind_valid = ~np.isnan(y_temp)
# select non-nan values
y = y_temp[ind_valid]

# write input to output at corresponding indices
b[x][ind_valid] = y
print(b[x][ind_valid])
# surprise, surprise :(
# [ nan  nan  nan  nan  nan  nan]

# workaround (that will of course cost computation time, even if not much)
c = np.zeros(len(y_temp)) * np.nan
c[ind_valid] = y
b[x] = c
print(b[x][ind_valid])
# and this is what we want to have
# [ 0.39719446         nan  0.39820488  0.68190824  0.86534558  0.69910395]

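I assumed that array b reserves some block in memory and that indexing it with x lets it "know" those indices. Then, when only some of them are selected with ind_valid, it should still know them and be able to write to exactly those memory addresses. I don't know, but maybe it is ... similar to "python nested list unexpected behaviour"? Please explain, and perhaps also provide a nice solution instead of the suggested workaround! Thanks!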
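A likely explanation, sketched here rather than stated as a definitive answer: NumPy documents that indexing with an integer array (advanced indexing) returns a copy, not a view. If that is what happens in `b[x]`, then `b[x][ind_valid] = y` writes into a temporary copy that is thrown away, and `b` never changes. The sketch below checks this with `np.shares_memory` and then folds the mask into the index so that there is a single assignment on `b`. The names `rows`, `shares` and `still_all_nan` are illustrative and not part of the original code.

```python
import numpy as np

np.random.seed(1000)
a = np.random.rand(6, 3, 5)
b = np.full((6, 3, 5), np.nan)      # nan-initialized output array

x = (np.arange(6), 1, 2)            # tuple index containing an integer array
y_temp = a[x]                       # advanced indexing: a copy of shape (6,)
y_temp[1] = np.nan                  # modifies the copy only, not a
ind_valid = ~np.isnan(y_temp)
y = y_temp[ind_valid]

# b[x] is a freshly allocated array, so it does not share memory with b
shares = np.shares_memory(b[x], b)

# chained form: the assignment lands in the temporary copy, b stays all-nan
b[x][ind_valid] = y
still_all_nan = bool(np.isnan(b).all())

# single-step form: apply the mask to the index array first,
# then assign once, directly on b
rows = x[0][ind_valid]
b[rows, x[1], x[2]] = y

print(shares, still_all_nan)
print(b[x])
```

In this particular example the first index selects every row, so basic slicing would also work: `b[:, 1, 2]` is a view, and `b[:, 1, 2][ind_valid] = y` writes through to `b`. The single-step form above is the more general one, since it also covers an arbitrary integer array in the first position.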

0 answers:

No answers