Converting a 3D numpy array into a list of 3 indices

Asked: 2019-12-05 10:59:31

Tags: arrays python-3.x numpy

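I have a large 3D data matrix, e.g. 10000X10000X1000. What I want to do is go over every element of the 3D data matrix and write to a file the indices together with the values from 2 different matrices of the same size, one line per element, e.g.: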

i j k val1 val2

My current approach is to run 3 nested loops and print each line as shown below, using 2 small 3D data matrices as an example:

import numpy as np


vv1= np.array([[[1,2,3],[2,3,4],[3,4,5]],
                [[4,5,6],[5,6,7],[6,7,8]],
                [[7,8,9],[8,9,10],[9,10,11]]])

vv2= np.array([[[1,2,3],[2,3,4],[3,4,5]],
                [[4,5,6],[5,6,7],[6,7,8]],
                [[7,8,9],[8,9,10],[9,10,11]]])

for x in range(vv1.shape[0]):
    for y in range(vv1.shape[1]):
        for z in range(vv1.shape[2]):
            print("{:} {:} {:} {:} {:}".format(x,y,z,vv1[x,y,z], vv2[x,y,z]))

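This simple code does the job, but it is slow.

Another approach I thought of is to create one long 1D list in which each entry holds the 3 index values, and then apply the same printing logic. For example, building it with nested loops: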

vv_ind = []

for x in range(vv1.shape[0]):
    for y in range(vv1.shape[1]):
        for z in range(vv1.shape[2]):
            vv_ind.append([x,y,z])

for elem in vv_ind:
    i = tuple(elem)
    print("{:} {:} {:} {:} {:}".format(*elem, vv1[i], vv2[i]))

This gives the desired output.

My questions are:

  1. Is there a more "pythonic" way to create this list of indices?
  2. Regarding the final printing loop:

    for elem in vv_ind:
        i = tuple(elem)
        print("{:} {:} {:} {:} {:}".format(*elem, vv1[i], vv2[i]))
    

    Is there a more efficient way to do it?

Again, the arrays given here are only dummies.

Any help is appreciated.

3 Answers:

Answer 0: (score: 2)

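You can use np.mgrid to generate the indices, and if you don't mind saving everything as the same data type, you can stack the arrays together and save them via np.save or np.savetxt, as the session below does with np.save. Otherwise, you can also use np.ndindex to iterate over the array indices.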

In [1]: import numpy as np

In [2]: a = np.random.randint(0, 255, size=(4, 4, 4))

In [3]: b = np.random.randint(0, 255, size=(4, 4, 4))

In [4]: data = np.stack([x.ravel() for x in np.mgrid[:4, :4, :4]] + [a.ravel(), b.ravel()], axis=1)

In [5]: np.save('/tmp/test.npy', data)

In [6]: data
Out[6]:
array([[  0,   0,   0, 169,  35],
       [  0,   0,   1,  14, 120],
       [  0,   0,   2,  93, 207],
       [  0,   0,   3,  70, 158],
       [  0,   1,   0, 115,  52],
       [  0,   1,   1,  10, 248],
       [  0,   1,   2,   5, 123],
       [  0,   1,   3, 125, 143],
       [  0,   2,   0,  73, 241],
       [  0,   2,   1,  25, 118],
       [  0,   2,   2, 240, 159],
       [  0,   2,   3,  60, 179],
       [  0,   3,   0,  29, 221],
       [  0,   3,   1, 214,  33],
       [  0,   3,   2, 145,  60],
       [  0,   3,   3, 207,  74],
       [  1,   0,   0,   7,  37],
       [  1,   0,   1, 146, 192],
       [  1,   0,   2, 227,  83],
       [  1,   0,   3, 247,  51],
       [  1,   1,   0, 253,  18],
       [  1,   1,   1, 188,   2],
       [  1,   1,   2, 164, 252],
       [  1,   1,   3, 192, 229],
       [  1,   2,   0,  18, 236],
       [  1,   2,   1,  85,  48],
       [  1,   2,   2,  20, 233],
       [  1,   2,   3,  81, 152],
       [  1,   3,   0, 122,  30],
       [  1,   3,   1, 227, 221],
       [  1,   3,   2,  11, 247],
       [  1,   3,   3,  84, 203],
       [  2,   0,   0,   5,  94],
       [  2,   0,   1, 174, 179],
       [  2,   0,   2, 224, 222],
       [  2,   0,   3, 168,  40],
       [  2,   1,   0, 160, 136],
       [  2,   1,   1,  16, 121],
       [  2,   1,   2, 237, 241],
       [  2,   1,   3,  70,  29],
       [  2,   2,   0, 127, 188],
       [  2,   2,   1,  33,  67],
       [  2,   2,   2,   4, 138],
       [  2,   2,   3, 153, 114],
       [  2,   3,   0, 162,   8],
       [  2,   3,   1, 254,  91],
       [  2,   3,   2, 153,  69],
       [  2,   3,   3, 167,  33],
       [  3,   0,   0,  99, 101],
       [  3,   0,   1,  26,   2],
       [  3,   0,   2, 162, 131],
       [  3,   0,   3,  23,  97],
       [  3,   1,   0, 226,  37],
       [  3,   1,   1,   5, 130],
       [  3,   1,   2, 215, 164],
       [  3,   1,   3, 247,  95],
       [  3,   2,   0, 138,  49],
       [  3,   2,   1, 248, 175],
       [  3,   2,   2, 134,  39],
       [  3,   2,   3, 170,  67],
       [  3,   3,   0,   1, 177],
       [  3,   3,   1, 245,  31],
       [  3,   3,   2,  71, 160],
       [  3,   3,   3,  81,   9]])
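
The answer mentions np.savetxt but only demonstrates np.save. A minimal sketch of the text-file variant might look like the following. It assumes both arrays hold integers, uses np.indices in place of np.mgrid, and writes to an in-memory buffer; the small arrays and the names idx, buf and text are illustrative only.

```python
import io
import numpy as np

# Small stand-in arrays; the real data would be far larger.
a = np.arange(8).reshape(2, 2, 2)
b = a * 10

# np.indices(a.shape) has shape (3, 2, 2, 2); flatten each index grid to 1D.
idx = np.indices(a.shape).reshape(a.ndim, -1)

# One row per element: i, j, k, value from a, value from b.
data = np.column_stack([idx.T, a.ravel(), b.ravel()])

# A StringIO buffer stands in for a real file name or file object.
buf = io.StringIO()
np.savetxt(buf, data, fmt="%d")
text = buf.getvalue()
print(text)
```

Passing a file name instead of buf would write the same "i j k val1 val2" lines to disk. Note that data holds five values per element, so for very large inputs it may not fit in memory.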

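Answer 1: (score: 1)

To create the list of indices, you can use the function product from itertools. Any of the following equivalent forms works: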

from itertools import product

product(*3 * [range(3)]) # generator of indices

product(range(3), range(3), range(3))

from itertools import product, repeat

product(*repeat(range(3), 3))

You can then simplify your code:

from itertools import product, repeat

for idx in product(*repeat(range(3), 3)):
    print(*idx, vv1[idx], vv2[idx])

As @a_guest mentioned in the comments, we can use np.ndindex(*vv1.shape) instead of product(*repeat(range(3), 3)).
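
To illustrate that remark, here is a small sketch that derives the ranges from the array shape rather than hard-coding range(3), and checks that product and np.ndindex yield the same index tuples. The non-cubic test arrays and the names via_product, via_ndindex and lines are made up for the example.

```python
from itertools import product
import numpy as np

# Non-cubic stand-in arrays, to show nothing depends on a fixed size of 3.
vv1 = np.arange(24).reshape(2, 3, 4)
vv2 = vv1 + 100

# Both iterables yield index tuples in C (row-major) order.
via_product = list(product(*map(range, vv1.shape)))
via_ndindex = list(np.ndindex(*vv1.shape))

# Format one "i j k val1 val2" line per element.
lines = [
    "{} {} {} {} {}".format(*idx, vv1[idx], vv2[idx])
    for idx in np.ndindex(*vv1.shape)
]
print("\n".join(lines[:3]))
```

This still runs a Python-level loop over every element, so it is tidier than three nested loops but should not be expected to be much faster.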

Answer 2: (score: 1)

If the data are not integers, you can use np.savetxt with a structured array:

import numpy as np
import io

# Data
vv1 = np.array([[[  1,  2,  3], [  2,  3,  4],[  3,  4,  5]],
                [[  4,  5,  6], [  5,  6,  7],[  6,  7,  8]],
                [[  7,  8,  9], [  8,  9, 10],[  9, 10, 11]]], np.float32)
vv2 = np.array([[[  1,  2,  3], [  2,  3,  4],[  3,  4,  5]],
                [[  4,  5,  6], [  5,  6,  7],[  6,  7,  8]],
                [[  7,  8,  9], [  8,  9, 10],[  9, 10, 11]]], np.float32)

xx, yy, zz = np.meshgrid(*map(range, vv1.shape), indexing='ij')
# Structured array of indices and data
a = np.empty(vv1.size, dtype='i,i,i,f,f')  # one record per element (the original len(idx) used an undefined name)
a['f0'] = xx.ravel()
a['f1'] = yy.ravel()
a['f2'] = zz.ravel()
a['f3'] = vv1.ravel()
a['f4'] = vv2.ravel()
# Using StringIO here to show result, normally would use a file object or file name
s = io.StringIO()
np.savetxt(s, a, fmt='%d %d %d %.3f %.3f')
print(s.getvalue())

Output:

0 0 0 1.000 1.000
0 0 1 2.000 2.000
0 0 2 3.000 3.000
0 1 0 2.000 2.000
0 1 1 3.000 3.000
0 1 2 4.000 4.000
0 2 0 3.000 3.000
0 2 1 4.000 4.000
0 2 2 5.000 5.000
1 0 0 4.000 4.000
1 0 1 5.000 5.000
1 0 2 6.000 6.000
1 1 0 5.000 5.000
1 1 1 6.000 6.000
1 1 2 7.000 7.000
1 2 0 6.000 6.000
1 2 1 7.000 7.000
1 2 2 8.000 8.000
2 0 0 7.000 7.000
2 0 1 8.000 8.000
2 0 2 9.000 9.000
2 1 0 8.000 8.000
2 1 1 9.000 9.000
2 1 2 10.000 10.000
2 2 0 9.000 9.000
2 2 1 10.000 10.000
2 2 2 11.000 11.000

np.savetxt really just loops over the data internally, so it is not magically faster. It is probably not worth creating extra large arrays just for this.
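
One way to act on that caveat is to avoid building index arrays for the whole volume and instead write one slab of the first axis at a time. The following is only a sketch under that assumption: write_indexed is a hypothetical helper, not part of NumPy, and the slab-per-iteration chunk size is an arbitrary choice.

```python
import io
import numpy as np

def write_indexed(out, v1, v2):
    """Write 'i j k val1 val2' lines to `out`, one slab of axis 0 at a time."""
    # Index grids for a single 2D slab; they are reused for every i.
    jj, kk = np.indices(v1.shape[1:]).reshape(2, -1)
    # Scratch buffer for one slab; everything is stored as float64 here.
    block = np.empty((jj.size, 5), dtype=np.float64)
    block[:, 1] = jj
    block[:, 2] = kk
    for i in range(v1.shape[0]):
        block[:, 0] = i
        block[:, 3] = v1[i].ravel()
        block[:, 4] = v2[i].ravel()
        # Successive calls append to the same open file object.
        np.savetxt(out, block, fmt="%d %d %d %.3f %.3f")

# Small stand-in data; a StringIO buffer stands in for a real file.
vv1 = np.arange(27, dtype=np.float32).reshape(3, 3, 3)
vv2 = vv1 * 2
buf = io.StringIO()
write_indexed(buf, vv1, vv2)
rows = buf.getvalue().splitlines()
print(rows[:3])
```

Peak extra memory is then proportional to one slab rather than to the full array. For very wide slabs, the same idea could be applied to smaller chunks.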