Confusion with fancy indexing (for non-fancy people)

Time: 2016-08-31 14:28:52

Tags: arrays python-2.7 numpy indexing slice

Let's assume a multi-dimensional array

import numpy as np
foo = np.random.rand(102,43,35,51)

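I know that the last two dimensions represent a 2D space (35, 51), and I would like to index into it. Say I want rows 8 through 30 of column 0. From my understanding of indexing, I should call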

foo[0][0][8::30][0]

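Knowing my data (which, unlike the random data used here, is meaningful), this is not what I expected.

I could try this, which does work but looks ridiculous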

foo[0][0][[8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30],0]

Now, from what I can find in this documentation, I can use something like:

foo[0][0][[8,30],0]

which only gives the values at rows 8 and 30, while this:

foo[0][0][[8::30],0]

gives an error

File "<ipython-input-568-cc49fe1424d1>", line 1
foo[0][0][[8::30],0]
                ^
SyntaxError: invalid syntax

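I don't understand why the :: argument cannot be passed here. So what is the way to indicate a range in the indexing syntax?

So I guess my overall question is what the proper pythonic equivalent of this syntax would be: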

foo[0][0][[8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30],0]

1 Answer:

Answer 0: (score: 3)

Instead of

foo[0][0][8::30][0]

try (the stop value of a slice is exclusive, so write 8:31 if row 30 should be included)

foo[0, 0, 8:30, 0]

The foo[0][0] part is the same as foo[0, 0, :, :], selecting a 2d array (35 x 51). But foo[0][0][8::30] selects only a strided subset of those rows, not a range.

Consider what happens when I use 0::30 on a 2d array:

In [490]: np.zeros((35,51))[0::30].shape
Out[490]: (2, 51)

In [491]: np.arange(35)[0::30]
Out[491]: array([ 0, 30])

The 30 is the step, not the stop value of the slice.

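The last [0] then picks the first of those rows. With 0::30 the end result is the same as foo[0,0,0,:]; with the 8::30 from the question, only row 8 fits within the 35 rows, so the result is the same as foo[0,0,8,:].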
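To make that concrete, here is a small check of what the chained expression from the question actually returns. It is only a sketch using the same random array shape as the question; the name rows is introduced here for illustration.

```python
import numpy as np

foo = np.random.rand(102, 43, 35, 51)

# 8::30 means "start at 8, step by 30", so only row 8 fits within 35 rows.
rows = foo[0][0][8::30]
print(rows.shape)  # (1, 51)

# The trailing [0] picks the first (and only) selected row, i.e. row 8,
# with all 51 columns. It does not select column 0.
assert np.array_equal(foo[0][0][8::30][0], foo[0, 0, 8, :])
```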

It is better, in most cases, to index multiple dimensions with the comma syntax. And if you want the first 30 rows use 0:30, not 0::30 (that's basic slicing notation, applicable to lists as well as arrays).
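As a quick sanity check, the comma syntax with a plain slice should reproduce the long index list from the question. This sketch assumes row 30 is meant to be included, hence the stop value of 31; the names wanted and sliced are illustrative only.

```python
import numpy as np

foo = np.random.rand(102, 43, 35, 51)

# Long form from the question: rows 8, 9, ..., 30 of column 0.
wanted = foo[0][0][list(range(8, 31)), 0]

# Basic slicing: the stop value is exclusive, so 31 includes row 30.
sliced = foo[0, 0, 8:31, 0]

assert sliced.shape == (23,)
assert np.array_equal(wanted, sliced)
```

The slice version returns a view of foo, while the list version makes a copy, which is one more reason to prefer the slice when a contiguous range is all that is needed.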

As for:

foo[0][0][[8::30],0]

simplify it to x[[8::30], 0]. The Python interpreter accepts x[1:2:3, 0], translating the index into the tuple (slice(1,2,3), 0) and passing it to the object's __getitem__ method. But the colon syntax is only accepted in that very specific context, directly inside the indexing brackets. The interpreter treats the inner set of brackets as a list literal, and colons are not accepted there.

foo[0,0,[1,2,3],0]

is ok, because the inner brackets are a list, and the numpy getitem can handle those.
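One way to see what the interpreter hands to __getitem__ is a tiny class that returns its key unchanged. ShowKey is a hypothetical helper written for this illustration, not part of numpy. The sketch also shows that a slice object can be built explicitly when the colon syntax is not available.

```python
import numpy as np

class ShowKey:
    """Hypothetical helper: returns whatever key the brackets produce."""
    def __getitem__(self, key):
        return key

x = ShowKey()
print(x[1:2:3, 0])      # (slice(1, 2, 3), 0)
print(x[[1, 2, 3], 0])  # ([1, 2, 3], 0)

# Outside of indexing brackets, build the slice object by hand.
foo = np.random.rand(102, 43, 35, 51)
rows = slice(8, 31)
assert np.array_equal(foo[0, 0, rows, 0], foo[0, 0, 8:31, 0])
```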

numpy has a tool for converting a slice notation into a list of numbers. Play with that if it is still confusing:

In [495]: np.r_[8:30]
Out[495]: 
array([ 8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
       25, 26, 27, 28, 29])
In [496]: np.r_[8::30]
Out[496]: array([0])
In [497]: np.r_[8:30:2]
Out[497]: array([ 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
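If an explicit index array is really wanted, the np.r_ result can be passed straight into the index. A minimal sketch, again assuming row 30 should be included (stop value 31); idx is an illustrative name.

```python
import numpy as np

foo = np.random.rand(102, 43, 35, 51)

# np.r_ expands the slice notation into an explicit index array.
idx = np.r_[8:31]
assert idx[0] == 8 and idx[-1] == 30

# Indexing with the array selects the same values as the plain slice.
assert np.array_equal(foo[0, 0, idx, 0], foo[0, 0, 8:31, 0])
```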