Question

I do the following:

from numpy import genfromtxt

x = genfromtxt('foo.csv',delimiter=',',usecols=(0,1))

y = genfromtxt('foo.csv',delimiter=',',usecols=(2),dtype=str)

Then I enter:

x[y=='y1Out',0] # assume the set of "y" is 'y1Out' and 'y2Out'

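This command prints all the column-0 values of "x" whose associated "y" value equals y1Out. How is that possible? That is, how does numpy keep track of the alignment between "x" and "y"? I thought numpy had no data alignment.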

Answer 1

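When you evaluate y == 'y10ut' and y is an array of string dtype, numpy returns a boolean array with the same shape as y that is True at every position where the condition holds. E.g.: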

import numpy as np
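# 'U8' holds unicode strings, so comparing against a Python 3 str is elementwise
y = np.empty(10, dtype='U8')
# populate the array with 'y10ut' and 'y20ut' alternately
y[1::2] = 'y10ut'  # odd positions
y[::2] = 'y20ut'   # even positions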

Then you can evaluate the condition:

>>> y == 'y10ut'
array([False,  True, False,  True, False,  True, False,
       True, False,  True], dtype=bool)

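This resulting array can be used as an index array for x. Note that if y is not a string array, the result of the evaluation is no longer a usable index array. In the NumPy version used here it collapses to a single False (more recent releases may instead return an array of all False, which selects nothing):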

>>> y = np.arange(5, dtype='f')
>>> y == 'y10ut'
False

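In your case, numpy knows nothing about the relationship between x and y. It simply evaluates the condition y == 'y10ut' and uses the resulting boolean array to index the first dimension of x position by position, which seems to be what you want. The rows line up only because both arrays were read from the same file in the same row order.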
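As a minimal sketch of that positional matching, the example below uses small hand-made arrays in place of the ones read from foo.csv (the values are made up for illustration and are not the contents of the actual file). It builds the boolean mask from y, uses it to pick rows of x, and then reorders y to show that the selection follows position only:

```python
import numpy as np

# Hypothetical stand-ins for the arrays loaded from foo.csv
x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0]])
y = np.array(['y1Out', 'y2Out', 'y1Out', 'y2Out'])  # unicode string dtype

# Elementwise comparison: one boolean per position in y
mask = y == 'y1Out'

# The mask selects rows of x by position; 0 then picks the first column
selected = x[mask, 0]

# Reordering y changes which rows are picked, because numpy keeps
# no link between x and y beyond their positions
y_reordered = y[::-1]
selected_reordered = x[y_reordered == 'y1Out', 0]

print(mask)
print(selected)
print(selected_reordered)
```

If x and y were ever sorted or filtered separately, the mask would silently select the wrong rows, so any reordering has to be applied to both arrays together.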