Filtering a numpy structured array based on multiple values

Time: 2015-11-16 19:10:07

Tags: python arrays numpy structured-array

I have a numpy structured array:

import numpy as np

myArray = np.array([(1, 1, 1, u'Zone3', 9.223),
        (2, 1, 0, u'Zone2', 17.589),
        (3, 1, 1, u'Zone2', 26.95),
        (4, 0, 1, u'Zone1', 19.367),
        (5, 1, 1, u'Zone1', 4.395)],
         dtype=[('ID', '<i4'), ('Flag1', '<i4'), ('Flag2', '<i4'), ('ZoneName', '<U5'),
                ('Value', '<f8')])

I want to sum the values in the "Value" column for rows that meet multiple conditions. If I want both Flag1 and Flag2 to equal 1, I can use:

sumResult = (sum(myArray[((myArray["Flag1"] == 1) & (myArray["Flag2"] == 1))]["Value"]))

However, I also want to add a third criterion based on whether a value is in a list, the equivalent of using x in list:

criteriaList = ("Zone1", "Zone2")
sumResult = (sum(myArray[((myArray["Flag1"] == 1) & (myArray["Flag2"] == 1) &
                (myArray["ZoneName"] in criteriaList))]["Value"]))

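This should equal 31.345. I am new to numpy and have looked into masked arrays, but it is not clear to me how, or whether, they can be used with structured arrays. Thanks.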

1 Answer:

Answer 0: (score: 1)

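You need to use np.in1d to test for membership in criteriaList. The Python in operator does not work element-wise on an array, which is why it raises an error: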

In [1]: myArray["ZoneName"] in criteriaList
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-ff2173ff4348> in <module>()
----> 1 myArray["ZoneName"] in criteriaList

ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()

In [2]: np.in1d(myArray["ZoneName"], criteriaList)
Out[2]: array([False,  True,  True,  True,  True], dtype=bool)

In [3]: myArray[(myArray["Flag1"] == 1) &
   ....:        (myArray["Flag2"] == 1) &
   ....:        np.in1d(myArray["ZoneName"], criteriaList)]["Value"].sum()
Out[3]: 31.344999999999999
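
For reference, the session above can be condensed into a standalone script. This is only a sketch: it assumes NumPy 1.13 or later, where np.isin is available as the element-wise membership test (np.in1d is deprecated in NumPy 2.0), and the snake_case names are illustrative rather than taken from the original post.

```python
import numpy as np

# Same data as in the question, rebuilt so the script runs on its own.
my_array = np.array(
    [(1, 1, 1, "Zone3", 9.223),
     (2, 1, 0, "Zone2", 17.589),
     (3, 1, 1, "Zone2", 26.95),
     (4, 0, 1, "Zone1", 19.367),
     (5, 1, 1, "Zone1", 4.395)],
    dtype=[("ID", "<i4"), ("Flag1", "<i4"), ("Flag2", "<i4"),
           ("ZoneName", "<U5"), ("Value", "<f8")],
)
criteria_list = ("Zone1", "Zone2")

# One boolean array per criterion, combined element-wise with &.
# np.isin returns True where ZoneName matches any entry in criteria_list.
mask = (
    (my_array["Flag1"] == 1)
    & (my_array["Flag2"] == 1)
    & np.isin(my_array["ZoneName"], criteria_list)
)

# Select the field first, then apply the mask and sum the matching rows.
total = my_array["Value"][mask].sum()
print(total)
```

Building the mask as a separate variable makes each criterion easy to inspect on its own. No masked array is needed here, since plain boolean indexing works directly on the fields of a structured array.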