Object arrays in numpy?

Time: 2017-01-05 10:52:10

Tags: arrays sorting object numpy

How can I efficiently sort an array of objects on two or more attributes in Numpy?

import numpy as np

class Obj():
    def __init__(self,a,b):
        self.a = a
        self.b = b

arr = np.array([],dtype=Obj)        

for i in range(10):
    arr = np.append(arr,Obj(i, 10-i))

arr_sort = np.sort(arr, order=a,b) ???

Thx, Willem-Jan

1 answer:

Answer 0: (score: 0)

The order parameter only applies to structured arrays:

In [383]: arr=np.zeros((10,),dtype='i,i')
In [385]: for i in range(10):
     ...:     arr[i] = (i,10-i)
In [386]: arr
Out[386]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3),
       (8, 2), (9, 1)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [387]: np.sort(arr, order=['f0','f1'])
Out[387]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3),
       (8, 2), (9, 1)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])
In [388]: np.sort(arr, order=['f1','f0'])
Out[388]:
array([(9, 1), (8, 2), (7, 3), (6, 4), (5, 5), (4, 6), (3, 7), (2, 8),
       (1, 9), (0, 10)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])

With a 2d array, lexsort provides a similar 'ordered' sort. It returns the sorting indices, and the last key passed is the primary one:

In [402]: arr=np.column_stack((np.arange(10),10-np.arange(10)))
In [403]: np.lexsort((arr[:,1],arr[:,0]))
Out[403]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
In [404]: np.lexsort((arr[:,0],arr[:,1]))
Out[404]: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0], dtype=int32)

With your object array, I could extract the attributes into either of these structures:

In [407]: np.array([(a.a, a.b) for a in arr])
Out[407]:
array([[ 0, 10],
       [ 1,  9],
       [ 2,  8],
       ....
       [ 7,  3],
       [ 8,  2],
       [ 9,  1]])
In [408]: np.array([(a.a, a.b) for a in arr],dtype='i,i')
Out[408]:
array([(0, 10), (1, 9), (2, 8), (3, 7), (4, 6), (5, 5), (6, 4), (7, 3),
       (8, 2), (9, 1)],
      dtype=[('f0', '<i4'), ('f1', '<i4')])

The Python sorted function works on arr (or its list equivalent):

In [421]: arr
Out[421]:
array([<__main__.Obj object at 0xb0f2d24c>,
       <__main__.Obj object at 0xb0f2dc0c>,
       ....
       <__main__.Obj object at 0xb0f35ecc>], dtype=object)
In [422]: sorted(arr, key=lambda a: (a.b,a.a))
Out[422]:
[<__main__.Obj at 0xb0f35ecc>,
 <__main__.Obj at 0xb0f3570c>,
 ...
 <__main__.Obj at 0xb0f2dc0c>,
 <__main__.Obj at 0xb0f2d24c>]

Your Obj class is missing a nice __str__ method. I have to use something like [(i.a, i.b) for i in arr] to see the values of the arr elements.
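If the result has to stay an object array, one possible way to combine the pieces above is to sort the extracted attributes and use the resulting indices to reorder the objects. This is only a sketch under a few assumptions of mine: the field names 'a' and 'b', the 'i4' field type, and the added __repr__ method are not part of the original class.

```python
import numpy as np

class Obj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __repr__(self):
        # Added for illustration so the elements display their values.
        return f"Obj(a={self.a}, b={self.b})"

# Same data as in the question, filled into a preallocated object array.
arr = np.empty(10, dtype=object)
for i in range(10):
    arr[i] = Obj(i, 10 - i)

# Extract the attributes into a structured array (field names are my choice).
keys = np.array([(o.a, o.b) for o in arr], dtype=[('a', 'i4'), ('b', 'i4')])

# argsort with order= returns indices sorted by b first, then a.
idx = np.argsort(keys, order=['b', 'a'])
arr_sorted = arr[idx]  # still an object array, now reordered

# Equivalent with lexsort: the LAST key in the tuple is the primary key.
idx2 = np.lexsort((keys['a'], keys['b']))
```

Indexing with idx keeps the original objects intact, so this only pays off if there is a reason to keep dtype=object. Otherwise the structured array alone is easier to work with.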

As I said in a comment, for this example a list is much better than an object array.

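In [423]: alist=[]
In [424]: for i in range(10):
     ...:     alist.append(Obj(i,10-i))

list append is faster than the repeated array append. Compared to a list, an object array doesn't add much functionality, especially when it is 1d and the objects are instances of a custom class like this. You can't do any math on arr, and, as you can see, sorting isn't easy either.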
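For the list version, a multi-attribute sort might look like the sketch below. It uses operator.attrgetter from the standard library as the key function. The __repr__ method is my addition for readable output, not something in the original class.

```python
from operator import attrgetter

class Obj:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __repr__(self):
        # Added for illustration so the sorted result is readable.
        return f"Obj(a={self.a}, b={self.b})"

# Build the list directly; no repeated array append needed.
alist = [Obj(i, 10 - i) for i in range(10)]

# attrgetter('b', 'a') returns the tuple (obj.b, obj.a), so this sorts
# by b first and breaks ties on a. sorted() returns a new list.
by_b_then_a = sorted(alist, key=attrgetter('b', 'a'))

# list.sort() sorts in place; reverse=True gives descending order.
alist.sort(key=attrgetter('a', 'b'), reverse=True)
```

attrgetter does the same job as the lambda a: (a.b, a.a) key shown earlier, so the choice between them is mostly a matter of taste.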