Question

Following numpy's documentation, I have created a class derived from numpy's ndarray. It looks like this (with the number of attributes reduced to make it more readable):

import numpy as np

class Atom3D( np.ndarray ):
    __array_priority__ = 11.0

    def __new__( cls, idnum, coordinates):

        # Cast numpy to be our class type
        assert len(coordinates) == 3
        obj = np.asarray(coordinates, dtype= np.float64).view(cls)
        # add the new attribute to the created instance
        obj._number = int(idnum)
        # Finally, we must return the newly created object:
        return obj

    def __array_finalize__( self, obj ):
        self._number = getattr(obj, '_number', None)

    def __array_wrap__( self, out_arr, context=None ):
        return np.ndarray.__array_wrap__(self, out_arr, context)

    def __repr__( self ):
        return "{0._number}: ({0[0]:8.3f}, {0[1]:8.3f}, {0[2]:9.3f})".format(self)

When I run a test in which I apply a numpy ufunc to the object:

a1 = Atom3D(1, [5., 5., 5.])
print type(a1), repr(a1)
m  = np.identity(3)
a2 = np.dot(a1, m)
print type(a2), repr(a2)

I get the expected result; that is, the dot function preserves the subclass of the object:

<class '__main__.Atom3D'> 1: (   5.000,    5.000,     5.000)  
<class '__main__.Atom3D'> 1: (   5.000,    5.000,     5.000)

However, when I try to apply the same np.dot to an array of these objects, the subclass is lost. Thus, executing:

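print "regular"
# a2 and a3 as defined in the subtraction example further down
a2 = Atom3D(2, [6., 4., 2.])
a3 = Atom3D(3, [8., 6., 8.])
atom_list1 = [a1, a2, a3]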
atom_list2 = np.dot(atom_list1, m)
for _ in atom_list2:
    print type(_), repr(_)

print "numpy array"
atom_list1 = np.array([a1, a2, a3], dtype=np.object)
atom_list2 = np.dot(atom_list1, m)
for _ in atom_list2:
    print type(_), repr(_)

gives me this:

regular
<type 'numpy.ndarray'> array([ 5.,  5.,  5.])
<type 'numpy.ndarray'> array([ 6.,  4.,  2.])
<type 'numpy.ndarray'> array([ 8.,  6.,  8.])
numpy array
<type 'numpy.ndarray'> array([5.0, 5.0, 5.0], dtype=object)
<type 'numpy.ndarray'> array([6.0, 4.0, 2.0], dtype=object)
<type 'numpy.ndarray'> array([8.0, 6.0, 8.0], dtype=object)

The same happens with other operations, such as __sub__:

print "regular"
a1 = Atom3D(1, [5., 5., 5.])
a2 = a1 - np.array([3., 2., 0.])
print type(a2), repr(a2)
print "numpy array"
a1 = Atom3D(1, [5., 5., 5.])
a2 = Atom3D(2, [6., 4., 2.])
a3 = Atom3D(3, [8., 6., 8.])
atom_list1 = np.array([a1, a2, a3], dtype=np.object)
atom_list2 = atom_list1 - np.array([3., 2., 0.])
for _ in atom_list2:
    print type(_), repr(_)

will yield:

regular
<class '__main__.Atom3D'> 1: (   2.000,    3.000,     5.000)
numpy array
<type 'numpy.ndarray'> array([2.0, 3.0, 5.0], dtype=object)
<type 'numpy.ndarray'> array([3.0, 2.0, 2.0], dtype=object)
<type 'numpy.ndarray'> array([5.0, 4.0, 8.0], dtype=object)

I have been searching, but I have not found what I am doing wrong. Thanks!

J.-

Answer 1

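There is no such thing as dtype=Atom3D. The same goes for dtype=list and dtype=np.ndarray. What you get is a dtype=object array, in which each element is a pointer to an object elsewhere in memory.

Creating an object array with np.array(...) can be tricky. np.array evaluates the entries and makes some choices of its own. The best option, if you want absolute control over the elements that go into an object array, is to create a 'blank' one and assign the elements yourself.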

In [508]: A=np.array([np.matrix([1,2]),np.matrix([2,1])],dtype=object)

In [509]: A       # a 3d array, no matrix subarrays
Out[509]: 
array([[[1, 2]],

       [[2, 1]]], dtype=object)

In [510]: A=np.empty((2,),dtype=object)

In [511]: A
Out[511]: array([None, None], dtype=object)

In [512]: A[:]=[np.matrix([1,2]),np.matrix([2,1])]

In [513]: A
Out[513]: array([matrix([[1, 2]]), matrix([[2, 1]])], dtype=object)
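
The 'blank array, then assign' approach from the session above can be wrapped in a small helper. This is only a sketch: the name as_object_array is hypothetical, and it assigns element by element (rather than with A[:] = ...) on the assumption that single-index assignment is the least surprising way to store each item untouched. It is written for Python 3.

```python
import numpy as np

def as_object_array(items):
    """Hypothetical helper: return a 1-D object array holding `items` as-is."""
    items = list(items)
    out = np.empty(len(items), dtype=object)   # starts out filled with None
    for i, item in enumerate(items):
        out[i] = item                          # store the object itself
    return out

first = np.array([1, 2])
second = np.array([2, 1])

# np.array makes its own choice: the two arrays are merged into one 2-d array.
direct = np.array([first, second], dtype=object)
print(direct.shape)

# The helper keeps the two arrays as separate elements.
packed = as_object_array([first, second])
print(packed.shape, packed[0] is first)
```

The design choice here is simply to take the shape decision away from np.array: the result always has one slot per input item.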

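Unless you really need the object array for things like reshaping and transposing, you are usually better off working with a list.

Mixing object types also works: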

In [522]: A=np.asarray([np.matrix([1,2]),np.ma.masked_array([2,1])],dtype=np.object)

In [523]: A
Out[523]: 
array([matrix([[1, 2]]),
       masked_array(data = [2 1],
             mask = False,
       fill_value = 999999)
], dtype=object)

==========================

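When you do np.dot([a1,a2,a3],m), it first converts any lists to arrays with np.asarray([a1,a2,a3]). The result is a 2d array, not an array of Atom3D objects. So the dot is the usual array dot.

If I create an object array as I suggested: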
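
To see the list-to-array conversion directly, the following sketch uses a cut-down copy of the question's class (no __repr__ or __array_wrap__ override, which is a simplification made here, not part of the original) and checks what np.asarray returns. It is written for Python 3.

```python
import numpy as np

class Atom3D(np.ndarray):
    """Cut-down copy of the question's class, for illustration only."""
    def __new__(cls, idnum, coordinates):
        obj = np.asarray(coordinates, dtype=np.float64).view(cls)
        obj._number = int(idnum)
        return obj

    def __array_finalize__(self, obj):
        self._number = getattr(obj, '_number', None)

a1 = Atom3D(1, [5., 5., 5.])
a2 = Atom3D(2, [6., 4., 2.])
a3 = Atom3D(3, [8., 6., 8.])

# The list of three atoms becomes one plain 3x3 float array.
stacked = np.asarray([a1, a2, a3])
print(type(stacked), stacked.shape, stacked.dtype)

# np.dot sees the same 3x3 array, so its rows are plain ndarrays too.
rows = np.dot([a1, a2, a3], np.identity(3))
print(type(rows[0]))
```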

In [14]: A=np.empty((3,),dtype=object)
In [16]: A[:]=[a1,a2,a1+a2]

In [17]: A
Out[17]: 
array([1: (   5.000,    5.000,     5.000),
       1: (   5.000,    5.000,     5.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

In [18]: np.dot(A,m)
Out[18]: 
array([1: (   5.000,    5.000,     5.000),
       1: (   5.000,    5.000,     5.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

The Atom3D type is preserved.

Same for subtraction:

In [23]: A- np.array([3.,2., 0])
Out[23]: 
array([1: (   2.000,    2.000,     2.000),
       1: (   3.000,    3.000,     3.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

Adding that array and an Atom3D works, but there is a problem displaying the result:

In [39]: x = A + a2

In [40]: x
Out[40]: <repr(<__main__.Atom3D at 0xb5062294>) failed: TypeError: non-empty format string passed to object.__format__>
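
A plausible reading of this error (an assumption, not something verified against the session above) is that the result is an Atom3D with object dtype, so the float format spec in __repr__ is applied to elements that are themselves objects. Under that assumption, a more defensive __repr__ could look like the sketch below. The object-dtype Atom3D is built with .view() purely for demonstration, and the class is a cut-down copy of the question's, written for Python 3.

```python
import numpy as np

class Atom3D(np.ndarray):
    """Cut-down copy of the question's class with a guarded __repr__."""
    def __new__(cls, idnum, coordinates):
        obj = np.asarray(coordinates, dtype=np.float64).view(cls)
        obj._number = int(idnum)
        return obj

    def __array_finalize__(self, obj):
        self._number = getattr(obj, '_number', None)

    def __repr__(self):
        # Assumption: only a float vector of shape (3,) is a "real" atom.
        if self.shape == (3,) and self.dtype.kind == 'f':
            return "{0}: ({1:8.3f}, {2:8.3f}, {3:9.3f})".format(
                self._number, float(self[0]), float(self[1]), float(self[2]))
        # Anything else (object dtype, slices, ...) uses the plain array repr.
        return repr(self.view(np.ndarray))

a1 = Atom3D(1, [5., 5., 5.])
a2 = Atom3D(2, [6., 4., 2.])

# Build an object array of atoms, then view it as Atom3D to mimic a result
# whose class is Atom3D but whose elements are objects.
A = np.empty(2, dtype=object)
A[0] = a1
A[1] = a2
x = A.view(Atom3D)

print(repr(a1))
print(repr(x))   # falls back instead of raising TypeError
```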

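Calculations with object dtype arrays are hit-or-miss. Some work, apparently by iterating over the elements of the array, applying the function, and putting the results back into an object array. In effect, this is an array version of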

 [func(a, x) for a in A]

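Even when it works, it is not performing a fast compiled operation; it is iterative (timing will be similar to the list equivalent).

Other things don't work: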
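
The list-comprehension equivalent can also be written out explicitly, which makes the per-element iteration visible and keeps each result's subclass. This is a sketch only: apply_to_each is a hypothetical helper name, and the class is a cut-down copy of the question's, written for Python 3.

```python
import numpy as np

class Atom3D(np.ndarray):
    """Cut-down copy of the question's class, for illustration only."""
    def __new__(cls, idnum, coordinates):
        obj = np.asarray(coordinates, dtype=np.float64).view(cls)
        obj._number = int(idnum)
        return obj

    def __array_finalize__(self, obj):
        self._number = getattr(obj, '_number', None)

def apply_to_each(func, atoms, *args):
    """Hypothetical helper: array version of [func(a, *args) for a in atoms]."""
    out = np.empty(len(atoms), dtype=object)
    for i, a in enumerate(atoms):
        out[i] = func(a, *args)   # one Python-level call per element
    return out

atoms = [Atom3D(1, [5., 5., 5.]),
         Atom3D(2, [6., 4., 2.]),
         Atom3D(3, [8., 6., 8.])]
shift = np.array([3., 2., 0.])

moved = apply_to_each(np.subtract, atoms, shift)
for m in moved:
    print(type(m).__name__, np.asarray(m))
```

Because the loop runs in Python, this is no faster than the plain list comprehension; it only packages the result as an object array.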

In [41]: a1>0
Out[41]: 1: (   1.000,    1.000,     1.000)

In [42]: A>0
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
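
If per-element masks are what is wanted, one workaround is to do the iteration by hand instead of relying on A>0. In this sketch, plain float arrays stand in for the Atom3D instances, and the data values are made up for illustration. It is written for Python 3.

```python
import numpy as np

# Object array whose elements are small float arrays (stand-ins for atoms).
A = np.empty(3, dtype=object)
for i, row in enumerate([[5., 5., 5.], [-6., 4., 2.], [8., 6., -8.]]):
    A[i] = np.array(row)

# Compare each element separately; every result is its own boolean array.
masks = [a > 0 for a in A]

# Reduce each mask explicitly, so there is no ambiguous truth value.
all_positive = np.array([m.all() for m in masks])
print(all_positive)
```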

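We have pointed out many times that object dtype arrays are little more than glorified lists. The elements are pointers, just as in a list, so operations will involve iterating over those pointers - in Python, not C. This is not a highly developed corner of the numpy code.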

Maintaining a numpy subclass inside a container after applying a ufunc