Question

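In my previous question, I learned how to resize a subclassed ndarray. Neat. Unfortunately, that no longer works when the array I am trying to resize is the result of a computation: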

import numpy as np

class Foo(np.ndarray):
    def __new__(cls,shape,dtype=np.float32,buffer=None,offset=0,
                strides=None,order=None):
        return np.ndarray.__new__(cls,shape,dtype,buffer,offset,strides,order)

    def __array_prepare__(self,output,context):
        print output.flags['OWNDATA'],"PREPARE",type(output)
        return np.ndarray.__array_prepare__(self,output,context)

    def __array_wrap__(self,output,context=None):
        print output.flags['OWNDATA'],"WRAP",type(output)

        return np.ndarray.__array_wrap__(self,output,context)

a = Foo((32,))
#resizing a is no problem
a.resize((24,),refcheck=False)

b = Foo((32,))
c = Foo((32,))

d = b+c
#Cannot resize `d`
d.resize((24,),refcheck=False)

The exact output (including the traceback) is:

True PREPARE <type 'numpy.ndarray'>
False WRAP <class '__main__.Foo'>
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    d.resize((24,),refcheck=False)
ValueError: cannot resize this array: it does not own its data

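I think this is because numpy creates a new ndarray and passes it to __array_prepare__. At some point along the way, though, it seems that the "output" array gets view-casted to my Foo type, although the docs don't seem to be 100% clear/accurate on this point. In any event, after the view casting, the output no longer owns the data, which makes it impossible to resize in place (as far as I can tell).

Is there any way, via some sort of numpy voodoo (__array_prepare__, __array__, etc.), to transfer ownership of the data to the instance of my subclass?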

Answer 1

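This is hardly a satisfactory answer, but it doesn't fit into a comment either... You can work around the data-ownership problem by using the ufunc's out parameter. A silly example: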

>>> a = Foo((5,))
>>> b = Foo((5,))
>>> c = a + b # BAD
True PREPARE <type 'numpy.ndarray'>
False WRAP <class '__main__.Foo'>
>>> c.flags.owndata
False

>>> c = Foo((5,))
>>> c[:] = a + b # BETTER
True PREPARE <type 'numpy.ndarray'>
False WRAP <class '__main__.Foo'>
>>> c.flags.owndata
True

>>> np.add(a, b, out=c) # BEST
True PREPARE <class '__main__.Foo'>
True WRAP <class '__main__.Foo'>
Foo([  1.37754085e-38,   1.68450356e-20,   6.91042737e-37,
         1.74735556e-04,   1.48018885e+29], dtype=float32)
>>> c.flags.owndata
True

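I think the output above is consistent with c[:] = a + b ending up with data that c owns, at the cost of copying it into c from a temporary array. That copy should not happen when you use the out parameter.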
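As a rough, self-contained illustration of the out= approach: the sketch below uses a bare Foo subclass (a simplified stand-in for the class in the question, with no __array_prepare__ or __array_wrap__ overrides) and checks that a preallocated output still owns its data after the ufunc call, so it can be resized in place.

```python
import numpy as np

class Foo(np.ndarray):
    """Bare subclass, used only for illustration."""

a = Foo((5,), dtype=np.float32)
b = Foo((5,), dtype=np.float32)
a.fill(1.0)
b.fill(2.0)

# Preallocate the result; an array created this way owns its buffer.
c = Foo((5,), dtype=np.float32)

# The ufunc writes straight into c, so no view-cast result is bound to c.
np.add(a, b, out=c)
print(c.flags.owndata)

# Because c owns its data, an in-place resize is allowed.
c.resize((3,), refcheck=False)
print(c.shape)
```

The price is that the output has to be allocated up front with the right shape and dtype.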

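Since you were already worried about intermediate storage in your mathematical expressions, micro-managing how it is handled may not be such a bad thing. That is, replacing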

g = a + b + np.sqrt(d*d + e*e + f*f)

with

g = foo_like(d) # you'll need to write this function!
np.multiply(d, d, out=g)
g += e * e
g += f * f
np.sqrt(g, out=g)
g += b
g += a

may save you some intermediate memory, and it does let you own your data. It does throw the "readability counts" mantra out the window, though...
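For completeness, here is one way the foo_like helper might look. This is only a sketch: foo_like and the filled helper are hypothetical names, the Foo class is a bare stand-in for the one in the question, and the input values are made up so the result is easy to check by hand.

```python
import numpy as np

class Foo(np.ndarray):
    """Bare subclass, used only for illustration."""

def foo_like(arr):
    # Hypothetical helper: allocate an uninitialized Foo with the same
    # shape and dtype as arr. The new array owns its buffer.
    return Foo(arr.shape, dtype=arr.dtype)

def filled(value, n=4):
    # Hypothetical helper for building small test inputs.
    out = Foo((n,), dtype=np.float64)
    out.fill(value)
    return out

a, b, d, e, f = (filled(v) for v in (1.0, 2.0, 3.0, 4.0, 12.0))

# Same computation as g = a + b + np.sqrt(d*d + e*e + f*f),
# accumulated into one preallocated array.
g = foo_like(d)
np.multiply(d, d, out=g)
g += e * e            # e * e still creates one temporary
g += f * f
np.sqrt(g, out=g)
g += b
g += a

print(g.flags.owndata)
print(np.asarray(g))  # each element: 1 + 2 + sqrt(9 + 16 + 144)
```

Note that e * e and f * f still allocate temporaries; those could be routed through a second scratch array if that matters.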

Answer 2

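At some point along the way, though, it seems that the "output" array gets view-casted to my Foo type

Yes, ndarray.__array_prepare__ calls output.view, which returns an array that does not own its data.

I experimented a bit and couldn't find an easy way around that.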
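The loss of ownership can be reproduced without any ufunc, just by view-casting by hand. A minimal sketch, using a bare Foo subclass as a stand-in for the one in the question:

```python
import numpy as np

class Foo(np.ndarray):
    """Bare subclass, used only for illustration."""

base = np.zeros(32, dtype=np.float32)  # freshly allocated, owns its buffer
view = base.view(Foo)                  # view-cast to the subclass

print(base.flags.owndata)   # the original array owns the data
print(view.flags.owndata)   # the view does not
print(view.base is base)    # the view keeps a reference to the owner

# Resizing the view fails for the same reason as in the question.
try:
    view.resize((24,), refcheck=False)
    resize_error = None
except ValueError as exc:
    resize_error = str(exc)
print(resize_error)
```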

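While I agree that this behavior is not ideal, at least for your use case, I would claim that it is acceptable for d not to own its data. Numpy uses views extensively, and if you insist on avoiding all views when working with numpy arrays, you are making your life very hard.

I would also claim that, in my experience, resize should generally be avoided. If you avoid resize, you should have no problem working with the view that gets created. resize has a hacky feel to it and is hard to work with (as you may have begun to understand, having run into one of the two classic errors when using it: it does not own its data. The other one is cannot resize an array that has been referenced). (Another problem is described in this question.)

Since your decision to use resize comes from an answer to your other question, I'll post the rest of my answer there.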

Answer 3

How about:

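def resize(arr, shape):
    # np.require returns an array (a copy if arr does not own its data);
    # it does not modify arr in place, so keep and return the result.
    arr = np.require(arr, requirements=['OWNDATA'])
    arr.resize(shape, refcheck=False)
    return arr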

It seems to resize successfully (and to reduce memory consumption):

import array
import numpy as np
import time

class Foo(np.ndarray):
    def __new__(cls, shape, dtype=np.float32, buffer=None, offset=0,
                strides=None, order=None):
        return np.ndarray.__new__(cls, shape, dtype, buffer, offset, strides, order)

    def __array_prepare__(self, output, context):
        print(output.flags['OWNDATA'], "PREPARE", type(output))
        return np.ndarray.__array_prepare__(self, output, context)

    def __array_wrap__(self, output, context=None):
        print(output.flags['OWNDATA'], "WRAP", type(output))
        output = np.ndarray.__array_wrap__(self, output, context)
        return output

def free_memory():
    """
    Return free memory available, including buffer and cached memory
    """
    total = 0
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            line = line.strip()
            if any(line.startswith(field) for field in ('MemFree', 'Buffers', 'Cached')):
                field, amount, unit = line.split()
                amount = int(amount)
                if unit != 'kB':
                    raise ValueError(
                        'Unknown unit {u!r} in /proc/meminfo'.format(u=unit))
                total += amount
    return total


def gen_change_in_memory():
    """
    http://stackoverflow.com/a/14446011/190597 (unutbu)
    """
    f = free_memory()
    diff = 0
    while True:
        yield diff
        f2 = free_memory()
        diff = f - f2
        f = f2
_memory_changes = gen_change_in_memory()
change_in_memory = lambda: next(_memory_changes)  # next() works on Python 2.6+ and 3

def resize(arr, shape):
    print(change_in_memory())
    # 0
    np.require(arr, requirements=['OWNDATA'])

    time.sleep(1)
    print(change_in_memory())
    # 200

    arr.resize(shape, refcheck=False)

N = 10000000
b = Foo((N,), buffer = array.array('f',range(N)))
c = Foo((N,), buffer = array.array('f',range(N)))

yields

print(change_in_memory())
# 0

d = b+c
d = np.require(d, requirements=['OWNDATA'])

print(change_in_memory())
# 39136

resize(d, (24,))   # Increases memory by 200 KiB
time.sleep(1)
print(change_in_memory())
# -39116
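One detail worth spelling out: np.require does not change its argument in place. When the input does not own its data it returns a copy that does, which is why the line d = np.require(d, requirements=['OWNDATA']) above rebinds d. A small sketch of that behavior, with a bare Foo subclass and a view-cast array standing in for the result of b + c:

```python
import numpy as np

class Foo(np.ndarray):
    """Bare subclass, used only for illustration."""

# Stand-in for `b + c`: a Foo instance that does not own its data.
d = np.arange(32, dtype=np.float32).view(Foo)

# np.require hands back a new array that owns its buffer.
owned = np.require(d, requirements=['OWNDATA'])

print(d.flags.owndata)       # the input is left as it was
print(owned.flags.owndata)   # the returned array owns its data
print(type(owned).__name__)  # the subclass is preserved
print(owned is d)            # a different object, hence the rebinding

# The owning copy can now be resized in place.
owned.resize((24,), refcheck=False)
print(owned.shape)
```

So this is a copy rather than a true transfer of ownership: both buffers exist for a moment, until the last reference to the non-owning array goes away.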

Transferring ownership of numpy data
