Question

Let's say this initial numpy array has some fixed dtype:

import numpy as np

array = np.array([(1, 'a'), (2, 'b')],
                 np.dtype([('idfield', np.int32),
                           ('textfield', '|S256')]))

Now I need to fill this array in a for loop, so I do this:

for val in value:
    array = np.append(array, np.array([(val[0], val[1])], np.dtype([('idfield', np.int32),
                                                                   ('textfield', '|S256')])), axis=0)

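It works, but it doesn't look good! I have to re-specify the dtype inside the for loop, even though logically I will always be using the same dtype to fill my array.

Do you know a simpler way to do this?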

Answer 1

np.append is a simple wrapper around np.concatenate:

def append(arr, values, axis=None):
    arr = asanyarray(arr)
    if axis is None:
        if arr.ndim != 1:
            arr = arr.ravel()
        values = ravel(values)
        axis = arr.ndim-1
    return concatenate((arr, values), axis=axis)

In [89]: dt = np.dtype('U5,int')
In [90]: arr = np.array([('one',1)], dtype=dt)
In [91]: np.append(arr, ('two',2))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-91-bc17d9ad4a77> in <module>()
----> 1 np.append(arr, ('two',2))
 ...
-> 5166     return concatenate((arr, values), axis=axis)

TypeError: invalid type promotion

In this case append effectively does:

In [92]: np.ravel(('two',2))
Out[92]: array(['two', '2'], dtype='<U3')

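which converts the tuple into a 2-element array with a string dtype. Now concatenate tries to join an array of dtype dt with a U3 array, and it can't. Nothing in append uses arr.dtype as the basis for converting values into an array. You have to do that yourself. numpy can only do so much to infer your intent. :)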

So it works if you give the new value the same dtype yourself:

In [93]: np.append(arr, np.array(('two',2),dt))
Out[93]: array([('one', 1), ('two', 2)], dtype=[('f0', '<U5'), ('f1', '<i4')])
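To avoid repeating the dtype at every call, the conversion can be wrapped in a small helper that reuses arr.dtype. This is only a sketch: append_records is a hypothetical name, not a NumPy function, and it assumes each record is given as a tuple (or a list of tuples).

```python
import numpy as np

def append_records(arr, records):
    """Hypothetical helper: append one record (a tuple) or a list of
    records to a structured array, reusing arr.dtype."""
    # Convert with the existing dtype so concatenate sees matching types.
    new = np.array(records, dtype=arr.dtype)
    # A single tuple yields a 0d array; atleast_1d makes it 1d for concatenate.
    return np.concatenate([arr, np.atleast_1d(new)])

dt = np.dtype('U5,int')
arr = np.array([('one', 1)], dtype=dt)
arr = append_records(arr, ('two', 2))
arr = append_records(arr, [('three', 3), ('four', 4)])
print(arr)
```

Each call still copies the whole array, so this tidies the syntax without making a long loop any cheaper.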

I don't like append because new users often misuse it. They usually assume it is a clone of list append, which it is not.

But it does have one advantage: it promotes 0d inputs to 1d, which concatenate does not do:

In [94]: np.concatenate([arr, np.array(('two',2),dt)])
...
ValueError: all the input arrays must have same number of dimensions

Making the second array 1d works:

In [95]: np.concatenate([arr, np.array([('two',2)],dt)])
Out[95]: array([('one', 1), ('two', 2)], dtype=[('f0', '<U5'), ('f1', '<i4')])

append hides the dimension adjustment that concatenate requires.
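The difference in dimensions can be checked directly. The following is a minimal sketch using the same dt as above; the names rec0 and rec1 are made up for illustration.

```python
import numpy as np

dt = np.dtype('U5,int')

rec0 = np.array(('two', 2), dt)     # a bare tuple gives a 0d structured array
rec1 = np.array([('two', 2)], dt)   # a list holding one tuple gives a 1d array

print(rec0.ndim, rec0.shape)
print(rec1.ndim, rec1.shape)

# With axis=None, np.append ravels its inputs, which turns 0d into 1d.
print(np.ravel(rec0).shape)
# np.atleast_1d performs the same adjustment explicitly before concatenate.
print(np.atleast_1d(rec0).shape)
```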

But where possible, it is better to build a list of arrays (or tuples) and call concatenate just once:

In [96]: alist = [('one',1),('two',2),('three',3)]
In [97]: ll = [np.array([x],dt) for x in alist]
In [98]: ll
Out[98]: 
[array([('one', 1)], dtype=[('f0', '<U5'), ('f1', '<i4')]),
 array([('two', 2)], dtype=[('f0', '<U5'), ('f1', '<i4')]),
 array([('three', 3)], dtype=[('f0', '<U5'), ('f1', '<i4')])]

In [100]: np.concatenate(ll)
Out[100]: 
array([('one', 1), ('two', 2), ('three', 3)],
      dtype=[('f0', '<U5'), ('f1', '<i4')])

But creating the array directly from the list of tuples is better still:

In [101]: np.array(alist, dt)
Out[101]: 
array([('one', 1), ('two', 2), ('three', 3)],
      dtype=[('f0', '<U5'), ('f1', '<i4')])
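Applied to the loop in the question, this could look roughly like the sketch below. The contents of value are invented sample data, and rows is a hypothetical name for the accumulating list.

```python
import numpy as np

dt = np.dtype([('idfield', np.int32), ('textfield', 'S256')])

rows = [(1, 'a'), (2, 'b')]      # start from the initial records
value = [(3, 'c'), (4, 'd')]     # assumed sample input, not from the question

for val in value:
    # Appending to a Python list is cheap; no dtype is needed here.
    rows.append((val[0], val[1]))

# The dtype is specified once, when the array is finally built.
array = np.array(rows, dtype=dt)
print(array['idfield'])
```

The dtype appears only once, and the array is allocated a single time instead of being copied on every iteration.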

Answer 2

As @juanpa.arrivillaga commented, it is cleaner to define your dtype only once:

array_dt = np.dtype([
    ('idfield', np.int32),
    ('textfield', '|S256')
])

Then turn the second list of values into an array and concatenate (this assumes value is a list of tuples):

array2 = np.array(value, array_dt)                                     
array = np.concatenate([array, array2])
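If the rows in value are lists rather than tuples, converting them to tuples first is the safer route, since NumPy expects tuples for structured records. Here is a self-contained sketch with made-up sample data:

```python
import numpy as np

array_dt = np.dtype([
    ('idfield', np.int32),
    ('textfield', 'S256'),
])

array = np.array([(1, 'a'), (2, 'b')], array_dt)

value = [[3, 'c'], [4, 'd']]     # assumed sample rows, stored as lists

# Convert each row to a tuple so it is read as one record.
array2 = np.array([tuple(v) for v in value], array_dt)
array = np.concatenate([array, array2])
print(array['idfield'])
```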

How to fill an existing numpy array with a specific dtype
