Question

import numpy as np

data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53']])
data[0,0] = data[0,0] + "_1"

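data[0,0] is 'Height', and I want to replace it with 'Height_1'. But the code above does not work. It returns:

data[0,0]

'Height'

The data[0,0] element stays unchanged. If I replace it directly, without referring to the element itself, it still doesn't work.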

data[0,0] = "Height" + "_1"

Result:

data[0,0]

'Height'

But if I replace it with a string other than "Height", it works.

data[0,0] = "str" + "_1"

Result:

data[0,0]

'str_1'

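I'm using this case to illustrate the problem I ran into. In my real work I have to refer to the array itself, because I need to replace elements that don't meet certain requirements. Does anyone have a solution? Thanks.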

Answer 1

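The problem is that your array has dtype('<U6'), a fixed-width unicode string type that holds at most 6 characters per element: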

>>> data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53']])
>>> data.dtype
dtype('<U6')
>>>

Anything longer is silently truncated on assignment:

>>> data[0,0] = "123456789"
>>> data
array([['123456', 'Weight'],
       ['165', '48'],
       ['168', '50'],
       ['173', '53']], 
      dtype='<U6')
>>>

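You can always specify the dtype as object when creating the array, but that gives up many of the speed advantages NumPy offers in the first place.

Alternatively, you can specify a longer string type: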

>>> data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53']], dtype='U20')
>>> data
array([['Height', 'Weight'],
       ['165', '48'],
       ['168', '50'],
       ['173', '53']], 
      dtype='<U20')
>>> data[0,0]='Height_1'
>>> data
array([['Height_1', 'Weight'],
       ['165', '48'],
       ['168', '50'],
       ['173', '53']], 
      dtype='<U20')
>>>
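
If the array already exists with the narrower dtype, one possible approach is to widen it with astype before assigning. This is a minimal sketch; the name wide and the width of 20 characters are assumptions chosen for illustration:

```python
import numpy as np

data = np.array([['Height', 'Weight'], ['165', '48'], ['168', '50'], ['173', '53']])

# astype returns a new array; each element can now hold up to 20 characters
wide = data.astype('U20')

# referring to the element itself now works, because 'Height_1' fits
wide[0, 0] = wide[0, 0] + "_1"
print(wide[0, 0])
```

The original data array is left untouched, so the result has to be taken from the widened copy.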

But be careful: if the limit you set is too long, you will waste memory:

>>> data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53'], ['42','88']], dtype='U20')
>>> data.nbytes
800
>>> data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53'], ['42','88']], dtype='U6')
>>> data.nbytes
240
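
The numbers above follow from the array size and the per-element width: NumPy stores each unicode character in 4 bytes, so nbytes is the element count times 4 times the declared length. A small sketch of that arithmetic (the variable names are illustrative):

```python
import numpy as np

rows = [['Height', 'Weight'], ['165', '48'], ['168', '50'], ['173', '53'], ['42', '88']]

sizes = {}
for dtype in ('U6', 'U20'):
    arr = np.array(rows, dtype=dtype)
    # itemsize is the byte width of one element: 4 bytes per character
    sizes[dtype] = (arr.itemsize, arr.size * arr.itemsize)
    print(dtype, arr.itemsize, arr.nbytes)
```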

If you only need a limited character set (for example, ASCII), consider using byte strings, which need 1/4 of the memory:

>>> data = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53'], ['42','88']], dtype='S20')
>>> data.nbytes
200
>>>
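
One thing to keep in mind with byte strings is that the elements come back as bytes rather than str, so concatenation and comparison need bytes operands. A short sketch, assuming ASCII-only content (the names first and label are illustrative):

```python
import numpy as np

data = np.array([['Height', 'Weight'], ['165', '48']], dtype='S20')

first = data[0, 0]            # a bytes value, not a str
data[0, 0] = first + b"_1"    # concatenate with bytes, not with "_1"

# decode when a regular Python string is needed
label = data[0, 0].decode('ascii')
print(label)
```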

Answer 2

Specify the object dtype for the array, for example:

a = np.array([['Height', 'Weight'],['165', '48'],['168', '50'],['173', '53']],dtype=object)

Then a[0][0] += '_1' will do the trick, and you get:

array([['Height_1', 'Weight'],
       ['165', '48'],
       ['168', '50'],
       ['173', '53']], dtype=object)
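
Since the question is about replacing elements that fail some requirement, the same idea can be combined with a boolean mask. This is only a sketch: the condition used here (cells equal to 'Height') is a hypothetical stand-in for the real requirement.

```python
import numpy as np

a = np.array([['Height', 'Weight'], ['165', '48'], ['168', '50'], ['173', '53']],
             dtype=object)

# hypothetical requirement: tag every cell equal to 'Height'
mask = a == 'Height'

# elements of an object array are ordinary Python strings,
# so there is no fixed width and nothing gets truncated
for idx in zip(*np.nonzero(mask)):
    a[idx] += '_1'

print(a[0, 0])
```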
