Question

I have a lot of data stored in structured numpy arrays, for example:

In []: data
Out[]: 
array([(1.0, 1.001, 1.002, 1.003), (2.0, 2.001, 2.002, 2.003),
       (3.0, 3.001, 3.002, 3.003), (4.0, 4.001, 4.002, 4.003)], 
      dtype=[('f3', '<f8'), ('f0', '<f8'), ('f1', '<f8'), ('f2', '<f8')])

The order of the fields (features) in my data is inconsistent. To process the data, I want to reorder the array by field name, i.e.:

order = ['f0', 'f1', 'f2', 'f3']

Reordering by field name works with:

data = data[order]

This gives the expected result:

In []: data
Out[]:
array([(1.001, 1.002, 1.003, 1.0) (2.001, 2.002, 2.003, 2.0)
       (3.001, 3.002, 3.003, 3.0) (4.001, 4.002, 4.003, 4.0)]
      dtype=[('f0', '<f8'), ('f1', '<f8'), ('f2', '<f8'), ('f3', '<f8')])
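
For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the reordering shown above. The name `reordered` is my own. Whether the result shares memory with `data` or is a fresh copy depends, as far as I know, on the NumPy version, so the sketch checks instead of assuming.

```python
import numpy as np

# Rebuild the example array from the question (field order: f3, f0, f1, f2).
data = np.array(
    [(1.0, 1.001, 1.002, 1.003), (2.0, 2.001, 2.002, 2.003),
     (3.0, 3.001, 3.002, 3.003), (4.0, 4.001, 4.002, 4.003)],
    dtype=[('f3', '<f8'), ('f0', '<f8'), ('f1', '<f8'), ('f2', '<f8')])

order = ['f0', 'f1', 'f2', 'f3']

# Indexing with a list of field names returns the fields in that order.
reordered = data[order]

print(reordered.dtype.names)  # ('f0', 'f1', 'f2', 'f3')

# View or copy? This differs between NumPy versions, so inspect it directly.
print(np.shares_memory(reordered, data))
```

Each value stays attached to its field name; only the order in which the fields are listed changes.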

However, while researching how to do this reordering, I repeatedly came across numpy.take as a way to avoid copying (e.g. in "sorting numpy structured and record arrays is very slow").

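Is there a way to do this reordering in place? My approach does not work because numpy.take expects integers as indices. Keep in mind that I am trying to sort by field name, not by the values contained in the data, so I assume numpy.sort will not work either.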

In []: np.take(data, order, out=data)
Out[]: 
Traceback (most recent call last):
File "C:/PycharmWorkspace/test.py", line 30, in <module>
np.take(data, [int('f0'), int('f1'), int('f2'), int('f3')], out=data)
ValueError: invalid literal for int() with base 10: 'f0'
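
To illustrate the problem: np.take selects whole records by integer position, so it cannot pick fields by name. The sketch below shows that, followed by one possible workaround that reuses the existing buffer. It rests on an assumption that happens to hold for my example, namely that every field has the same dtype ('<f8') and the array is contiguous. The names `rows`, `cols`, `perm` and `result` are mine. It is not strictly copy-free either, because the fancy-indexed right-hand side is a temporary copy.

```python
import numpy as np

data = np.array(
    [(1.0, 1.001, 1.002, 1.003), (2.0, 2.001, 2.002, 2.003),
     (3.0, 3.001, 3.002, 3.003), (4.0, 4.001, 4.002, 4.003)],
    dtype=[('f3', '<f8'), ('f0', '<f8'), ('f1', '<f8'), ('f2', '<f8')])

order = ['f0', 'f1', 'f2', 'f3']

# np.take works on integer positions: this picks records 2 and 0,
# it does not (and cannot) select or reorder fields.
rows = np.take(data, [2, 0])

# Assumption: all fields are '<f8', so the same memory can be viewed
# as a plain 2-D float array with one column per field.
n_fields = len(data.dtype.names)
cols = data.view('<f8').reshape(len(data), n_fields)

# Column permutation that brings the fields into the requested order.
perm = [data.dtype.names.index(name) for name in order]

# The right-hand side is a temporary copy; the assignment then
# overwrites the original buffer with the permuted columns.
cols[:] = cols[:, perm]

# Relabel the same buffer with the field names in the new order.
# Note: `data` still carries the old labels, so use `result` from here on.
result = data.view([(name, '<f8') for name in order])

print(result.dtype.names)
print(np.shares_memory(result, data))
```

For fields with mixed dtypes this trick does not apply, and I do not see how to avoid allocating a new array in that case.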

Sorting a structured numpy array by field name in place

0 answers: