Question

I am having a problem with np.vectorize() in the following code:

import bitstring as bs
import numpy as np

def get_bitstring(number, mode=None):
    """Get BitString based on internal representation of number."""
    if mode:
        return bs.pack(mode, number)
    double = (float, np.float64)
    single = (np.float32)
    if isinstance(number, double):
        mode = '>d'
    elif isinstance(number, single):
        mode = '>f'
    else:
        raise Exception("Unknown type")
    return bs.pack(mode, number)

def vec_get_bitstring(arr):
    vec = np.vectorize(get_bitstring, otypes=[bs.BitStream])
    return vec(arr)

# Testarray
arr = np.array([np.float32(1),np.float32(2)], dtype=np.float32)

These are the results I get:

[get_bitstring(x,) for x in arr]
# >> [BitStream('0x3f800000'), BitStream('0x40000000')]

vec_get_bitstring(arr)
# >> array([BitStream('0x3ff0000000000000'),   BitStream('0x4000000000000000')], dtype=object)

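np.vectorize() converts the input from np.float32(x) to float(x) before handing the data to get_bitstring(x). The function then sees a float and returns the 64-bit representation. Why is that? Why does np.vectorize change my input dtype from np.float32(x) to float(x)?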

Answer 1

vectorize is the wrong tool here.

A plain list comprehension (or iteration) gives the desired element type:

In [66]: arr = np.array([1,2], dtype=np.float32)
In [67]: arr
Out[67]: array([ 1.,  2.], dtype=float32)
In [68]: [type(i) for i in arr]
Out[68]: [numpy.float32, numpy.float32]
In [70]: [type(i.item()) for i in arr]
Out[70]: [float, float]

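vectorize is designed to make it easy to broadcast arrays to a function that only accepts scalar values. It is most useful when the function takes several arguments. We would have to dig into its code to see exactly where and how it converts elements to scalars. It is not simple code; you lose a lot of control by using it.

Note that vectorize makes no claims about speed. According to the NumPy documentation it is provided for convenience, and its implementation is essentially a for loop.

Maybe you need to explain why you are using vectorize.

If I define: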
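To illustrate the broadcasting use case described above, here is a minimal sketch. The function `clip_diff` and the sample arrays are made up for this example; they are not from the question.

```python
import numpy as np

def clip_diff(a, b):
    """Scalar-only function: the Python `if` prevents passing arrays directly."""
    return a - b if a > b else 0

# otypes is given explicitly so vectorize does not have to infer the output type
vclip = np.vectorize(clip_diff, otypes=[float])

col = np.array([[1.0], [5.0], [9.0]])   # shape (3, 1)
row = np.array([2.0, 4.0])              # shape (2,)
out = vclip(col, row)                   # the inputs are broadcast to shape (3, 2)
print(out)
```

Each pair of broadcast elements is passed to `clip_diff` as scalars, which is the convenience vectorize offers when a function takes several arguments.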

def foo(x):
    print(type(x))
    return x
vfoo = np.vectorize(foo)

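It looks like the first call, the one used to determine the output type (if otypes is not given), receives the numpy type. All subsequent calls receive the underlying Python type: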

In [12]: vfoo(np.array([1,2,3.],np.int8))
<type 'numpy.int8'>
<type 'int'>
<type 'int'>
<type 'int'>
Out[12]: array([1, 2, 3], dtype=int8)

In [14]: vfoo(np.array(['string',2,3.],object))
<type 'str'>
<type 'str'>
<type 'int'>
<type 'float'>
Out[14]: 
array(['string', '2', '3.0'],  dtype='|S6')

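The vectorize code contains a comment about converting the args to object-dtype arrays. I have not studied the context, but we may be seeing this kind of effect: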

In [20]: type(np.array([1,2,3],np.float32)[0])
Out[20]: numpy.float32

In [21]: type(np.array([1,2,3],np.float32).astype(object)[0])
Out[21]: float
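If vectorize is still wanted, one way around the conversion is to choose the pack format from the array's dtype instead of from each element's type, and pass it in explicitly (the question's function already accepts a `mode` argument). The sketch below makes some assumptions: `pack_hex` is a hypothetical stand-in for `get_bitstring` that uses the standard `struct` module rather than bitstring, and `FORMATS` is an illustrative lookup table.

```python
import struct
import numpy as np

# Illustrative mapping from array dtype to a big-endian struct format
FORMATS = {np.dtype(np.float32): '>f', np.dtype(np.float64): '>d'}

def pack_hex(number, mode):
    """Hypothetical stand-in for get_bitstring: hex string of the packed number."""
    return '0x' + struct.pack(mode, number).hex()

def vec_pack_hex(arr):
    # Decide the format once, from the array dtype, before vectorize
    # has a chance to turn the elements into Python floats.
    mode = FORMATS[arr.dtype]
    vec = np.vectorize(pack_hex, otypes=[object], excluded=['mode'])
    return vec(arr, mode=mode)

arr = np.array([1, 2], dtype=np.float32)
out = vec_pack_hex(arr)
print(out)
```

Because `mode` is listed in `excluded`, it is passed through unchanged rather than being broadcast, so the element type seen inside the function no longer matters.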

Here is a way to do an object-dtype operation on a 2d array:

In [90]: x=np.arange(6).reshape(3,2)

In [91]: res=np.empty(x.shape,dtype=object)

In [92]: res.flat[:]=[type(i) for i in x.flat]

In [93]: res
Out[93]: 
array([[<type 'numpy.int32'>, <type 'numpy.int32'>],
       [<type 'numpy.int32'>, <type 'numpy.int32'>],
       [<type 'numpy.int32'>, <type 'numpy.int32'>]], dtype=object)
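The same pattern can be wrapped in a small helper and applied to the question's problem. This is only a sketch: `apply_elementwise` is a hypothetical helper name, and `pack_hex` is a stand-in for `get_bitstring` that uses the standard `struct` module so the example does not depend on bitstring.

```python
import struct
import numpy as np

def pack_hex(number):
    """Hypothetical stand-in for get_bitstring: picks the format from the element type."""
    if isinstance(number, np.float32):
        fmt = '>f'
    elif isinstance(number, (float, np.float64)):
        fmt = '>d'
    else:
        raise TypeError("Unknown type: %r" % type(number))
    return '0x' + struct.pack(fmt, number).hex()

def apply_elementwise(func, arr):
    """Call func on every element and return an object array of the same shape.

    Iterating over arr.flat yields numpy scalars, so the element type is kept.
    """
    res = np.empty(arr.shape, dtype=object)
    res.flat[:] = [func(x) for x in arr.flat]
    return res

arr = np.array([[1, 2], [3, 4]], dtype=np.float32)
out = apply_elementwise(pack_hex, arr)
print(out)
```

Here each element reaches `pack_hex` as a `numpy.float32`, so the 32-bit branch is taken.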

Answer 2

Try adding the desired data type (float32) to the output types (otypes). It seems you have to specify exactly which output type you want.

np.vectorize() changes input dtype np.float32 to float
