Question

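I have a Python list of lists that I convert to a numpy array, and I have defined a dtype for that array. Some of the values may be None or ''. When the corresponding dtype field is float or int, numpy raises an error. Is there a way to tell numpy to assign 1 (or a field-specific default) for a given dtype when the value is None or ''?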

For example, the following code raises this error:

ValueError: could not convert string to float.

    import numpy as np
    dt = np.dtype([ ('a', np.int32,),
          ('b', np.float32),
          ('c', np.int32,),
          ('d', np.float32),
          ('e', np.int32,),
      ])
    # The empty strings for 'd' and 'e' trigger the ValueError.
    npar = np.array(('667000', '0', '0', '', ''), dt)

The expected output for npar is (d becomes 0.0 and e takes the default value 1):

    (667000, 0.0, 0, 0.0, 1)

I have large multidimensional arrays to convert, so performance is a consideration.

Answer 1

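The loadtxt function (numpy.lib.npyio.loadtxt, usually called as np.loadtxt) has a converters option, which maps a column index to a function applied to that column's raw text.

Let data2.txt be:

    667000;0;0;;;
    668000;0;0;3;6;

After running

    u = np.loadtxt('data2.txt', dtype=dt, delimiter=';',
                   usecols=range(5),  # each line ends with ';', so read only the five dtype columns
                   converters={3: lambda s: np.float32(s or 0),
                               4: lambda s: np.int32(s or 1)})

u is:

    array([(667000, 0.0, 0, 0.0, 1), (668000, 0.0, 0, 3.0, 6)], dtype=...)

The missing values have been replaced by the defaults.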
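To try the converters idea without creating a file, here is a minimal sketch that feeds the same two lines through io.StringIO and builds one converter per field, so any column may be empty. The DEFAULTS mapping and the make_converter helper are hypothetical names introduced for illustration, not part of NumPy. The usecols argument is included on the assumption that the trailing ';' could otherwise be read as a sixth column.

```python
import io

import numpy as np

dt = np.dtype([('a', np.int32), ('b', np.float32), ('c', np.int32),
               ('d', np.float32), ('e', np.int32)])

# Hypothetical per-field defaults, used when a field is empty.
DEFAULTS = {'a': 0, 'b': 0.0, 'c': 0, 'd': 0.0, 'e': 1}


def make_converter(default):
    """Return a converter that falls back to `default` for empty fields."""
    cast = type(default)  # int or float, taken from the default itself

    def convert(field):
        # Depending on the NumPy version the field arrives as str or bytes;
        # strip() and the int/float constructors accept both.
        return cast(field) if field.strip() else default

    return convert


# One converter per column index, in the order of the dtype's fields.
converters = {i: make_converter(DEFAULTS[name])
              for i, name in enumerate(dt.names)}

text = io.StringIO("667000;0;0;;;\n668000;0;0;3;6;\n")
u = np.loadtxt(text, dtype=dt, delimiter=';',
               usecols=range(len(dt.names)), converters=converters)
print(u)
```

A factory function is used instead of lambdas inside the dict comprehension so that each converter captures its own default.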

Answer 2

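This might work:

One-liner:

    s = ('667000', '0', '0', '', '')
    # For a missing value use 1 in field 'e' and 0 in any other field; otherwise keep the value.
    npar = np.array(tuple((1 if name == 'e' else 0) if v in ('', None) else v for name, v in zip(dt.names, s)), dt)

Or:

    import numpy as np
    dt = np.dtype([ ('a', np.int32,),
              ('b', np.float32),
              ('c', np.int32,),
              ('d', np.float32),
              ('e', np.int32,),
    ])
    s = ('667000', '0', '0', '', '')
    t = np.array(s)      # array of strings
    # Field 'e' (index 4) defaults to 1 when empty.
    if not t[4]:
        t[4] = 1
    # Every remaining empty string becomes 0.
    t[t==''] = 0
    npar = np.array(tuple(t),dt)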
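Since the question is about many rows rather than a single tuple, the same substitution can be sketched for a whole list of lists. This is only an illustration: to_structured and DEFAULTS are hypothetical names, the defaults are assumed to be 0 everywhere except 1 for 'e', and the cleaning is done in plain Python, so it is not tuned for very large inputs.

```python
import numpy as np

dt = np.dtype([('a', np.int32), ('b', np.float32), ('c', np.int32),
               ('d', np.float32), ('e', np.int32)])

# Hypothetical per-field defaults, used when a value is '' or None.
DEFAULTS = {'a': 0, 'b': 0.0, 'c': 0, 'd': 0.0, 'e': 1}


def to_structured(rows, dtype, defaults):
    """Build a structured array from rows, replacing '' and None per field."""
    names = dtype.names
    # Plain Python caster per field: int for integer kinds, float otherwise.
    casters = [int if dtype[n].kind in 'iu' else float for n in names]
    cleaned = [
        tuple(
            defaults[n] if v is None or v == '' else cast(v)
            for n, cast, v in zip(names, casters, row)
        )
        for row in rows
    ]
    # A single np.array call over a list of tuples fills all records.
    return np.array(cleaned, dtype=dtype)


rows = [
    ['667000', '0', '0', '', ''],
    ['668000', '0', None, '3', '6'],
]
npar = to_structured(rows, dt, DEFAULTS)
print(npar)
```

Each row must be turned into a tuple, because numpy treats a tuple as one record of the structured dtype and a list as a sequence of records.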
