Question

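I am trying to read binary files with mixed types (records with different data structures) into a NumPy array.

The data is organized as a *.dat file plus a *.dict file (plain text containing the data dictionary). An example of the data dictionary I have is: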

"Name","S","50"
"Item ID","I","4"
"Amount","F","8"

My idea is to create a class that I can instantiate and load with data just by calling:

f = data_bin()
f.load("profit.bin")

This code works fine whenever I have a mix of integers and floats, but as soon as I put a string field in the middle, it throws this error:
"TypeError: float argument required, not numpy.string_"

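The class I wrote is below. As a side note, I really need the data in NumPy (for performance and for compatibility with existing code), but I could live with going through something like a Python list on the way to NumPy.

I'd appreciate any help!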

import numpy as np

class data_bin:
  def __init__(self):
    self.datafile="No file loaded yet"
    self.dictfile="No file loaded yet"
    self.dictionary=None
    self.data=None

  def load(self, file):
    self.datafile = file
    self.dictfile=file[0:len(file)-3]+"dict"
    self.builds_dt()
    self.loads_data()

  def builds_dt(self):
    w=open(self.dictfile,'rb')
    w.readline()
    w.readline()
    q=w.readline()
    dt=[]
    while len(q)>0:
        a=q.rstrip().split(',')
        field_name=a[0]
        field_type=a[1]
        field_length=a[2]
        dt.append((field_name,field_type+field_length))
        q=w.readline()
    self.dictionary=dt

  def loads_data(self):
    f=open(self.datafile,'rb')
    self.data=np.fromfile(f, dtype=self.dictionary)

  def info(self):
    print "Binary Source: ", self.datafile
    print "   Data Types:", self.dictionary
    try:
      print "   Number of records: ", self.data.shape[0]
    except:
      print "   No valid data loaded"

Answer 1

If you really want to store all of the data row-wise in a single numpy array, you can. For example:

myArr = np.array([1, 2.5, 'hello', {'a':7}], dtype='O')
--> array([1, 2.5, 'hello', {'a': 7}], dtype=object)

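This creates a numpy object array. It works because everything in Python is an object. Unfortunately, you lose most of the reasons for having a numpy array in the first place. If you need to do calculations on the data, I'd suggest separating the values by data type and working from there (for example, parsing on the second column, or using np.where together with np.take and a re-sort of what np.loadtxt returns). Otherwise, I'd suggest sticking with Python lists or something similar.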
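One way to get that separation by data type is a NumPy structured dtype, where each field is a homogeneous column. This is a different technique from the object array above, and the following is only a minimal sketch under stated assumptions: it targets Python 3, the dictionary rows are comma-separated and quoted, the type codes are lowercase (NumPy type codes are case-sensitive), and the helper `build_dtype` and the sample records are hypothetical. It uses `np.frombuffer` on in-memory bytes as a stand-in for `np.fromfile` on a real data file.

```python
import csv
import io

import numpy as np

# Hypothetical dictionary contents, modeled on the example in the question.
DICT_TEXT = '"Name","S","50"\n"Item ID","i","4"\n"Amount","f","8"\n'


def build_dtype(dict_text):
    """Turn rows of (name, type code, byte length) into a structured dtype."""
    fields = []
    for name, code, length in csv.reader(io.StringIO(dict_text)):
        # csv.reader removes the surrounding double quotes from each value.
        fields.append((name, code + length))  # e.g. 'S50', 'i4', 'f8'
    return np.dtype(fields)


dt = build_dtype(DICT_TEXT)

# Two made-up records, serialized to bytes to stand in for the data file.
records = np.array([(b"widget", 1, 9.5), (b"gadget", 2, 12.25)], dtype=dt)
raw = records.tobytes()

# Decode the raw bytes back into records using the same dtype.
data = np.frombuffer(raw, dtype=dt)

# Each field is a column with its own dtype, so numeric math stays vectorized.
names = data["Name"]      # fixed-width byte strings
amounts = data["Amount"]  # float64
total = amounts.sum()
```

Accessing `data["Amount"]` gives a plain float array, so the string field no longer gets in the way of numeric operations on the other columns.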

That said, many functions still work:

e.g., 
myArr = np.append(myArr, ('what?', 5.2))
--> array([1, 2.5, 'hello', {'a': 7}, 'what?', '5.2'], dtype=object)

myArr.reshape(2,3)
--> array([[1, 2.5, 'hello'],
           [{'a': 7}, 'what?', '5.2']], dtype=object)
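Note that in the append output above the float 5.2 came back as the string '5.2'. A small sketch of why, and of one possible way to keep the original types, assuming Python 3 and a recent NumPy (the variable names here are illustrative only):

```python
import numpy as np

my_arr = np.array([1, 2.5, "hello", {"a": 7}], dtype=object)

# A plain tuple is first converted to a string array by NumPy,
# so the float 5.2 is turned into the text '5.2' before appending.
coerced = np.append(my_arr, ("what?", 5.2))

# Wrapping the new values in an object array keeps each Python object as-is.
extra = np.array(["what?", 5.2], dtype=object)
preserved = np.append(my_arr, extra)

# Arithmetic on an object array is applied element by element in Python,
# so it only works on slices whose elements all support the operation.
doubled = preserved[:2] * 2
```

Even with the types preserved, operations on an object array run through ordinary Python objects, which is the performance cost mentioned earlier.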

Let me know if I've missed what you're after.

Reading mixed-type binary data (chars, floats, and integers) with NumPy
