Question

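I want to use the os module exclusively to read and write binary files. I run into trouble when reading values whose data type is larger than 1 byte, such as int64, float32, etc. To illustrate the problem, here is an example I wrote. I generate random values of type np.float64, 8 bytes each: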

import os
import numpy as np

# Write
n = 10
dim = 2
fd = os.open('test.dat', os.O_CREAT | os.O_WRONLY)
data_w = np.random.uniform(low=0.5, high=13.3, size=(n,dim)).astype(np.float64)
print("Written Data are:\n%s\n" % data_w)
os.write(fd, data_w.tobytes())
os.close(fd)
print("------------------ \n")

# Read
start_read = 0  # 0 for now. Later I can read from any row!
total_num_to_read = n*dim
fd = os.open('test.dat', os.O_RDONLY)
os.lseek(fd, start_read, 0)  # start_read from the beginning 0
raw_data = os.read(fd, total_num_to_read)  # How many values to be read
data_r = np.fromiter(raw_data, dtype=np.float64).reshape(-1, dim)
print("Data Read are:\n%s\n" % data_r)
os.close(fd)

The data is not read back correctly. Here is what it returns:

Written Data are:
[[ 2.75763292  9.87883101]
 [ 1.73752327  9.9633879 ]
 [ 1.01616811  1.81174597]
 [ 9.93904659 10.6757686 ]
 [ 7.02452029  2.68652109]
 [ 5.29766028 11.15384409]
 [ 4.12499766 10.37214532]
 [11.75811252  3.30378401]
 [ 1.72738203  2.11228277]
 [ 7.7321937  11.64298051]]

------------------ 

Data Read are:
[[250.  87.]
 [227. 216.]
 [161.  15.]
 [  6.  64.]
 [162. 178.]
 [ 59.  35.]
 [246. 193.]
 [ 35.  64.]
 [218.  97.]
 [ 81.  50.]]

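I cannot retrieve the values correctly! I thought np.fromiter(raw_data, dtype=np.float64).reshape(-1, dim) would take care of it, but I don't know where the problem is. Given that I know the data has a specific type (here np.float64), how can I read the binary data?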

Answer 1

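You should use np.fromstring(raw_data, dtype=np.float64) instead of fromiter(). Check the documentation to see what each function is for. Also, when reading from the file, read the correct number of bytes: 8 * total_num_to_read, since each float64 occupies 8 bytes. Note that NumPy has deprecated the binary mode of np.fromstring; in current versions np.frombuffer(raw_data, dtype=np.float64) is the recommended equivalent.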
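To see why fromiter() produced values between 0 and 255 in the question, the following minimal sketch may help. It is not part of the original answer; the array `values` and the variable names are made up for illustration. Iterating over a bytes object yields one integer per byte, so fromiter() converts each byte separately, whereas frombuffer() reinterprets each group of 8 bytes as one float64.

```python
import numpy as np

# Two float64 values serialized to 16 raw bytes (illustrative data).
values = np.array([1.5, -2.25], dtype=np.float64)
raw = values.tobytes()

# Iterating over bytes yields one int (0-255) per byte, so fromiter
# turns every single byte into its own float.
per_byte = np.fromiter(raw, dtype=np.float64)

# frombuffer reinterprets each group of 8 bytes as one float64.
decoded = np.frombuffer(raw, dtype=np.float64)

print(per_byte.size)  # 16 entries, one per byte
print(decoded)        # the two original values
```

The array returned by frombuffer() is a read-only view of the bytes object; calling .copy() on it gives a writable array if one is needed.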

In [103]: # Write
     ...: n = 10
     ...: dim = 2
     ...: fd = os.open('test.dat', os.O_CREAT | os.O_WRONLY)
     ...: data_w = np.random.uniform(low=0.5, high=13.3, size=(n,dim)).astype(np.float64)
     ...: print("Written Data are:\n%s\n" % data_w)
     ...: os.write(fd, data_w.tobytes())
     ...: os.close(fd)
     ...: print("------------------ \n")
     ...: 
     ...: # Read
     ...: start_read = 0  # 0 for now. Later I can read from any row!
     ...: total_num_to_read = n*dim
     ...: fd = os.open('test.dat', os.O_RDONLY)
     ...: os.lseek(fd, start_read, 0)  # start_read from the beginning 0
     ...: raw_data = os.read(fd, 8*total_num_to_read)  # Number of bytes to read: 8 per float64
     ...: data_r = np.fromstring(raw_data, dtype=np.float64).reshape(-1, dim)
     ...: print("Data Read are:\n%s\n" % data_r)
     ...: os.close(fd)
     ...: 
     ...: 
Written Data are:
[[ 11.2465988    5.45304778]
 [ 12.06466331   9.95717255]
 [  7.35402895   1.68972606]
 [  0.7259652    1.01265826]
 [  3.11340311   2.44725153]
 [  2.82109715   5.02768335]
 [ 12.69054614   9.26028537]
 [  5.13785639   2.0780649 ]
 [  4.6796513    4.24710598]
 [  2.34859141   8.87224674]]

------------------ 

Data Read are:
[[ 11.2465988    5.45304778]
 [ 12.06466331   9.95717255]
 [  7.35402895   1.68972606]
 [  0.7259652    1.01265826]
 [  3.11340311   2.44725153]
 [  2.82109715   5.02768335]
 [ 12.69054614   9.26028537]
 [  5.13785639   2.0780649 ]
 [  4.6796513    4.24710598]
 [  2.34859141   8.87224674]]
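The question also mentions wanting to start reading from an arbitrary row later on. A possible sketch, under the assumption that the file holds a plain row-major float64 array with `dim` columns and no header: both the seek offset and the read length have to be expressed in bytes. The helper name `read_rows`, its parameters, and the temporary file are hypothetical and not from the original post.

```python
import os
import tempfile
import numpy as np

# os.O_BINARY only exists on Windows; elsewhere fall back to 0.
BINARY = getattr(os, "O_BINARY", 0)

def read_rows(path, first_row, num_rows, dim, dtype=np.float64):
    """Hypothetical helper: read num_rows rows starting at first_row."""
    itemsize = np.dtype(dtype).itemsize       # 8 for float64
    offset = first_row * dim * itemsize       # byte position of the first row
    remaining = num_rows * dim * itemsize     # total bytes to read
    chunks = []
    fd = os.open(path, os.O_RDONLY | BINARY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        # os.read may return fewer bytes than requested, so loop.
        while remaining > 0:
            chunk = os.read(fd, remaining)
            if not chunk:                     # end of file reached
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    finally:
        os.close(fd)
    return np.frombuffer(b"".join(chunks), dtype=dtype).reshape(-1, dim)

# Usage: write a 10x2 array, then read rows 3 and 4 back.
data = np.arange(20, dtype=np.float64).reshape(10, 2)
path = os.path.join(tempfile.mkdtemp(), "test.dat")
fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC | BINARY)
os.write(fd, data.tobytes())
os.close(fd)

rows = read_rows(path, first_row=3, num_rows=2, dim=2)
print(rows)
# [[6. 7.]
#  [8. 9.]]
```

If the file ends partway through a value, the collected bytes are no longer a multiple of the item size and frombuffer() raises a ValueError, so a truncated file is reported rather than silently misread.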
