I have a binary file that I can open in MATLAB but not in Python. The file is encoded as double-precision floats, so MATLAB reads it with the following line:
fread(fopen(fileName), 'float64');
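I'm not sure how to replicate this line in Python. I thought NumPy would be a good starting point, so I tried the following lines, but did not get the output I expected. Each row has 6 numbers, and I only get the first one followed by a 'NaN'.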
from numpy import *
f = open('filename', 'rb')
a = fromfile(f, float64, 10)
print(a)
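Any help with this would be greatly appreciated; I have posted the binary file and the MATLAB-parsed file in the comments below. I don't specifically need to use NumPy either, and I'm open to any Python-based solution. Thanks.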
Answer 0 (score: 6)
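Every other value is NaN, so this is probably some kind of delimiter. Also, the values in the file are stored in column-major order. The following script reads in the data, throws away the NaN entries, reshapes the array into the correct shape, and writes a CSV file identical to the one you posted: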
import csv
import numpy as np
# Pull in all the raw data.
with open('TEMPO3.2F-0215_s00116.dat', 'rb') as f:
    raw = np.fromfile(f, np.float64)
# Throw away the nan entries (every even index).
raw = raw[1::2]
# Check it's a multiple of six so we can reshape it.
if raw.size % 6:
    raise ValueError("Data size not multiple of six.")
# Reshape and take the transpose to manipulate it into the
# same shape as your CSV. The conversion to integer is also
# so the CSV file is the same. Floor division keeps the
# dimension an integer.
data = raw.reshape((6, raw.size // 6)).T.astype('int')
# Dump it out to a CSV.
with open('test.csv', 'w') as f:
    w = csv.writer(f)
    w.writerows(data)
Edit: an updated version incorporating the changes suggested by jorgeca:
import csv
import numpy as np
# Pull in all the raw data.
raw = np.fromfile('TEMPO3.2F-0215_s00116.dat', np.float64)
# Throw away the nan entries.
raw = raw[1::2]
# Reshape and take the transpose to manipulate it into the
# same shape as your CSV. The conversion to integer is also
# so the CSV file is the same.
data = raw.reshape((6, -1)).T.astype('int')
# Dump it out to a CSV.
with open('test.csv', 'w') as f:
    w = csv.writer(f)
    w.writerows(data)
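The layout described above (a NaN before every value, values stored column by column) can be checked without the original file. The following is a minimal sketch on made-up data: the helper name `read_interleaved`, the 4 x 6 test table, and the temporary file are all assumptions for illustration, not part of the posted data.

```python
import os
import tempfile

import numpy as np


def read_interleaved(path, ncols=6):
    """Read float64 values, drop the entry at every even index,
    and return an array of shape (nrows, ncols)."""
    raw = np.fromfile(path, dtype=np.float64)
    values = raw[1::2]  # keep odd indices, as in the script above
    if values.size % ncols:
        raise ValueError("Data size is not a multiple of %d." % ncols)
    # Column-major on disk: each run of nrows values is one column.
    return values.reshape((ncols, -1)).T


# Build a hypothetical 4 x 6 table and store it in the assumed layout.
table = np.arange(24, dtype=np.float64).reshape(4, 6)
flat = table.T.ravel()              # column-major order
interleaved = np.empty(flat.size * 2)
interleaved[0::2] = np.nan          # a NaN before every value
interleaved[1::2] = flat

fd, path = tempfile.mkstemp(suffix=".dat")
os.close(fd)
try:
    interleaved.tofile(path)
    recovered = read_interleaved(path)
finally:
    os.remove(path)
```

If the assumed layout matches the real file, `recovered` should equal `table`; a mismatch on real data would suggest the delimiter pattern or the column count is different.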
Answer 1 (score: 4)
There is a delimiter between the data values, which shows up as alternating NaN and data when the file is read, e.g. in MATLAB:
NaN
2134
NaN
2129
NaN
2128
....
1678
and in NumPy:
[ nan 2134. nan ..., 1681. nan 1678.]
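I get the same values with the code you posted, whether I use MATLAB or NumPy (1.7). Note that, judging by the pattern in the CSV file, the data is read column by column, not row by row.
To read all of the data in NumPy (count=-1 means "read to the end of the file"), try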
a = fromfile(file=f, dtype=float64, count=-1)
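Once the whole array is read, the delimiters still have to be dropped and the column-by-column order undone. One possible way to do this, sketched below, is to filter by value with `np.isnan` and reshape with `order='F'`. This assumes the real data contains no genuine NaN values; the helper name `drop_nan_and_shape` and the sample array are made up for illustration.

```python
import numpy as np


def drop_nan_and_shape(raw, ncols=6):
    """Remove NaN delimiters and arrange column-major values as rows x ncols."""
    values = raw[~np.isnan(raw)]    # assumes NaN never occurs as real data
    if values.size % ncols:
        raise ValueError("Data size is not a multiple of %d." % ncols)
    # order='F' fills the result column by column.
    return values.reshape((-1, ncols), order='F')


# Hypothetical stand-in for fromfile(file=f, dtype=float64, count=-1):
# twelve values, each preceded by a NaN delimiter.
raw = np.empty(24)
raw[0::2] = np.nan
raw[1::2] = np.arange(1, 13)

result = drop_nan_and_shape(raw)
print(result)
```

Filtering by value rather than by position does not depend on whether the delimiter comes before or after each number, but it would silently drop real NaN readings if the data had any.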